Model as a Service for Red Hat Demo Platform
Full installation of LiteMaaS on OpenShift using the rhpds.litemaas Ansible collection.
Before running the deployment playbook, verify these requirements are met.
- Image pull access to quay.io and registry.redhat.io
- kubernetes.core collection
- oc CLI authenticated to the target cluster
- python3 with kubernetes and openshift pip packages

```bash
# Install Python dependencies
pip install kubernetes openshift

# Install required Ansible collections
ansible-galaxy collection install kubernetes.core community.general
```
The deployment pulls images from two registries. Verify cluster access before starting:
| Registry | Images | Auth Required |
|---|---|---|
| quay.io/rh-aiservices-bu | litellm-non-root, litemaas-backend, litemaas-frontend | Public pull |
| registry.redhat.io/rhel9 | redis-7 (primary; falls back to quay.io/sclorg) | Red Hat credentials |
The HA deployment task automatically tests image accessibility and falls back to an alternative image if the primary cannot be pulled. The fallback for Redis is quay.io/sclorg/redis-7-c9s:latest.
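The fallback behavior can be sketched as a small shell helper. This is an illustrative sketch, not the role's actual task code: the pull check is injected as a function so the logic stays testable (in practice it would be something like `skopeo inspect docker://<image>`).

```shell
#!/bin/sh
# Return the first pullable image: try the primary, fall back if the
# injected pull check fails.
pick_image() {
    check="$1"; primary="$2"; fallback="$3"
    if "$check" "$primary"; then
        echo "$primary"
    else
        echo "$fallback"
    fi
}

# Stub check simulating a cluster without registry.redhat.io credentials.
no_redhat_creds() {
    case "$1" in
        registry.redhat.io/*) return 1 ;;
        *) return 0 ;;
    esac
}

pick_image no_redhat_creds \
    registry.redhat.io/rhel9/redis-7:latest \
    quay.io/sclorg/redis-7-c9s:latest
# -> quay.io/sclorg/redis-7-c9s:latest
```

The same helper returns the primary image unchanged when the check succeeds, which matches the documented behavior: the fallback is only used when the primary cannot be pulled.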
LiteMaaS is distributed as an Ansible collection. Install it from the repository or via Ansible Galaxy.
```bash
# Clone the collection repository
git clone https://github.com/rhpds/rhpds.litemaas.git ~/work/code/rhpds.litemaas
cd ~/work/code/rhpds.litemaas

# Install the collection locally
ansible-galaxy collection install . --force
```
For catalog-deployed workloads, add the collection to your AgnosticV common.yaml:
```yaml
# In agnosticv/config/catalog-item/common.yaml
requirements_collections:
  - name: rhpds.litemaas
    source: https://github.com/rhpds/rhpds.litemaas.git
    type: git
    version: main

workloads:
  - rhpds.litemaas.ocp4_workload_litemaas
```
Alternatively, deploy with the Helm chart:

```bash
# Add the chart repository (if published) or install from local path
helm install litemaas ~/work/code/rh-litemaas/deployment/helm/litemaas/ \
  --namespace litemaas \
  --create-namespace \
  -f my-values.yaml
```
Create an extra vars file. At minimum you need to set the namespace and decide whether to enable OAuth.
```yaml
# litemaas-vars.yml
---
ocp4_workload_litemaas_namespace: "litemaas"

# OAuth login (recommended for production)
ocp4_workload_litemaas_oauth_enabled: true

# Scale LiteLLM replicas (default 3)
ocp4_workload_litemaas_ha_litellm_replicas: 3

# Enable RHDP branding
ocp4_workload_litemaas_branding_enabled: true

# Route prefix customization (optional)
ocp4_workload_litemaas_api_route_prefix: "litellm-prod"
ocp4_workload_litemaas_frontend_route_prefix: "litellm-prod-frontend"
ocp4_workload_litemaas_admin_route_prefix: "litellm-prod-admin"

# Pre-configure models (optional — can also be done via UI after deploy)
ocp4_workload_litemaas_litellm_models:
  - model_name: "granite-3-2-8b-instruct"
    litellm_model: "openai/granite-3-2-8b-instruct"
    api_base: "http://granite-3-2-8b-instruct-predictor.llm-hosting.svc.cluster.local/v1"
    api_key: "sk-placeholder"
    rpm: 120
    tpm: 200000
```
```bash
# Deploy LiteMaaS in HA mode
ansible-playbook playbooks/deploy_litemaas_ha.yml \
  -e @litemaas-vars.yml

# Or pass variables inline
ansible-playbook playbooks/deploy_litemaas_ha.yml \
  -e ocp4_workload_litemaas_namespace=litemaas \
  -e ocp4_workload_litemaas_oauth_enabled=true \
  -e ocp4_workload_litemaas_branding_enabled=true
```
The HA task deploys components in sequence: each component waits for its predecessor to be ready before the next one starts.
After successful deployment, the playbook prints connection details via agnosticd_user_info:
```text
# Example output
LiteLLM Admin Portal:    https://litellm-prod.apps.cluster.example.com
LiteLLM Admin Login:     admin / <auto-generated-password>
LiteLLM Master API Key:  sk-<auto-generated-key>
LiteMaaS User Interface: https://litellm-prod-frontend.apps.cluster.example.com
```
The tables below list all variables with their defaults from roles/ocp4_workload_litemaas/defaults/main.yml.
| Variable | Default | Required | Description |
|---|---|---|---|
| ocp4_workload_litemaas_namespace | litemaas | optional | OpenShift namespace to deploy into |
| ocp4_workload_litemaas_version | 0.2.2 | optional | Collection version tag (metadata only) |
| ocp4_workload_litemaas_remove | false | optional | Set to true to uninstall (deletes namespace) |
| ocp4_workload_litemaas_cluster_domain | "" | optional | Cluster apps domain (auto-detected from cluster if empty) |
| Variable | Default | Description |
|---|---|---|
| ocp4_workload_litemaas_litellm_image | quay.io/rh-aiservices-bu/litellm-non-root | LiteLLM proxy image repository |
| ocp4_workload_litemaas_litellm_tag | main-v1.81.0-stable | LiteLLM image tag |
| ocp4_workload_litemaas_backend_image | quay.io/rhpds/litemaas | LiteMaaS backend image repository |
| ocp4_workload_litemaas_backend_tag | backend-0.2.2 | Backend image tag |
| ocp4_workload_litemaas_frontend_image | quay.io/rhpds/litemaas | LiteMaaS frontend image repository |
| ocp4_workload_litemaas_frontend_tag | frontend-0.2.2 | Frontend image tag |
Production images: The running production deployment uses
quay.io/rh-aiservices-bu/litellm-non-root:main-v1.81.0-stable-custom for LiteLLM and
quay.io/rh-aiservices-bu/litemaas-backend:0.4.0 /
quay.io/rh-aiservices-bu/litemaas-frontend:0.4.0 for LiteMaaS.
The production image registry is quay.io/rh-aiservices-bu, not the default quay.io/rhpds.
| Variable | Default | Description |
|---|---|---|
| ocp4_workload_litemaas_api_route_prefix | litellm | Subdomain prefix for the LiteLLM API route |
| ocp4_workload_litemaas_admin_route_prefix | litellm-admin | Subdomain prefix for the admin/backend route |
| ocp4_workload_litemaas_frontend_route_prefix | litellm-frontend | Subdomain prefix for the frontend UI route |
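Each route hostname is the prefix joined to the cluster apps domain. A minimal sketch of that composition, assuming the default prefixes above and an example domain:

```shell
#!/bin/sh
# Compose a route URL from a prefix and the cluster apps domain
# (ocp4_workload_litemaas_cluster_domain).
route_url() {
    printf 'https://%s.%s\n' "$1" "$2"
}

DOMAIN="apps.cluster.example.com"   # illustrative value

route_url litellm          "$DOMAIN"   # LiteLLM API route
route_url litellm-admin    "$DOMAIN"   # admin/backend route
route_url litellm-frontend "$DOMAIN"   # frontend UI route
# -> https://litellm.apps.cluster.example.com
#    https://litellm-admin.apps.cluster.example.com
#    https://litellm-frontend.apps.cluster.example.com
```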
| Variable | Default | Description |
|---|---|---|
| ocp4_workload_litemaas_postgres_password | random 32-char | PostgreSQL database password |
| ocp4_workload_litemaas_jwt_secret | random 32-char | JWT signing secret for LiteMaaS backend sessions |
| ocp4_workload_litemaas_admin_api_key | random 32-char | LiteMaaS admin API key (stored in backend-secret) |
| ocp4_workload_litemaas_litellm_api_key | sk- + random 32-char | LiteLLM master key (stored in litellm-secret) |
| ocp4_workload_litemaas_litellm_ui_username | admin | LiteLLM admin UI username |
| ocp4_workload_litemaas_litellm_ui_password | random 16-char | LiteLLM admin UI password |
Security: Never commit actual credentials to version control. The Ansible role generates secure random values at deploy time. Retrieve them after deployment from the OpenShift secrets, not from code.
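A sketch of retrieving a generated credential after deployment. The secret names (litellm-secret, backend-secret) come from the table above; the data keys shown in the comments are illustrative — inspect the secret with `-o yaml` to see the real keys:

```shell
#!/bin/sh
# Decode a base64-encoded secret value read from stdin.
decode_key() {
    base64 -d
}

# Usage against a live cluster (not run here; data keys are assumed):
#   oc get secret litellm-secret -n litemaas \
#       -o jsonpath='{.data.master-key}' | decode_key
#   oc get secret backend-secret -n litemaas \
#       -o jsonpath='{.data.admin-api-key}' | decode_key

# Local demonstration of the decode step:
printf 'c2stZXhhbXBsZQ==' | decode_key   # -> sk-example
```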
| Variable | Default | Description |
|---|---|---|
| ocp4_workload_litemaas_postgres_version | 16 | PostgreSQL major version |
| ocp4_workload_litemaas_postgres_storage_size | 10Gi | PVC size for PostgreSQL data |
| ocp4_workload_litemaas_postgres_storage_class | "" | StorageClass (empty = auto-detect per cloud) |
| ocp4_workload_litemaas_postgres_storage_access_mode | ReadWriteOnce | PVC access mode |
| Variable | Default | Description |
|---|---|---|
| ocp4_workload_litemaas_ha_litellm_replicas | 3 | Number of LiteLLM proxy replicas |
| ocp4_workload_litemaas_ha_enable_redis | true | Deploy Redis for session/key caching |
| ocp4_workload_litemaas_ha_enable_postgres | true | Deploy PostgreSQL (set false to use external DB) |
| ocp4_workload_litemaas_ha_redis_image | registry.redhat.io/rhel9/redis-7:latest | Primary Redis image |
| ocp4_workload_litemaas_ha_redis_image_fallback | quay.io/sclorg/redis-7-c9s:latest | Fallback Redis image |
| ocp4_workload_litemaas_ha_postgres_pvc_size | 10Gi | PVC size for HA PostgreSQL |
| ocp4_workload_litemaas_ha_litellm_memory_limit | 2Gi | Per-pod memory limit for LiteLLM |
| ocp4_workload_litemaas_ha_litellm_cpu_limit | 2000m | Per-pod CPU limit for LiteLLM |
| Variable | Default | Description |
|---|---|---|
| ocp4_workload_litemaas_oauth_enabled | false | Enable OpenShift OAuth login |
| ocp4_workload_litemaas_oauth_provider | openshift | OAuth provider type (openshift or oidc) |
| ocp4_workload_litemaas_oauth_client_id | namespace name | OAuthClient resource name |
| ocp4_workload_litemaas_oauth_client_secret | random 32-char | OAuthClient secret |
| ocp4_workload_litemaas_admin_emails | [admin] | Initial admin user list |
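When OAuth is enabled, the role registers a cluster-scoped OAuthClient resource. A sketch of what that resource looks like — the redirect URI path and the placeholder secret are illustrative assumptions; the actual resource is generated by the role:

```yaml
apiVersion: oauth.openshift.io/v1
kind: OAuthClient
metadata:
  name: litemaas                      # defaults to the namespace name
secret: <generated-32-char-secret>    # ocp4_workload_litemaas_oauth_client_secret
redirectURIs:
  # Illustrative — the real callback path depends on the backend's OAuth handler
  - https://litellm-admin.apps.cluster.example.com/api/auth/callback
grantMethod: auto
```

Because OAuthClient is cluster-scoped, it is not removed when the namespace is deleted; see the removal section at the end of this page.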
| Variable | Default | Description |
|---|---|---|
| ocp4_workload_litemaas_litellm_memory_limit | 1Gi | LiteLLM memory limit (single-replica mode) |
| ocp4_workload_litemaas_litellm_cpu_limit | 1000m | LiteLLM CPU limit |
| ocp4_workload_litemaas_backend_memory_limit | 512Mi | Backend memory limit |
| ocp4_workload_litemaas_backend_cpu_limit | 500m | Backend CPU limit |
| ocp4_workload_litemaas_frontend_memory_limit | 256Mi | Frontend memory limit |
| ocp4_workload_litemaas_frontend_cpu_limit | 250m | Frontend CPU limit |
When ocp4_workload_litemaas_postgres_storage_class is empty, the role auto-detects the cloud provider and selects the appropriate StorageClass:
| Cloud Provider | Storage Class |
|---|---|
| AWS | gp3-csi |
| Azure | managed-premium |
| GCP | standard-rwo |
| vSphere | thin-csi |
| OpenStack | standard |
| Bare Metal | ocs-external-storagecluster-ceph-rbd |
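The mapping above can be sketched as a shell case statement. This mirrors the table, not the role's actual task code; on a live cluster the platform name would come from something like `oc get infrastructure cluster -o jsonpath='{.status.platformStatus.type}'` (an assumption about the detection source):

```shell
#!/bin/sh
# Map a detected platform to the StorageClass from the table above.
storage_class_for() {
    case "$1" in
        AWS)       echo gp3-csi ;;
        Azure)     echo managed-premium ;;
        GCP)       echo standard-rwo ;;
        VSphere)   echo thin-csi ;;
        OpenStack) echo standard ;;
        # Bare metal / unrecognized platforms fall through to ODF
        *)         echo ocs-external-storagecluster-ceph-rbd ;;
    esac
}

storage_class_for AWS    # -> gp3-csi
storage_class_for None   # -> ocs-external-storagecluster-ceph-rbd
```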
LiteLLM uses Prisma ORM and runs database migrations automatically on startup. However, when upgrading between LiteLLM versions, you may need to trigger a migration manually if automatic migration fails or is skipped.
```bash
# Check LiteLLM logs for migration output
oc logs -n litellm-rhpds deployment/litellm --tail=100 | grep -i migrat

# Check if database tables exist
oc exec -n litellm-rhpds litellm-postgres-0 -- \
  psql -U litellm -d litellm -c "\dt"
```
```bash
# Get the DATABASE_URL from the running LiteLLM deployment
DB_URL=$(oc exec -n litellm-rhpds deployment/litellm -- \
  sh -c 'echo $DATABASE_URL')

# Exec into a LiteLLM pod and run migration
LITELLM_POD=$(oc get pods -n litellm-rhpds -l app=litellm \
  -o jsonpath='{.items[0].metadata.name}')
oc exec -n litellm-rhpds $LITELLM_POD -- \
  python -c "from litellm.proxy.proxy_server import prisma_client; prisma_client.db.migrate.deploy()"

# Alternative: restart the deployment (migration runs on startup)
oc rollout restart deployment/litellm -n litellm-rhpds
oc rollout status deployment/litellm -n litellm-rhpds
```
Caution: Always take a database backup before running migrations between major LiteLLM versions. See the Operations page for upgrade procedures.
After deployment completes, verify the following before announcing the service to users.
- oc get pods -n <namespace> — expect litellm (3), litellm-backend (3), litellm-frontend (3), litellm-postgres-0, litellm-redis-*
- curl https://<api-route>/health/livenessz
- oc get secret litellm-secret -n <ns> -o yaml
- ./setup-key-cleanup-cronjob.sh <namespace>
- ./setup-litemaas-backup-cronjob.sh <namespace> <s3-bucket>

| Component | Replicas | CPU Request | Memory Request | Storage |
|---|---|---|---|---|
| LiteLLM proxy | 3 | 3 × 500m = 1500m | 3 × 512Mi = 1.5Gi | — |
| LiteMaaS backend | 3 | 3 × 100m = 300m | 3 × 256Mi = 768Mi | — |
| LiteMaaS frontend | 3 | 3 × 50m = 150m | 3 × 128Mi = 384Mi | — |
| PostgreSQL 16 | 1 (StatefulSet) | 500m | 512Mi | 10Gi RWO |
| Redis 7 | 1 | 200m | 256Mi | — |
| Total | — | ~2650m | ~3.4Gi | 10Gi |
```bash
# Remove via playbook (cleanly removes OAuthClient too)
ansible-playbook playbooks/deploy_litemaas_ha.yml \
  -e ocp4_workload_litemaas_namespace=litemaas \
  -e ocp4_workload_litemaas_remove=true

# Or delete the namespace directly (OAuthClient persists — clean it up separately)
oc delete namespace litemaas
oc delete oauthclient litemaas
```
Deleting the namespace also deletes the PostgreSQL PVC and all data. Take a backup first if you need to preserve data.