Model as a Service for Red Hat Demo Platform
Full installation of LiteMaaS on OpenShift using the rhpds.litemaas Ansible collection.
Before running the deployment playbook, verify these requirements are met.
- Pull access to `quay.io` and `registry.redhat.io`
- `kubernetes.core` collection
- `oc` CLI authenticated to the target cluster
- `python3` with the `kubernetes` and `openshift` pip packages

```shell
# Install Python dependencies
pip install kubernetes openshift

# Install required Ansible collections
ansible-galaxy collection install kubernetes.core community.general
```
The deployment pulls images from two registries. Verify cluster access before starting:
| Registry | Images | Auth Required |
|---|---|---|
| `quay.io/rh-aiservices-bu` | litellm-non-root, litemaas-backend, litemaas-frontend | Public pull |
| `registry.redhat.io/rhel9` | redis-7 (primary; falls back to `quay.io/sclorg`) | Red Hat credentials |
The HA deployment task automatically tests image accessibility and falls back to an alternative image if the primary cannot be pulled. The fallback for Redis is quay.io/sclorg/redis-7-c9s:latest.
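The fallback behavior amounts to a probe-then-select step. The sketch below is illustrative only: the real task is implemented in Ansible, and the `image_pullable` stub here stands in for an actual pull probe (for example, `skopeo inspect`):

```shell
# Sketch of primary/fallback Redis image selection (the probe is stubbed).
PRIMARY="registry.redhat.io/rhel9/redis-7:latest"
FALLBACK="quay.io/sclorg/redis-7-c9s:latest"

image_pullable() {
  # Stub: pretend only quay.io is reachable (e.g. no registry.redhat.io credentials).
  # A real probe might run: skopeo inspect "docker://$1" >/dev/null 2>&1
  [ "${1%%/*}" = "quay.io" ]
}

if image_pullable "$PRIMARY"; then
  REDIS_IMAGE="$PRIMARY"
else
  REDIS_IMAGE="$FALLBACK"
fi
echo "Using Redis image: $REDIS_IMAGE"
```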
Collection Installation vs. Deployment: These are two separate steps. Collection Installation puts the Ansible roles on your workstation (done once). Deployment (next section) runs the playbook to actually install LiteMaaS on your OpenShift cluster. You need to do both.
LiteMaaS is distributed as an Ansible collection. Install it on your workstation using one of the options below.
```shell
# Clone the collection repository
git clone https://github.com/rhpds/rhpds.litemaas.git ~/work/code/rhpds.litemaas
cd ~/work/code/rhpds.litemaas

# Install the collection locally
ansible-galaxy collection install . --force
```
For catalog-deployed workloads, add the collection to your AgnosticV common.yaml:
```yaml
# In agnosticv/config/catalog-item/common.yaml
requirements_collections:
  - name: rhpds.litemaas
    source: https://github.com/rhpds/rhpds.litemaas.git
    type: git
    version: main
workloads:
  - rhpds.litemaas.ocp4_workload_litemaas
```
```shell
# Add the chart repository (if published) or install from local path
helm install litemaas ~/work/code/rh-litemaas/deployment/helm/litemaas/ \
  --namespace litemaas \
  --create-namespace \
  -f my-values.yaml
```
Create an extra vars file. At minimum you need to set the namespace and decide whether to enable OAuth.
```yaml
# litemaas-vars.yml
---
ocp4_workload_litemaas_namespace: "litemaas"

# OAuth login (recommended for production)
ocp4_workload_litemaas_oauth_enabled: true

# Scale LiteLLM replicas (default 3)
ocp4_workload_litemaas_ha_litellm_replicas: 3

# Route prefix customization (optional)
ocp4_workload_litemaas_api_route_prefix: "litellm-prod"
ocp4_workload_litemaas_frontend_route_prefix: "litellm-prod-frontend"
ocp4_workload_litemaas_admin_route_prefix: "litellm-prod-admin"

# Pre-configure models (optional; can also be done via UI after deploy)
ocp4_workload_litemaas_litellm_models:
  - model_name: "granite-3-2-8b-instruct"
    litellm_model: "openai/granite-3-2-8b-instruct"
    api_base: "http://granite-3-2-8b-instruct-predictor.llm-hosting.svc.cluster.local/v1"
    api_key: "sk-placeholder"
    rpm: 120
    tpm: 200000
```
```shell
# Deploy LiteMaaS in HA mode
ansible-playbook playbooks/deploy_litemaas_ha.yml \
  -e @litemaas-vars.yml

# Or pass variables inline
ansible-playbook playbooks/deploy_litemaas_ha.yml \
  -e ocp4_workload_litemaas_namespace=litemaas \
  -e ocp4_workload_litemaas_oauth_enabled=true
```
The HA task deploys components in this sequence. Each component waits for its predecessor to be ready before continuing:
After successful deployment, the playbook prints connection details via agnosticd_user_info:
```
# Example output
LiteLLM Admin Portal: https://litellm-prod.apps.cluster.example.com
LiteLLM Admin Login: admin / <auto-generated-password>
LiteLLM Master API Key: sk-<auto-generated-key>
LiteMaaS User Interface: https://litellm-prod-frontend.apps.cluster.example.com
```
All variables with their defaults from `roles/ocp4_workload_litemaas/defaults/main.yml`.
| Variable | Default | Required | Description |
|---|---|---|---|
| `ocp4_workload_litemaas_namespace` | `litemaas` | optional | OpenShift namespace to deploy into |
| `ocp4_workload_litemaas_version` | `0.2.2` | optional | Collection version tag (metadata only) |
| `ocp4_workload_litemaas_remove` | `false` | optional | Set to `true` to uninstall (deletes the namespace) |
| `ocp4_workload_litemaas_cluster_domain` | `""` | optional | Cluster apps domain (auto-detected from the cluster if empty) |
| Variable | Default | Description |
|---|---|---|
| `ocp4_workload_litemaas_litellm_image` | `quay.io/rh-aiservices-bu/litellm-non-root` | LiteLLM proxy image repository |
| `ocp4_workload_litemaas_litellm_tag` | `main-v1.81.0-stable` | LiteLLM image tag |
| `ocp4_workload_litemaas_backend_image` | `quay.io/rhpds/litemaas` | LiteMaaS backend image repository |
| `ocp4_workload_litemaas_backend_tag` | `backend-0.2.2` | Backend image tag |
| `ocp4_workload_litemaas_frontend_image` | `quay.io/rhpds/litemaas` | LiteMaaS frontend image repository |
| `ocp4_workload_litemaas_frontend_tag` | `frontend-0.2.2` | Frontend image tag |
Production images: The running production deployment uses `quay.io/rh-aiservices-bu/litellm-non-root:main-v1.81.0-stable-custom` for LiteLLM, and `quay.io/rh-aiservices-bu/litemaas-backend:0.4.0` / `quay.io/rh-aiservices-bu/litemaas-frontend:0.4.0` for LiteMaaS. Note that the production image registry is `quay.io/rh-aiservices-bu`, not the default `quay.io/rhpds`.
| Variable | Default | Description |
|---|---|---|
| `ocp4_workload_litemaas_api_route_prefix` | `litellm` | Subdomain prefix for the LiteLLM API route |
| `ocp4_workload_litemaas_admin_route_prefix` | `litellm-admin` | Subdomain prefix for the admin/backend route |
| `ocp4_workload_litemaas_frontend_route_prefix` | `litellm-frontend` | Subdomain prefix for the frontend UI route |
| Variable | Default | Description |
|---|---|---|
| `ocp4_workload_litemaas_postgres_password` | random 32-char | PostgreSQL database password |
| `ocp4_workload_litemaas_jwt_secret` | random 32-char | JWT signing secret for LiteMaaS backend sessions |
| `ocp4_workload_litemaas_admin_api_key` | random 32-char | LiteMaaS admin API key (stored in `backend-secret`) |
| `ocp4_workload_litemaas_litellm_api_key` | `sk-` + random 32-char | LiteLLM master key (stored in `litellm-secret`) |
| `ocp4_workload_litemaas_litellm_ui_username` | `admin` | LiteLLM admin UI username |
| `ocp4_workload_litemaas_litellm_ui_password` | random 16-char | LiteLLM admin UI password |
Security: Never commit actual credentials to version control. The Ansible role generates secure random values at deploy time. Retrieve them after deployment from the OpenShift secrets, not from code.
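To read a generated value back after deployment, extract it from the corresponding secret. Secret data is base64-encoded, so pipe it through `base64 -d`. The data key name used below (`MASTER_KEY`) is an assumption for illustration; check the real key names on your cluster with `oc describe secret litellm-secret`:

```shell
# Hypothetical retrieval of a generated key (key name is an assumption):
# oc get secret litellm-secret -n litemaas \
#   -o jsonpath='{.data.MASTER_KEY}' | base64 -d

# The decode step itself, demonstrated with a stand-in value:
encoded=$(printf 'sk-example-key' | base64)   # what the jsonpath query returns
printf '%s' "$encoded" | base64 -d            # recovers the plaintext key
```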
| Variable | Default | Description |
|---|---|---|
| `ocp4_workload_litemaas_postgres_version` | `16` | PostgreSQL major version |
| `ocp4_workload_litemaas_postgres_storage_size` | `10Gi` | PVC size for PostgreSQL data |
| `ocp4_workload_litemaas_postgres_storage_class` | `""` | StorageClass (empty = auto-detect per cloud) |
| `ocp4_workload_litemaas_postgres_storage_access_mode` | `ReadWriteOnce` | PVC access mode |
| Variable | Default | Description |
|---|---|---|
| `ocp4_workload_litemaas_ha_litellm_replicas` | `3` | Number of LiteLLM proxy replicas |
| `ocp4_workload_litemaas_ha_enable_redis` | `true` | Deploy Redis for session/key caching |
| `ocp4_workload_litemaas_ha_enable_postgres` | `true` | Deploy PostgreSQL (set `false` to use an external DB) |
| `ocp4_workload_litemaas_ha_redis_image` | `registry.redhat.io/rhel9/redis-7:latest` | Primary Redis image |
| `ocp4_workload_litemaas_ha_redis_image_fallback` | `quay.io/sclorg/redis-7-c9s:latest` | Fallback Redis image |
| `ocp4_workload_litemaas_ha_postgres_pvc_size` | `10Gi` | PVC size for HA PostgreSQL |
| `ocp4_workload_litemaas_ha_litellm_memory_limit` | `2Gi` | Per-pod memory limit for LiteLLM |
| `ocp4_workload_litemaas_ha_litellm_cpu_limit` | `2000m` | Per-pod CPU limit for LiteLLM |
| Variable | Default | Description |
|---|---|---|
| `ocp4_workload_litemaas_oauth_enabled` | `false` | Enable OpenShift OAuth login |
| `ocp4_workload_litemaas_oauth_provider` | `openshift` | OAuth provider type (`openshift` or `oidc`) |
| `ocp4_workload_litemaas_oauth_client_id` | namespace name | OAuthClient resource name |
| `ocp4_workload_litemaas_oauth_client_secret` | random 32-char | OAuthClient secret |
| `ocp4_workload_litemaas_admin_emails` | `[admin]` | Initial admin user list |
These control the defaults applied to every new user on first OAuth login. Existing users are not affected by changes — update them via the admin UI or API.
| Variable | Default | Description |
|---|---|---|
| `ocp4_workload_litemaas_default_user_max_budget` | `2000` | Default spend limit per user (in USD) |
| `ocp4_workload_litemaas_default_user_budget_duration` | `daily` | Budget reset period: `daily`, `weekly`, `monthly`, or leave empty for lifetime |
| `ocp4_workload_litemaas_default_user_tpm_limit` | `400000` | Default tokens-per-minute limit per user |
| `ocp4_workload_litemaas_default_user_rpm_limit` | `500` | Default requests-per-minute limit per user |
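For example, a short-lived workshop environment might override the defaults with tighter limits in the extra vars file (the values here are illustrative, not recommendations):

```yaml
# Illustrative overrides for a short-lived workshop environment
ocp4_workload_litemaas_default_user_max_budget: 50          # USD
ocp4_workload_litemaas_default_user_budget_duration: "weekly"
ocp4_workload_litemaas_default_user_tpm_limit: 100000
ocp4_workload_litemaas_default_user_rpm_limit: 100
```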
| Variable | Default | Description |
|---|---|---|
| `ocp4_workload_litemaas_litellm_memory_limit` | `1Gi` | LiteLLM memory limit (single-replica mode) |
| `ocp4_workload_litemaas_litellm_cpu_limit` | `1000m` | LiteLLM CPU limit |
| `ocp4_workload_litemaas_backend_memory_limit` | `512Mi` | Backend memory limit |
| `ocp4_workload_litemaas_backend_cpu_limit` | `500m` | Backend CPU limit |
| `ocp4_workload_litemaas_frontend_memory_limit` | `256Mi` | Frontend memory limit |
| `ocp4_workload_litemaas_frontend_cpu_limit` | `250m` | Frontend CPU limit |
When `ocp4_workload_litemaas_postgres_storage_class` is empty, the role auto-detects the cloud provider and selects the appropriate StorageClass:
| Cloud Provider | Storage Class |
|---|---|
| AWS | gp3-csi |
| Azure | managed-premium |
| GCP | standard-rwo |
| vSphere | thin-csi |
| OpenStack | standard |
| Bare Metal | ocs-external-storagecluster-ceph-rbd |
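The selection amounts to a lookup keyed on the platform the cluster reports. A shell sketch of the mapping (the real logic lives in the Ansible role; the platform string would come from something like `oc get infrastructure cluster -o jsonpath='{.status.platform}'`, and the case labels below are assumptions about how each provider is reported):

```shell
# Map a detected platform to the StorageClass the role would pick (illustrative).
detect_storage_class() {
  case "$1" in
    AWS)       echo "gp3-csi" ;;
    Azure)     echo "managed-premium" ;;
    GCP)       echo "standard-rwo" ;;
    VSphere)   echo "thin-csi" ;;
    OpenStack) echo "standard" ;;
    *)         echo "ocs-external-storagecluster-ceph-rbd" ;;  # bare metal / unknown
  esac
}

detect_storage_class "AWS"
```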
LiteLLM uses Prisma ORM and runs database migrations automatically on startup. However, when upgrading between LiteLLM versions, you may need to trigger a migration manually if automatic migration fails or is skipped.
```shell
# Check LiteLLM logs for migration output
oc logs -n litellm-rhpds deployment/litellm --tail=100 | grep -i migrat

# Check if database tables exist
oc exec -n litellm-rhpds litellm-postgres-0 -- \
  psql -U litellm -d litellm -c "\dt"
```
```shell
# Get the DATABASE_URL from the running LiteLLM deployment
DB_URL=$(oc exec -n litellm-rhpds deployment/litellm -- \
  sh -c 'echo $DATABASE_URL')

# Exec into a LiteLLM pod and run the migration
LITELLM_POD=$(oc get pods -n litellm-rhpds -l app=litellm \
  -o jsonpath='{.items[0].metadata.name}')
oc exec -n litellm-rhpds $LITELLM_POD -- \
  python -c "from litellm.proxy.proxy_server import prisma_client; prisma_client.db.migrate.deploy()"

# Alternative: restart the deployment (migration runs on startup)
oc rollout restart deployment/litellm -n litellm-rhpds
oc rollout status deployment/litellm -n litellm-rhpds
```
Caution: Always take a database backup before running migrations between major LiteLLM versions. See the Operations page for upgrade procedures.
After deployment completes, verify the following before announcing the service to users.
- `oc get pods -n <namespace>`: expect `litellm` (3), `litellm-backend` (3), `litellm-frontend` (3), `litellm-postgres-0`, and `litellm-redis-*` pods
- `curl https://<api-route>/health/liveness`: check the API health endpoint
- `oc get secret litellm-secret -n <ns> -o yaml`: confirm the generated credentials
- Grant admin access with `oc adm groups add-users litemaas-admins user@redhat.com`, then have the user log in via the frontend
- Schedule maintenance jobs: `./setup-key-cleanup-cronjob.sh <namespace>` and `./setup-litemaas-backup-cronjob.sh <namespace> <s3-bucket>`

LiteMaaS determines admin access by checking OpenShift group membership at OAuth login time. Users in the `litemaas-admins` OCP Group are granted the admin role automatically when they log in.
```shell
# Add a user to the litemaas-admins OCP group
oc adm groups add-users litemaas-admins user@redhat.com

# Verify group membership
oc get group litemaas-admins -o yaml
```
If the `litemaas-admins` group does not exist yet, create it first: `oc adm groups new litemaas-admins`
Have the user visit the LiteMaaS frontend and log in with their OpenShift credentials. The backend reads their group membership from OCP during the OAuth callback and assigns the admin role automatically. No further action is needed.
Role hierarchy: admin — full platform control | adminReadonly — read-only access to all admin views | user — default role on first login.
| Component | Replicas | CPU Request | Memory Request | Storage |
|---|---|---|---|---|
| LiteLLM proxy | 3 | 3 × 500m = 1500m | 3 × 512Mi = 1.5Gi | — |
| LiteMaaS backend | 3 | 3 × 100m = 300m | 3 × 256Mi = 768Mi | — |
| LiteMaaS frontend | 3 | 3 × 50m = 150m | 3 × 128Mi = 384Mi | — |
| PostgreSQL 16 | 1 (StatefulSet) | 500m | 512Mi | 10Gi RWO |
| Redis 7 | 1 | 200m | 256Mi | — |
| Total | — | ~2650m | ~3.4Gi | 10Gi |
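The totals row can be sanity-checked by summing the per-component requests (CPU in millicores, memory in Mi, three replicas each for the LiteLLM, backend, and frontend tiers):

```shell
# CPU: 3 LiteLLM x 500m + 3 backend x 100m + 3 frontend x 50m + Postgres 500m + Redis 200m
cpu=$(( 3*500 + 3*100 + 3*50 + 500 + 200 ))
# Memory: 3 x 512Mi + 3 x 256Mi + 3 x 128Mi + 512Mi + 256Mi
mem=$(( 3*512 + 3*256 + 3*128 + 512 + 256 ))
echo "CPU requests: ${cpu}m"      # 2650m
echo "Memory requests: ${mem}Mi"  # 3456Mi, roughly 3.4Gi
```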
```shell
# Remove via playbook (cleanly removes the OAuthClient too)
ansible-playbook playbooks/deploy_litemaas_ha.yml \
  -e ocp4_workload_litemaas_namespace=litemaas \
  -e ocp4_workload_litemaas_remove=true

# Or delete the namespace directly (the OAuthClient persists; clean it up separately)
oc delete namespace litemaas
oc delete oauthclient litemaas
```
Deleting the namespace also deletes the PostgreSQL PVC and all data. Take a backup first if you need to preserve data.