RHDP LiteMaaS

Model as a Service for Red Hat Demo Platform

Deployment Guide

Full installation of LiteMaaS on OpenShift using the rhpds.litemaas Ansible collection.

Prerequisites

Before running the deployment playbook, verify these requirements are met.

Cluster Requirements

Local Workstation Requirements

# Install Python dependencies
pip install kubernetes openshift

# Install required Ansible collections
ansible-galaxy collection install kubernetes.core community.general

Image Registry Access

The deployment pulls images from two registries. Verify cluster access before starting:

| Registry | Images | Auth Required |
|---|---|---|
| quay.io/rh-aiservices-bu | litellm-non-root, litemaas-backend, litemaas-frontend | Public pull |
| registry.redhat.io/rhel9 | redis-7 (primary; falls back to quay.io/sclorg) | Red Hat credentials |

The HA deployment task automatically tests image accessibility and falls back to an alternative image if the primary cannot be pulled. The fallback for Redis is quay.io/sclorg/redis-7-c9s:latest.
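The fallback decision can be sketched as a small shell helper. This is an illustrative assumption, not the role's actual code; `can_pull` stands in for whatever pull test the task performs:

```shell
# Pick the first image that can actually be pulled (illustrative sketch,
# not the role's code; can_pull stands in for the task's pull test,
# e.g. a wrapper around `skopeo inspect docker://IMAGE`).
pick_image() {
  primary="$1"
  fallback="$2"
  if can_pull "$primary"; then
    echo "$primary"
  else
    echo "$fallback"
  fi
}
```

With the Redis images above, an unreachable registry.redhat.io would yield quay.io/sclorg/redis-7-c9s:latest.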

Collection Installation

LiteMaaS is distributed as an Ansible collection. Install it from the repository or via Ansible Galaxy.

Option A: From GitHub (Recommended for RHDP)

# Clone the collection repository
git clone https://github.com/rhpds/rhpds.litemaas.git ~/work/code/rhpds.litemaas
cd ~/work/code/rhpds.litemaas

# Install the collection locally
ansible-galaxy collection install . --force

Option B: In AgnosticV (RHDP Catalog Items)

For catalog-deployed workloads, add the collection to your AgnosticV common.yaml:

# In agnosticv/config/catalog-item/common.yaml
requirements_collections:
  - name: rhpds.litemaas
    source: https://github.com/rhpds/rhpds.litemaas.git
    type: git
    version: main

workloads:
  - rhpds.litemaas.ocp4_workload_litemaas

Option C: Helm Chart (Kubernetes / OpenShift, no Ansible)

# Add the chart repository (if published) or install from local path
helm install litemaas ~/work/code/rh-litemaas/deployment/helm/litemaas/ \
  --namespace litemaas \
  --create-namespace \
  -f my-values.yaml

Deployment

Step 1: Configure Variables

Create an extra vars file. At minimum you need to set the namespace and decide whether to enable OAuth.

# litemaas-vars.yml
---
ocp4_workload_litemaas_namespace: "litemaas"

# OAuth login (recommended for production)
ocp4_workload_litemaas_oauth_enabled: true

# Scale LiteLLM replicas (default 3)
ocp4_workload_litemaas_ha_litellm_replicas: 3

# Enable RHDP branding
ocp4_workload_litemaas_branding_enabled: true

# Route prefix customization (optional)
ocp4_workload_litemaas_api_route_prefix: "litellm-prod"
ocp4_workload_litemaas_frontend_route_prefix: "litellm-prod-frontend"
ocp4_workload_litemaas_admin_route_prefix: "litellm-prod-admin"

# Pre-configure models (optional — can also be done via UI after deploy)
ocp4_workload_litemaas_litellm_models:
  - model_name: "granite-3-2-8b-instruct"
    litellm_model: "openai/granite-3-2-8b-instruct"
    api_base: "http://granite-3-2-8b-instruct-predictor.llm-hosting.svc.cluster.local/v1"
    api_key: "sk-placeholder"
    rpm: 120
    tpm: 200000

Step 2: Run the Playbook

# Deploy LiteMaaS in HA mode
ansible-playbook playbooks/deploy_litemaas_ha.yml \
  -e @litemaas-vars.yml

# Or pass variables inline
ansible-playbook playbooks/deploy_litemaas_ha.yml \
  -e ocp4_workload_litemaas_namespace=litemaas \
  -e ocp4_workload_litemaas_oauth_enabled=true \
  -e ocp4_workload_litemaas_branding_enabled=true

Step 3: Deployment Order

The HA task deploys components in this sequence. Each component waits for its predecessor to be ready before continuing:

1. Namespace
2. PostgreSQL 16 (StatefulSet + PVC)
3. Redis 7 (Deployment)
4. LiteLLM Proxy (3 replicas)
5. LiteMaaS Backend (3 replicas)
6. Branding ConfigMaps (if enabled)
7. LiteMaaS Frontend (3 replicas)
8. Routes + OAuthClient
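You can watch the same gating by hand after the playbook starts. The resource names below are assumptions based on the component list above; check `oc get deploy,sts -n litemaas` for the actual names:

```shell
# Wait for each tier in order; stop at the first one that fails to go ready.
# Resource names are assumptions based on the deployment order above.
components="statefulset/postgresql deployment/redis deployment/litellm \
deployment/litemaas-backend deployment/litemaas-frontend"
for c in $components; do
  oc rollout status "$c" -n litemaas --timeout=300s || break
done
```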

Step 4: Post-Deploy Output

After successful deployment, the playbook prints connection details via agnosticd_user_info:

# Example output
LiteLLM Admin Portal: https://litellm-prod.apps.cluster.example.com
LiteLLM Admin Login: admin / <auto-generated-password>
LiteLLM Master API Key: sk-<auto-generated-key>
LiteMaaS User Interface: https://litellm-prod-frontend.apps.cluster.example.com

Complete Variable Reference

All variables with their defaults from roles/ocp4_workload_litemaas/defaults/main.yml.

Core Settings

| Variable | Default | Required | Description |
|---|---|---|---|
| ocp4_workload_litemaas_namespace | litemaas | optional | OpenShift namespace to deploy into |
| ocp4_workload_litemaas_version | 0.2.2 | optional | Collection version tag (metadata only) |
| ocp4_workload_litemaas_remove | false | optional | Set to true to uninstall (deletes namespace) |
| ocp4_workload_litemaas_cluster_domain | "" | optional | Cluster apps domain (auto-detected from cluster if empty) |

Image Configuration

| Variable | Default | Description |
|---|---|---|
| ocp4_workload_litemaas_litellm_image | quay.io/rh-aiservices-bu/litellm-non-root | LiteLLM proxy image repository |
| ocp4_workload_litemaas_litellm_tag | main-v1.81.0-stable | LiteLLM image tag |
| ocp4_workload_litemaas_backend_image | quay.io/rhpds/litemaas | LiteMaaS backend image repository |
| ocp4_workload_litemaas_backend_tag | backend-0.2.2 | Backend image tag |
| ocp4_workload_litemaas_frontend_image | quay.io/rhpds/litemaas | LiteMaaS frontend image repository |
| ocp4_workload_litemaas_frontend_tag | frontend-0.2.2 | Frontend image tag |

Production images: The running production deployment uses quay.io/rh-aiservices-bu/litellm-non-root:main-v1.81.0-stable-custom for LiteLLM and quay.io/rh-aiservices-bu/litemaas-backend:0.4.0 / quay.io/rh-aiservices-bu/litemaas-frontend:0.4.0 for LiteMaaS. The production image registry is quay.io/rh-aiservices-bu, not the default quay.io/rhpds.

Route Configuration

| Variable | Default | Description |
|---|---|---|
| ocp4_workload_litemaas_api_route_prefix | litellm | Subdomain prefix for the LiteLLM API route |
| ocp4_workload_litemaas_admin_route_prefix | litellm-admin | Subdomain prefix for the admin/backend route |
| ocp4_workload_litemaas_frontend_route_prefix | litellm-frontend | Subdomain prefix for the frontend UI route |

Credentials (Auto-Generated)

| Variable | Default | Description |
|---|---|---|
| ocp4_workload_litemaas_postgres_password | random 32-char | PostgreSQL database password |
| ocp4_workload_litemaas_jwt_secret | random 32-char | JWT signing secret for LiteMaaS backend sessions |
| ocp4_workload_litemaas_admin_api_key | random 32-char | LiteMaaS admin API key (stored in backend-secret) |
| ocp4_workload_litemaas_litellm_api_key | `sk-` + random 32-char | LiteLLM master key (stored in litellm-secret) |
| ocp4_workload_litemaas_litellm_ui_username | admin | LiteLLM admin UI username |
| ocp4_workload_litemaas_litellm_ui_password | random 16-char | LiteLLM admin UI password |

Security: Never commit actual credentials to version control. The Ansible role generates secure random values at deploy time. Retrieve them after deployment from the OpenShift secrets, not from code.
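For example, the master key can be read back out of litellm-secret. This is a sketch; the secret names come from the table above, but the data key name inside each secret is an assumption, so inspect the secret for the exact key:

```shell
# Read a generated credential back out of an OpenShift secret.
# Secret names come from the table above; the data key (third argument)
# is an assumption -- inspect the secret for the exact key names.
get_secret_key() {
  oc get secret "$1" -n "$2" -o "jsonpath={.data.$3}" | base64 -d
}
get_secret_key litellm-secret litemaas LITELLM_MASTER_KEY
```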

PostgreSQL Configuration

| Variable | Default | Description |
|---|---|---|
| ocp4_workload_litemaas_postgres_version | 16 | PostgreSQL major version |
| ocp4_workload_litemaas_postgres_storage_size | 10Gi | PVC size for PostgreSQL data |
| ocp4_workload_litemaas_postgres_storage_class | "" | StorageClass (empty = auto-detect per cloud) |
| ocp4_workload_litemaas_postgres_storage_access_mode | ReadWriteOnce | PVC access mode |

HA-Specific Variables

| Variable | Default | Description |
|---|---|---|
| ocp4_workload_litemaas_ha_litellm_replicas | 3 | Number of LiteLLM proxy replicas |
| ocp4_workload_litemaas_ha_enable_redis | true | Deploy Redis for session/key caching |
| ocp4_workload_litemaas_ha_enable_postgres | true | Deploy PostgreSQL (set false to use external DB) |
| ocp4_workload_litemaas_ha_redis_image | registry.redhat.io/rhel9/redis-7:latest | Primary Redis image |
| ocp4_workload_litemaas_ha_redis_image_fallback | quay.io/sclorg/redis-7-c9s:latest | Fallback Redis image |
| ocp4_workload_litemaas_ha_postgres_pvc_size | 10Gi | PVC size for HA PostgreSQL |
| ocp4_workload_litemaas_ha_litellm_memory_limit | 2Gi | Per-pod memory limit for LiteLLM |
| ocp4_workload_litemaas_ha_litellm_cpu_limit | 2000m | Per-pod CPU limit for LiteLLM |

OAuth Configuration

| Variable | Default | Description |
|---|---|---|
| ocp4_workload_litemaas_oauth_enabled | false | Enable OpenShift OAuth login |
| ocp4_workload_litemaas_oauth_provider | openshift | OAuth provider type (openshift or oidc) |
| ocp4_workload_litemaas_oauth_client_id | namespace name | OAuthClient resource name |
| ocp4_workload_litemaas_oauth_client_secret | random 32-char | OAuthClient secret |
| ocp4_workload_litemaas_admin_emails | [admin] | Initial admin user list |
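When OAuth is enabled, the role creates a cluster-scoped OAuthClient along these lines. This is a sketch assembled from the variables above, not the role's exact template; the redirect URI path is an assumption:

```yaml
# Shape of the generated OAuthClient (sketch, not the role's exact template)
apiVersion: oauth.openshift.io/v1
kind: OAuthClient
metadata:
  name: litemaas               # oauth_client_id (defaults to the namespace name)
secret: "<random 32-char>"     # oauth_client_secret
grantMethod: auto
redirectURIs:
  - https://litellm-frontend.<cluster-apps-domain>/callback   # path is an assumption
```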

Resource Limits

| Variable | Default | Description |
|---|---|---|
| ocp4_workload_litemaas_litellm_memory_limit | 1Gi | LiteLLM memory limit (single-replica mode) |
| ocp4_workload_litemaas_litellm_cpu_limit | 1000m | LiteLLM CPU limit |
| ocp4_workload_litemaas_backend_memory_limit | 512Mi | Backend memory limit |
| ocp4_workload_litemaas_backend_cpu_limit | 500m | Backend CPU limit |
| ocp4_workload_litemaas_frontend_memory_limit | 256Mi | Frontend memory limit |
| ocp4_workload_litemaas_frontend_cpu_limit | 250m | Frontend CPU limit |

Cloud-Specific Storage Classes

When ocp4_workload_litemaas_postgres_storage_class is empty, the role auto-detects the cloud provider and selects the appropriate StorageClass:

| Cloud Provider | Storage Class |
|---|---|
| AWS | gp3-csi |
| Azure | managed-premium |
| GCP | standard-rwo |
| vSphere | thin-csi |
| OpenStack | standard |
| Bare Metal | ocs-external-storagecluster-ceph-rbd |
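The detection can be sketched as reading the cluster's reported platform and mapping it per the table. This is a hypothetical sketch, not the role's actual logic; in particular, treating every unrecognized platform as the bare-metal row is an assumption:

```shell
# Map a detected platform to the table's StorageClass (hypothetical sketch).
storage_class_for() {
  case "$1" in
    AWS)       echo "gp3-csi" ;;
    Azure)     echo "managed-premium" ;;
    GCP)       echo "standard-rwo" ;;
    VSphere)   echo "thin-csi" ;;
    OpenStack) echo "standard" ;;
    *)         echo "ocs-external-storagecluster-ceph-rbd" ;;  # bare metal / unknown
  esac
}

# Platform type as reported by the cluster (AWS, Azure, GCP, VSphere, ...).
platform="$(oc get infrastructure cluster \
  -o jsonpath='{.status.platformStatus.type}' 2>/dev/null || true)"
storage_class_for "$platform"
```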

LiteLLM Database Migration

LiteLLM uses Prisma ORM and runs database migrations automatically on startup. However, when upgrading between LiteLLM versions, you may need to trigger a migration manually if automatic migration fails or is skipped.

Check Migration Status

# Check LiteLLM logs for migration output
oc logs -n litellm-rhpds deployment/litellm --tail=100 | grep -i migrat

# Check if database tables exist
oc exec -n litellm-rhpds litellm-postgres-0 -- \
  psql -U litellm -d litellm -c "\dt"

Run Migration Manually

# Get the DATABASE_URL from the running LiteLLM deployment
DB_URL=$(oc exec -n litellm-rhpds deployment/litellm -- \
  sh -c 'echo $DATABASE_URL')

# Exec into a LiteLLM pod and run migration
LITELLM_POD=$(oc get pods -n litellm-rhpds -l app=litellm \
  -o jsonpath='{.items[0].metadata.name}')

oc exec -n litellm-rhpds $LITELLM_POD -- \
  prisma migrate deploy  # expects schema.prisma in the working directory; pass --schema otherwise

# Alternative: restart the deployment (migration runs on startup)
oc rollout restart deployment/litellm -n litellm-rhpds
oc rollout status deployment/litellm -n litellm-rhpds

Caution: Always take a database backup before running migrations between major LiteLLM versions. See the Operations page for upgrade procedures.
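A minimal pre-migration backup, using the same pod and credentials as the status check above:

```shell
# Dump the LiteLLM database to a local file before migrating.
oc exec -n litellm-rhpds litellm-postgres-0 -- \
  pg_dump -U litellm -d litellm > "litellm-backup-$(date +%Y%m%d).sql" \
  || echo "backup failed; do not proceed with the migration"
```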

Post-Install Checklist

After deployment completes, verify the following before announcing the service to users.
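A quick first check is LiteLLM's OpenAI-compatible model list endpoint, using the route and master key from the deploy output. The hostname and key below are the example values from Step 4; substitute your own:

```shell
# Smoke-test the API: list the models LiteLLM is serving.
LITELLM_URL="https://litellm-prod.apps.cluster.example.com"
LITELLM_KEY="sk-<auto-generated-key>"   # from the playbook output / litellm-secret
curl -sf "$LITELLM_URL/v1/models" \
  -H "Authorization: Bearer $LITELLM_KEY" || echo "API not reachable"
```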

Resource Summary (Standard HA Deployment)

| Component | Replicas | CPU Request | Memory Request | Storage |
|---|---|---|---|---|
| LiteLLM proxy | 3 | 3 × 500m = 1500m | 3 × 512Mi = 1.5Gi | |
| LiteMaaS backend | 3 | 3 × 100m = 300m | 3 × 256Mi = 768Mi | |
| LiteMaaS frontend | 3 | 3 × 50m = 150m | 3 × 128Mi = 384Mi | |
| PostgreSQL 16 | 1 (StatefulSet) | 500m | 512Mi | 10Gi RWO |
| Redis 7 | 1 | 200m | 256Mi | |
| Total | | ~2650m | ~3.4Gi | 10Gi |

Removal / Uninstall

# Remove via playbook (cleanly removes OAuthClient too)
ansible-playbook playbooks/deploy_litemaas_ha.yml \
  -e ocp4_workload_litemaas_namespace=litemaas \
  -e ocp4_workload_litemaas_remove=true

# Or delete the namespace directly (OAuthClient persists — clean it up separately)
oc delete namespace litemaas
oc delete oauthclient litemaas

Deleting the namespace also deletes the PostgreSQL PVC and all data. Take a backup first if you need to preserve data.