RHDP LiteMaaS

Model as a Service for Red Hat Demo Platform

Day-2 Operations

Upgrading, scaling, model management, key cleanup, and database synchronization for a running LiteMaaS deployment.

Upgrading LiteLLM and LiteMaaS Images

In production, upgrades are handled by updating the deployment image tags. LiteMaaS uses rolling deployments — pods are replaced one at a time, so the service remains available during upgrades.

Current Production Images (from oc get deployment -n litellm-rhpds -o wide)

| Deployment | Current Image | Version |
|---|---|---|
| litellm | quay.io/rh-aiservices-bu/litellm-non-root | main-v1.81.0-stable-custom |
| litellm-backend | quay.io/rh-aiservices-bu/litemaas-backend | 0.4.0 |
| litellm-frontend | quay.io/rh-aiservices-bu/litemaas-frontend | 0.4.0 |
| litellm-redis | registry.redhat.io/rhel9/redis-7 | latest |

Upgrade LiteMaaS Backend/Frontend

# Set namespace variable
NS=litellm-rhpds
NEW_VERSION=0.5.0

# Update backend image
oc set image deployment/litellm-backend \
  backend=quay.io/rh-aiservices-bu/litemaas-backend:${NEW_VERSION} \
  -n ${NS}

# Update frontend image
oc set image deployment/litellm-frontend \
  frontend=quay.io/rh-aiservices-bu/litemaas-frontend:${NEW_VERSION} \
  -n ${NS}

# Monitor rollout
oc rollout status deployment/litellm-backend -n ${NS}
oc rollout status deployment/litellm-frontend -n ${NS}

Frontend version mismatch: If the frontend shows a stale version after upgrade, check that the init container image (if used for static asset injection) matches the new frontend tag. A mismatch between the main container and init container is a common cause of version confusion. See Troubleshooting for the fix.

Upgrade LiteLLM Proxy

# Before upgrading LiteLLM, take a database backup
oc exec -n ${NS} litellm-postgres-0 -- \
  pg_dump -U litellm litellm | gzip > /tmp/litellm-pre-upgrade.sql.gz

# Update LiteLLM image
NEW_LITELLM_TAG=main-v1.85.0-stable-custom
oc set image deployment/litellm \
  litellm=quay.io/rh-aiservices-bu/litellm-non-root:${NEW_LITELLM_TAG} \
  -n ${NS}

# Watch the rollout -- LiteLLM runs DB migrations on startup
oc rollout status deployment/litellm -n ${NS}

# Check logs for migration output
oc logs -n ${NS} deployment/litellm --tail=50 | grep -i migrat

Upgrade via Ansible (Re-deploy)

# Override image tags in your vars file, then re-run the playbook
ansible-playbook playbooks/deploy_litemaas_ha.yml \
  -e ocp4_workload_litemaas_namespace=litellm-rhpds \
  -e ocp4_workload_litemaas_litellm_tag=main-v1.85.0-stable-custom \
  -e ocp4_workload_litemaas_version=0.5.0

Scaling Replicas

All three main deployments (LiteLLM, Backend, Frontend) support horizontal scaling. Scaling is stateless for Backend and Frontend; LiteLLM uses Redis for shared session state across replicas.

Scale LiteLLM (the most common scale target)

# Scale LiteLLM to 5 replicas
oc scale deployment/litellm --replicas=5 -n litellm-rhpds

# Or via patch
oc patch deployment/litellm -n litellm-rhpds \
  --type=json -p='[{"op":"replace","path":"/spec/replicas","value":5}]'

# Verify
oc get deployment/litellm -n litellm-rhpds

Scale Backend and Frontend

oc scale deployment/litellm-backend --replicas=3 -n litellm-rhpds
oc scale deployment/litellm-frontend --replicas=3 -n litellm-rhpds

Check Current Scale

oc get deployments -n litellm-rhpds -o wide

Redis is required for multi-replica LiteLLM. Without Redis, each LiteLLM pod has its own in-memory session cache. Key validation will be inconsistent across replicas if Redis is not running. Verify Redis is healthy before scaling LiteLLM beyond 1 replica: oc get pods -n litellm-rhpds -l app=litellm-redis

Key Cleanup Cronjob

In a production deployment, virtual API keys accumulate over time. Workshop participants receive 30-day keys, and after the workshop ends, those keys remain in LiteLLM's database consuming storage and complicating admin views. The key cleanup cronjob handles this automatically.

What the Cronjob Does

  1. Runs daily at 2 AM on the bastion host
  2. Fetches all virtual keys from the LiteLLM API (with pagination, 100 keys per page)
  3. Identifies keys that match either deletion condition:
    • Keys where expires timestamp is in the past (expired keys)
    • Keys created more than 30 days ago (calculated from expires - duration metadata)
  4. Deletes each matching key via POST /key/delete
  5. For each deleted key: updates the matching api_keys row in LiteMaaS PostgreSQL, marking it inactive and setting the revoked_at timestamp and sync_status='error'
  6. Final pass: marks any remaining LiteMaaS api_keys records inactive if their corresponding LiteLLM_VerificationToken no longer exists
  7. All output is written to /var/log/litemaas-key-cleanup.log with logrotate (30-day retention)
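
Before trusting the cronjob with real deletions, the expiry test in step 3 can be rehearsed offline against a canned /key/list response. A minimal sketch with sample data (an illustration of the condition, not the script's actual code):

```shell
# The first deletion condition, `expires` in the past, applied to a sample
# payload. ISO-8601 timestamps compare correctly as plain strings.
SAMPLE='{"keys":[
  {"key_alias":"workshop-old","expires":"2020-01-01T00:00:00"},
  {"key_alias":"workshop-new","expires":"2099-01-01T00:00:00"}]}'
NOW=$(date -u +%Y-%m-%dT%H:%M:%S)
echo "$SAMPLE" | jq -r --arg now "$NOW" \
  '.keys[] | select(.expires != null and .expires < $now) | .key_alias'
# -> workshop-old
```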

Setup

# Run from the rhpds.litemaas repository root
# Detects whether running on bastion (direct write) or workstation (SSH)
./setup-key-cleanup-cronjob.sh litellm-rhpds

The script creates:

  • The cleanup script at /usr/local/bin/cleanup-litemaas-keys-litellm-rhpds.sh (suffixed with the namespace)
  • A root crontab entry for the daily 2 AM run
  • A logrotate configuration for /var/log/litemaas-key-cleanup.log with 30-day retention

Manual Test Run

# Run manually (dry-run not supported — this will actually delete keys)
sudo /usr/local/bin/cleanup-litemaas-keys-litellm-rhpds.sh

# Tail the log
sudo tail -f /var/log/litemaas-key-cleanup.log

# Verify cronjob is installed
sudo crontab -l | grep cleanup-litemaas

Remove the Cronjob

sudo crontab -l | grep -v cleanup-litemaas-keys-litellm-rhpds.sh | sudo crontab -
sudo rm /usr/local/bin/cleanup-litemaas-keys-litellm-rhpds.sh

Adding Models

Models must be registered in two places: the LiteLLM proxy (which handles actual inference routing) and the LiteMaaS backend database (which powers the user-facing model catalog). Failing to sync both causes the "subscription foreign key violation" error.
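
The failure mode is plain relational integrity: subscription rows reference the models table by ID, so a model that exists only in LiteLLM cannot be subscribed to. A throwaway sqlite3 demonstration of the same constraint (table and column names are simplified stand-ins, not the real LiteMaaS schema):

```shell
rm -f /tmp/fk-demo.db
# The trailing `|| true` absorbs sqlite3's non-zero exit from the deliberate error
sqlite3 /tmp/fk-demo.db <<'SQL' || true
PRAGMA foreign_keys = ON;
CREATE TABLE models (id TEXT PRIMARY KEY);
CREATE TABLE subscriptions (model_id TEXT REFERENCES models(id));
INSERT INTO models VALUES ('llama-3-1-70b-instruct');
-- Succeeds: the model is in the catalog
INSERT INTO subscriptions VALUES ('llama-3-1-70b-instruct');
-- Fails with "FOREIGN KEY constraint failed": model registered in LiteLLM only
INSERT INTO subscriptions VALUES ('model-missing-from-catalog');
SQL
sqlite3 /tmp/fk-demo.db "SELECT COUNT(*) FROM subscriptions;"
# -> 1
```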

Method 1: LiteLLM Admin UI + Manual Sync (Quick)

  1. Log in to the LiteLLM admin UI at the API route URL (e.g., https://litellm-prod.apps.maas.redhatworkshops.io)
  2. Navigate to Models → Add Model
  3. Fill in:
    • Provider: OpenAI-Compatible (for RHOAI KServe models)
    • Model Name: e.g., llama-3-1-70b-instruct
    • API Base: Internal KServe service URL, e.g., http://llama-3-1-70b-predictor.llm-hosting.svc.cluster.local/v1
    • API Key: Service account token or sk-placeholder if no auth required
  4. Click Test Connect, then Add Model
  5. Run the sync playbook to push the model to LiteMaaS backend DB
# After adding via UI, sync to LiteMaaS backend database
LITELLM_URL=$(oc get route litellm-prod -n litellm-rhpds \
  -o jsonpath='https://{.spec.host}')
LITELLM_KEY=$(oc get secret litellm-secret -n litellm-rhpds \
  -o jsonpath='{.data.LITELLM_MASTER_KEY}' | base64 -d)

ansible-playbook playbooks/manage_models.yml \
  -e litellm_url="${LITELLM_URL}" \
  -e litellm_master_key="${LITELLM_KEY}" \
  -e ocp4_workload_litemaas_models_namespace=litellm-rhpds \
  -e ocp4_workload_litemaas_models_sync_from_litellm=true \
  -e '{"ocp4_workload_litemaas_models_list": []}'

Method 2: Ansible Playbook (Full)

# Create a model config file
cat > new-model.yml <<'EOF'
litellm_url: "https://litellm-prod.apps.maas.redhatworkshops.io"
litellm_master_key: "sk-xxxxx"
ocp4_workload_litemaas_models_namespace: "litellm-rhpds"
ocp4_workload_litemaas_models_backend_enabled: true
ocp4_workload_litemaas_models_list:
  - model_name: "llama-3-1-70b-instruct"
    litellm_model: "openai/llama-3-1-70b-instruct"
    api_base: "http://llama-3-1-70b-predictor.llm-hosting.svc.cluster.local/v1"
    api_key: "sk-placeholder"
    display_name: "Llama 3.1 70B Instruct"
    description: "Meta Llama 3.1 70B instruction-tuned model"
    provider: "openshift-ai"
    category: "chat"
    context_length: 131072
    rpm: 30
    tpm: 500000
EOF

ansible-playbook playbooks/manage_models.yml -e @new-model.yml

Verify Model Registration

# Check LiteLLM has the model
curl -X GET "${LITELLM_URL}/model/info" \
  -H "Authorization: Bearer ${LITELLM_KEY}" | \
  jq '.data[] | select(.model_name == "llama-3-1-70b-instruct") | .model_name'

# Check LiteMaaS backend DB has the model
oc exec -n litellm-rhpds litellm-postgres-0 -- \
  psql -U litellm -d litellm -c \
  "SELECT id, name, provider, availability FROM models WHERE id = 'llama-3-1-70b-instruct';"

Model Parameters Reference

| Parameter | Required | Default | Description |
|---|---|---|---|
| model_name | Yes | - | Unique identifier passed to LiteLLM and used as model ID |
| litellm_model | Yes | - | LiteLLM model format: openai/model-id for OpenAI-compatible endpoints |
| api_base | Yes | - | Inference endpoint URL. Use internal ClusterIP for KServe models. |
| api_key | Yes | - | Auth token. Use sk-placeholder if the endpoint requires no auth. |
| display_name | No | model_name | Human-readable name shown in LiteMaaS UI |
| provider | No | openshift-ai | Provider label (informational) |
| category | No | general | Type: chat, code, general, embeddings |
| context_length | No | null | Context window in tokens (display only) |
| rpm | No | null | Requests per minute limit enforced by LiteLLM |
| tpm | No | null | Tokens per minute limit enforced by LiteLLM |
| supports_streaming | No | true | Enable streaming responses |

LITELLM_AUTO_SYNC

The LITELLM_AUTO_SYNC environment variable controls whether the LiteMaaS backend automatically synchronizes its model database from LiteLLM on startup and at periodic intervals. When enabled, the backend pulls the current model list from LiteLLM and updates its own models table accordingly.

Check Current Setting

oc exec -n litellm-rhpds deployment/litellm-backend -- \
  sh -c 'echo LITELLM_AUTO_SYNC=$LITELLM_AUTO_SYNC'

Enable Auto-Sync

# Set env var on the backend deployment
oc set env deployment/litellm-backend \
  LITELLM_AUTO_SYNC=true \
  -n litellm-rhpds

# Restart to apply
oc rollout restart deployment/litellm-backend -n litellm-rhpds

When to Use Auto-Sync

| Scenario | Recommendation |
|---|---|
| Models added only via Ansible playbook | Auto-sync unnecessary — playbook handles both LiteLLM and backend DB |
| Models frequently added/removed via LiteLLM admin UI | Enable auto-sync to avoid running the sync playbook after each change |
| Production with tight change control | Disable auto-sync; use the explicit sync playbook so changes are deliberate |

Auto-sync and model deletion: If a model is deleted from LiteLLM and auto-sync is enabled, the backend will remove it from its database too — which will cascade-delete user subscriptions to that model. Be cautious about deleting models from LiteLLM when users have active subscriptions.

DB Sync Between LiteMaaS and LiteLLM

LiteMaaS maintains its own database tables (users, subscriptions, api_keys, models) that must stay in sync with LiteLLM's LiteLLM_VerificationToken and LiteLLM_ModelTable tables. Divergence between the two databases is the root cause of most operational issues.

Understanding the Two Databases

Both LiteMaaS backend and LiteLLM proxy connect to the same PostgreSQL instance (litellm-postgres-0), but use different schemas/tables:

| Table | Owner | Purpose |
|---|---|---|
| users | LiteMaaS backend | User accounts, roles, OAuth IDs |
| models | LiteMaaS backend | Model catalog with capability metadata |
| subscriptions | LiteMaaS backend | User-to-model subscription records |
| api_keys | LiteMaaS backend | Virtual key tracking with LiteLLM alias references |
| LiteLLM_VerificationToken | LiteLLM proxy | The actual virtual key records with spend/limits |
| LiteLLM_ModelTable | LiteLLM proxy | LiteLLM's internal model registry |
| LiteLLM_SpendLogs | LiteLLM proxy | Per-request spend audit trail |
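
The NOT EXISTS condition used by the orphan check below can be rehearsed offline by exporting the alias column from each table and comparing sorted lists. A local sketch with sample data:

```shell
# Sample alias exports (in practice produced with psql \copy from each table)
printf 'alias-a\nalias-b\nalias-c\n' | sort > /tmp/litemaas-aliases.txt
printf 'alias-a\nalias-c\n'          | sort > /tmp/litellm-aliases.txt

# comm -23 prints lines only in the first file:
# LiteMaaS aliases with no LiteLLM_VerificationToken record
comm -23 /tmp/litemaas-aliases.txt /tmp/litellm-aliases.txt
# -> alias-b
```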

Check for Orphaned Keys (LiteMaaS keys with no LiteLLM record)

oc exec -n litellm-rhpds litellm-postgres-0 -- \
  psql -U litellm -d litellm -c "
SELECT ak.id, ak.litellm_key_alias, ak.is_active, ak.created_at
FROM api_keys ak
WHERE ak.is_active = true
  AND ak.litellm_key_alias IS NOT NULL
  AND NOT EXISTS (
    SELECT 1 FROM \"LiteLLM_VerificationToken\" lv
    WHERE lv.key_alias = ak.litellm_key_alias
  )
LIMIT 20;"

Fix Orphaned Keys

# Mark orphaned LiteMaaS keys as inactive
oc exec -n litellm-rhpds litellm-postgres-0 -- \
  psql -U litellm -d litellm -c "
UPDATE api_keys
SET is_active = false,
    revoked_at = NOW(),
    sync_status = 'error',
    sync_error = 'Key not found in LiteLLM - manual cleanup',
    updated_at = NOW()
WHERE is_active = true
  AND litellm_key_alias IS NOT NULL
  AND NOT EXISTS (
    SELECT 1 FROM \"LiteLLM_VerificationToken\" lv
    WHERE lv.key_alias = api_keys.litellm_key_alias
  );"

Check for Models in LiteLLM but Not in LiteMaaS

# List LiteLLM models missing from LiteMaaS models table
# (jq -r strips quotes; psql -t -A strips headers and padding, so the
# two lists are comparable line by line)
curl -X GET "${LITELLM_URL}/model/info" \
  -H "Authorization: Bearer ${LITELLM_KEY}" | \
  jq -r '.data[].model_name' | sort > /tmp/litellm-models.txt

oc exec -n litellm-rhpds litellm-postgres-0 -- \
  psql -U litellm -d litellm -t -A -c "SELECT id FROM models;" | \
  sort > /tmp/litemaas-models.txt

# Show models in LiteLLM but not in LiteMaaS
comm -23 /tmp/litellm-models.txt /tmp/litemaas-models.txt

Trigger Manual Sync

# Backend exposes a sync endpoint (admin auth required)
ADMIN_KEY=$(oc get secret backend-secret -n litellm-rhpds \
  -o jsonpath='{.data.ADMIN_API_KEY}' | base64 -d)
BACKEND_URL=$(oc get route litellm-prod-admin -n litellm-rhpds \
  -o jsonpath='https://{.spec.host}')

curl -X POST "${BACKEND_URL}/api/admin/sync-models" \
  -H "Authorization: Bearer ${ADMIN_KEY}"

Admin User Management

Promote a User to Admin

The user must have logged in via OAuth at least once before being promoted.

# Using the helper script (recommended)
./promote-admin.sh litellm-rhpds user@redhat.com

# Or directly via psql
oc exec -n litellm-rhpds \
  $(oc get pods -n litellm-rhpds -l app=litellm-postgres -o name | head -1) -- \
  psql -U litellm -d litellm -c \
  "UPDATE users SET roles = ARRAY['admin', 'user'] WHERE email = 'user@redhat.com';"

# Verify
oc exec -n litellm-rhpds litellm-postgres-0 -- \
  psql -U litellm -d litellm -c \
  "SELECT username, email, roles FROM users WHERE 'admin' = ANY(roles);"

Retrieve Credentials

# LiteMaaS admin API key
oc get secret backend-secret -n litellm-rhpds \
  -o jsonpath='{.data.ADMIN_API_KEY}' | base64 -d

# LiteLLM master key
oc get secret litellm-secret -n litellm-rhpds \
  -o jsonpath='{.data.LITELLM_MASTER_KEY}' | base64 -d

# JWT secret (for session debugging)
oc get secret backend-secret -n litellm-rhpds \
  -o jsonpath='{.data.JWT_SECRET}' | base64 -d
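
The JWT secret signs session tokens, but a token's payload is only base64url-encoded and can be inspected without it. A self-contained sketch that builds a sample token and decodes its payload (when debugging, substitute a real session token for JWT):

```shell
# Build a sample token: header.payload.signature, base64url without padding
b64url() { base64 | tr -d '=\n' | tr '/+' '_-'; }
HEADER=$(printf '{"alg":"HS256"}' | b64url)
PAYLOAD=$(printf '{"sub":"user@redhat.com"}' | b64url)
JWT="${HEADER}.${PAYLOAD}.not-a-real-signature"

# Decode the payload: restore padding, then undo the URL-safe alphabet
SEG=$(echo "$JWT" | cut -d. -f2)
case $(( ${#SEG} % 4 )) in 2) SEG="${SEG}==" ;; 3) SEG="${SEG}=" ;; esac
echo "$SEG" | tr '_-' '/+' | base64 -d
# -> {"sub":"user@redhat.com"}
```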

Backup Operations

Setup Automated Monthly Backup

# Requires S3 bucket with IAM role attached to bastion EC2 instance
./setup-litemaas-backup-cronjob.sh litellm-rhpds maas-db-backup

Manual Database Backup

# Backup LiteMaaS + LiteLLM shared database
oc exec -n litellm-rhpds litellm-postgres-0 -- \
  pg_dump -U litellm litellm | gzip > litellm-backup-$(date +%Y%m%d).sql.gz

# Upload to S3
aws s3 cp litellm-backup-$(date +%Y%m%d).sql.gz \
  s3://maas-db-backup/litemaas-backups/
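
A corrupt dump is cheapest to discover at backup time, so verify the archive before uploading. A sketch using a dummy file (substitute the real backup filename):

```shell
# gzip -t reads the whole archive and verifies its CRC
echo "SELECT 1;" | gzip > /tmp/demo-backup.sql.gz
gzip -t /tmp/demo-backup.sql.gz && echo "backup OK"
# -> backup OK

# A truncated archive fails the same check
head -c 10 /tmp/demo-backup.sql.gz > /tmp/truncated.sql.gz
gzip -t /tmp/truncated.sql.gz 2>/dev/null || echo "backup corrupt"
# -> backup corrupt
```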

List and Verify Backups

aws s3 ls s3://maas-db-backup/litemaas-backups/ --human-readable | sort -r

Quick Reference Commands

# Namespace shortcut
NS=litellm-rhpds

# Get all pod status
oc get pods -n $NS

# Get all routes
oc get routes -n $NS

# Tail LiteLLM logs (model routing)
oc logs -n $NS deployment/litellm -f --tail=100

# Tail LiteMaaS backend logs (subscription/key ops)
oc logs -n $NS deployment/litellm-backend -f --tail=100

# Tail frontend logs
oc logs -n $NS deployment/litellm-frontend -f --tail=100

# Health check
ROUTE=$(oc get route litellm-prod -n $NS -o jsonpath='{.spec.host}')
curl -sk https://$ROUTE/health/livenessz

# List all virtual keys (paginated)
LITELLM_KEY=$(oc get secret litellm-secret -n $NS \
  -o jsonpath='{.data.LITELLM_MASTER_KEY}' | base64 -d)
curl "https://$ROUTE/key/list?return_full_object=true&size=100&page=1" \
  -H "Authorization: Bearer $LITELLM_KEY" | jq '.keys | length'

# Count active subscriptions in LiteMaaS DB
oc exec -n $NS litellm-postgres-0 -- \
  psql -U litellm -d litellm -c \
  "SELECT COUNT(*) FROM subscriptions WHERE status = 'active';"

# Force Redis cache flush (after model changes)
REDIS_POD=$(oc get pods -n $NS -l app=litellm-redis -o name | head -1)
oc exec -n $NS $REDIS_POD -- redis-cli FLUSHALL

# Restart all LiteMaaS components (in order)
oc rollout restart deployment/litellm-backend -n $NS
oc rollout restart deployment/litellm -n $NS
oc rollout restart deployment/litellm-frontend -n $NS