RHDP LiteMaaS

Model as a Service for Red Hat Demo Platform

Guardrails

Block sensitive information, PII, credentials, and harmful content from flowing through LiteMaaS, with negligible added latency.

Overview

LiteMaaS guardrails use LiteLLM's built-in litellm_content_filter to intercept requests and responses at the proxy layer. No external service or additional infrastructure is required — all pattern matching runs in-process using precompiled regex.

Two guardrail layers are active by default:

| Guardrail | Mode | What it does |
| --- | --- | --- |
| pii-and-credentials | pre_call | Blocks requests containing SSNs, credit cards, AWS/GitHub/generic API keys, and harmful content (explosives, terrorism, violence, hate speech, self-harm) before they reach the model. |
| output-pii-mask | post_call | Scans model responses and masks any PII or credentials that appear in the output (replaces with [REDACTED]). |

Production status: Both guardrails are active on maas.redhatworkshops.io (namespace: litellm-rhpds) as of 2026-04-01.

How It Works

Guardrail configuration lives in a Kubernetes ConfigMap (litellm-guardrails-config) mounted into the LiteLLM deployment at /etc/litellm/config.yaml. LiteLLM is started with --config /etc/litellm/config.yaml, which loads the guardrails on startup.

Because models are stored in the database (STORE_MODEL_IN_DB=true), the config file only needs to define guardrails — no model_list is required. The two systems coexist cleanly.
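A minimal sketch of what the mounted config.yaml might contain, assuming LiteLLM's guardrails config schema (guardrail_name, litellm_params, mode); the exact keys and values live in the playbook's litemaas_guardrails_config variable and should be checked against the LiteLLM version in use:

```yaml
# Illustrative only -- the authoritative contents are defined in
# litemaas_guardrails_config inside playbooks/configure_guardrails.yml.
guardrails:
  - guardrail_name: "pii-and-credentials"
    litellm_params:
      guardrail: litellm_content_filter   # in-process regex filter
      mode: "pre_call"                    # runs before the request reaches the model
  - guardrail_name: "output-pii-mask"
    litellm_params:
      guardrail: litellm_content_filter
      mode: "post_call"                   # scans the model response
```

Note there is no model_list section: models come from the database because STORE_MODEL_IN_DB=true.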

Request flow

User request
    │
    ▼
[pii-and-credentials guardrail]  ← pre_call: regex scan
    │ match → 403 Blocked
    │ no match ↓
    ▼
LiteLLM routes to model
    │
    ▼
Model response
    │
    ▼
[output-pii-mask guardrail]  ← post_call: regex scan
    │ match → MASK redaction
    │ no match ↓
    ▼
Response returned to user

Negligible latency: All pattern matching is pure regex running in the LiteLLM process. There are no external API calls, no additional network hops, and no inference requests, so the impact on request latency is negligible.
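As a local illustration of what the pre_call check amounts to (a sketch using grep; the real patterns are precompiled inside LiteLLM and differ in detail):

```shell
# Approximate the us_ssn pre_call check with a plain regex (illustrative only).
ssn_pattern='[0-9]{3}-[0-9]{2}-[0-9]{4}'

check() {
  if echo "$1" | grep -qE "$ssn_pattern"; then
    echo "403 Blocked"          # match: request never reaches the model
  else
    echo "forwarded to model"   # no match: normal routing
  fi
}

check "My SSN is 123-45-6789"            # prints: 403 Blocked
check "What is the capital of France?"   # prints: forwarded to model
```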

Deploying with the Playbook

The configure_guardrails.yml playbook handles everything: creating the ConfigMap, patching the deployment, waiting for rollout, and verifying the guardrails are active.

Prerequisites

An active oc login to the target OpenShift cluster, with permission to read secrets and patch deployments in the target namespace.

Apply to production

ansible-playbook playbooks/configure_guardrails.yml \
  -e litemaas_namespace=litellm-rhpds

Apply to dev or staging

ansible-playbook playbooks/configure_guardrails.yml \
  -e litemaas_namespace=litellm-dev

ansible-playbook playbooks/configure_guardrails.yml \
  -e litemaas_namespace=litellm-staging

Check current guardrails without applying

ansible-playbook playbooks/configure_guardrails.yml \
  -e litemaas_namespace=litellm-rhpds \
  -e litemaas_guardrails_check_only=true

What the playbook does

  1. Resolves the LiteLLM route and master key from OpenShift secrets
  2. Creates or updates the litellm-guardrails-config ConfigMap
  3. Patches the litellm deployment to mount the ConfigMap and set --config startup args
  4. Waits for rolling restart to complete (180s timeout)
  5. Verifies the /guardrails/list API endpoint returns both guardrails
  6. Runs a live SSN blocking test and reports pass/fail

Idempotent: Safe to run multiple times. Kubernetes strategic merge patch uses name as the merge key for volumes and containers — re-running the playbook will not duplicate volumes or mounts.
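The idempotency comes from Kubernetes strategic merge semantics: volumes and containers are merged by name, so reapplying the same patch updates the existing entries instead of appending duplicates. A sketch of the shape of such a patch (field values here are illustrative, not copied from the playbook):

```yaml
# Strategic merge patch (illustrative). Because "name" is the merge key for
# these lists, re-applying the patch is a no-op rather than an append.
spec:
  template:
    spec:
      volumes:
        - name: guardrails-config            # merged by name, not appended
          configMap:
            name: litellm-guardrails-config
      containers:
        - name: litellm                      # matched to the existing container by name
          volumeMounts:
            - name: guardrails-config
              mountPath: /etc/litellm
          args: ["--config", "/etc/litellm/config.yaml"]
```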

Prebuilt Patterns Reference

These regex patterns ship with LiteLLM 1.81.0 and are available without any additional setup.

PII Patterns (action: BLOCK)

| Pattern Name | Description |
| --- | --- |
| us_ssn | US Social Security Numbers (XXX-XX-XXXX) |
| us_ssn_no_dash | US SSN without dashes (XXXXXXXXX) |
| email | Email addresses |
| us_phone | US phone numbers in various formats |
| passport_us | US passport numbers |

Payment Card Patterns (action: BLOCK)

| Pattern Name | Description |
| --- | --- |
| credit_card | Any major credit card number |
| visa | Visa card numbers |
| mastercard | Mastercard numbers |
| amex | American Express numbers |
| discover | Discover card numbers |

Credential Patterns (action: BLOCK)

| Pattern Name | Description |
| --- | --- |
| aws_access_key | AWS access keys (AKIA...) |
| aws_secret_key | AWS secret keys (40-char) |
| github_token | GitHub personal access tokens |
| slack_token | Slack API tokens |
| generic_api_key | Generic API key patterns (sk-xxx, key-xxx, token-xxx) |

Dangerous Content Patterns (action: BLOCK)

| Pattern Name | Description |
| --- | --- |
| explosives | Explosives and bomb-making terminology |
| terrorism | Terrorism and extremism terminology |
| violence_threats | Violent threats and terminology |
| harassment_hate | Slurs, hate speech, and harassment |
| self_harm_suicide | Self-harm and suicide terminology |
| weapons_firearms | Firearms and ammunition terminology |
| illegal_activities | Illegal activity terminology |

Note on generic_api_key: This pattern is broad — it matches strings like sk-xxx, key-xxx, and token-xxx. Monitor for false positives in prompts that discuss API concepts in general terms. Tune or remove if needed.
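As a rough local demonstration of why a pattern this broad misfires (the regex below is an invented approximation, not the exact pattern shipped with LiteLLM):

```shell
# Approximate generic-key regex (illustrative, not LiteLLM's actual pattern).
generic_key='(sk|key|token)-[A-Za-z0-9_-]{8,}'

# A real-looking secret matches, as intended:
echo "here is my key: sk-abc123def456ghi7" | grep -qE "$generic_key" && echo "blocked"

# But so does an innocuous placeholder in a prompt about API design:
echo "pass your token-abc12345 placeholder" | grep -qE "$generic_key" && echo "blocked (false positive)"
```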

Harmful Content Categories

Categories use dictionary-based keyword matching (not LLM inference) loaded from YAML files shipped with LiteLLM. Three categories are enabled by default:

| Category | Severity Threshold | Action | Description |
| --- | --- | --- | --- |
| harmful_violence | medium | BLOCK | Violence, criminal planning, physical harm |
| harmful_self_harm | medium | BLOCK | Self-harm, suicide, eating disorders |
| harmful_illegal_weapons | medium | BLOCK | Illegal weapons, explosives (keyword level) |
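Because category matching is a keyword lookup rather than model inference, it can be sketched with a simple loop (the keyword list below is invented for illustration; the real dictionaries ship as YAML files with LiteLLM):

```shell
# Illustrative keyword-dictionary check for a single category.
msg="instructions for building a detonator"

for kw in bomb explosive detonator; do   # stand-in keywords, not the shipped list
  if echo "$msg" | grep -qiw "$kw"; then
    echo "BLOCK harmful_illegal_weapons (matched: $kw)"
    break
  fi
done
```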

Additional available categories (not enabled by default):

| Category | Description |
| --- | --- |
| bias_gender | Gender-based discriminatory language |
| bias_racial | Racial and ethnic discrimination |
| bias_religious | Religious discrimination and stereotypes |
| bias_sexual_orientation | Discriminatory language targeting LGBTQ+ individuals |
| denied_medical_advice | Medical advice, diagnosis, or treatment requests |
| denied_legal_advice | Legal advice or representation requests |
| denied_financial_advice | Personalized financial investment advice |

Testing Guardrails

Run these against the production LiteLLM endpoint to verify guardrails are active. Use the team test key.

Quick smoke test

LITELLM_URL="https://litellm-prod.apps.maas.redhatworkshops.io"
TEST_KEY="sk-wqUDDPcLy5hKHnlP-3AOCg"

# SSN — expect 403
curl -s -o /dev/null -w "%{http_code}" -X POST "$LITELLM_URL/v1/chat/completions" \
  -H "Authorization: Bearer $TEST_KEY" -H "Content-Type: application/json" \
  -d '{"model":"granite-3-2-8b-instruct","messages":[{"role":"user","content":"My SSN is 123-45-6789"}]}'

# AWS key — expect 403
curl -s -o /dev/null -w "%{http_code}" -X POST "$LITELLM_URL/v1/chat/completions" \
  -H "Authorization: Bearer $TEST_KEY" -H "Content-Type: application/json" \
  -d '{"model":"granite-3-2-8b-instruct","messages":[{"role":"user","content":"key AKIAIOSFODNN7EXAMPLE"}]}'

# Violence — expect 403
curl -s -o /dev/null -w "%{http_code}" -X POST "$LITELLM_URL/v1/chat/completions" \
  -H "Authorization: Bearer $TEST_KEY" -H "Content-Type: application/json" \
  -d '{"model":"granite-3-2-8b-instruct","messages":[{"role":"user","content":"I will kill you"}]}'

# Normal message — expect 200
curl -s -o /dev/null -w "%{http_code}" -X POST "$LITELLM_URL/v1/chat/completions" \
  -H "Authorization: Bearer $TEST_KEY" -H "Content-Type: application/json" \
  -d '{"model":"granite-3-2-8b-instruct","messages":[{"role":"user","content":"What is the capital of France?"}],"max_tokens":10}'

View active guardrails (admin)

MASTER_KEY=$(oc get secret litellm-secret -n litellm-rhpds \
  -o jsonpath='{.data.LITELLM_MASTER_KEY}' | base64 -d)

curl -s "$LITELLM_URL/guardrails/list" \
  -H "Authorization: Bearer $MASTER_KEY" | python3 -m json.tool

Expected blocked response

{
  "error": {
    "message": "{'error': 'Content blocked: us_ssn pattern detected', 'pattern': 'us_ssn'}",
    "type": "None",
    "param": "None",
    "code": "403"
  }
}

Customizing Patterns

To change which patterns are active, edit the litemaas_guardrails_config variable in playbooks/configure_guardrails.yml and re-run the playbook. Changes take effect after the rolling restart completes.

Add a custom regex pattern

# In configure_guardrails.yml, under patterns:
- pattern_type: "regex"
  name: "redhat-internal-id"
  pattern: "RH-[0-9]{6}"
  action: "BLOCK"
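Before rolling a new pattern out, it can be sanity-checked locally against sample strings:

```shell
# Verify the custom regex matches what you expect before deploying it.
echo "please look up ticket RH-123456" | grep -oE 'RH-[0-9]{6}'              # prints: RH-123456
echo "ticket RH-12 is unrelated"       | grep -oE 'RH-[0-9]{6}' || echo "no match"
```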

Change action from BLOCK to MASK

Use MASK to redact the matched content instead of blocking the entire request. Useful for output filtering where you want the response but with sensitive data removed.

- pattern_type: "prebuilt"
  pattern_name: "email"
  action: "MASK"   # replaces with [email_REDACTED]
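What MASK does to a response can be sketched locally with sed (illustrative; LiteLLM performs the substitution in-process with its own email regex, which may differ from this approximation):

```shell
# Redact email addresses the way a MASK action would (approximate pattern).
resp="Contact alice@example.com for access."
echo "$resp" | sed -E 's/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/[email_REDACTED]/g'
# prints: Contact [email_REDACTED] for access.
```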

Enable additional categories

# Add under categories in the pii-and-credentials guardrail:
- category: "denied_medical_advice"
  enabled: true
  action: "BLOCK"
  severity_threshold: "high"   # high | medium | low

Update ConfigMap without full playbook run

# Edit the ConfigMap directly, then restart to pick up changes
oc edit configmap litellm-guardrails-config -n litellm-rhpds
oc rollout restart deployment/litellm -n litellm-rhpds

Removing Guardrails

To disable guardrails entirely:

ansible-playbook playbooks/configure_guardrails.yml \
  -e litemaas_namespace=litellm-rhpds \
  -e litemaas_guardrails_state=absent

This removes the ConfigMap and restores the LiteLLM deployment to its original startup command (no --config flag).

Warning: Removing guardrails in production disables all content filtering immediately. All three LiteLLM replicas will restart via rolling update.