Model as a Service for Red Hat Demo Platform
Connect Claude Code directly to LiteMaaS — query models, users, spend, and pod status in plain English without leaving your terminal.
The LiteMaaS MCP (Model Context Protocol) server lets Claude Code query and manage your LiteMaaS deployment in plain English. Instead of running oc exec commands or writing SQL, you just ask Claude Code a question and it calls the right tool automatically.
The server runs as a pod in the litellm-rhpds namespace on the MaaS cluster, with direct in-cluster access to PostgreSQL and the LiteLLM API. It exposes an HTTP endpoint that Claude Code connects to.
No local installation needed. The MCP server is already deployed and running. You only need to add one configuration entry to your Claude Code setup.
Your LiteMaaS admin will give you two things: the MCP server URL and an API token. Both are required — the token is embedded in the URL as a query parameter.
What to ask your admin: "Can you give me the LiteMaaS MCP URL and token?" They will provide something like https://litemaas-mcp.apps.<cluster>/mcp?token=<your-token> as a single string to copy-paste.
Run this once in your terminal, replacing the URL and token with the values your admin gave you:
claude mcp add --transport http --scope user litemaas \
  "https://litemaas-mcp.apps.maas.example.com/mcp?token=abc123xyz456def789"
Then restart Claude Code. Done.
Alternatively, edit the configuration file by hand: open ~/.claude.json and add litemaas to the mcpServers object:
{
  "mcpServers": {
    "litemaas": {
      "type": "http",
      "url": "https://litemaas-mcp.apps.maas.example.com/mcp?token=abc123xyz456def789"
    }
  }
}
Replace maas.example.com and abc123xyz456def789 with the actual values from your admin. Save the file and restart Claude Code.
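If ~/.claude.json already holds other settings, editing it by hand risks breaking the JSON. The following is a minimal sketch of a scripted merge, assuming the file is plain JSON with a top-level mcpServers object as shown above. The function name add_litemaas_server is hypothetical and is not part of Claude Code.

```python
import json
from pathlib import Path


def add_litemaas_server(config_path: Path, url: str) -> dict:
    """Add or update the litemaas entry, leaving every other setting untouched."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            config = json.loads(text)

    # Create mcpServers if it is missing, then set only the litemaas key.
    servers = config.setdefault("mcpServers", {})
    servers["litemaas"] = {"type": "http", "url": url}

    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling add_litemaas_server(Path.home() / ".claude.json", url) with the URL from your admin should produce the same entry as the manual edit. Back up the file first, and restart Claude Code afterwards.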
Treat the URL as a secret. The token appears in the URL — don't share it publicly or commit it to source control.
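Because the token travels in the query string, it can leak whenever the URL is pasted into a ticket, chat message, or log. One way to reduce that risk is to mask the token before sharing. This sketch uses only the Python standard library; the helper name redact_token and the REDACTED placeholder are illustrative choices.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def redact_token(url: str, param: str = "token") -> str:
    """Return the URL with the value of the given query parameter masked."""
    parts = urlsplit(url)
    query = [
        (key, "REDACTED" if key == param else value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


print(redact_token("https://litemaas-mcp.apps.maas.example.com/mcp?token=abc123xyz456def789"))
# prints https://litemaas-mcp.apps.maas.example.com/mcp?token=REDACTED
```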
After restarting, run /mcp in Claude Code. You should see litemaas listed as connected. Then open a new conversation and ask:
What models are available in LiteMaaS?
Claude Code will call the list_models tool and return a live response from the server.
Start a new conversation after connecting. MCP tool definitions are injected at the start of each session. If you configure MCP while a conversation is already open, the tools won't appear until you open a fresh one.
Claude Code picks the right tool automatically based on your question. You never call tools directly.
| Tool | What it does | Key parameters |
|---|---|---|
| list_models | All models in LiteMaaS: name, provider, availability, pricing ($/1M tokens), TPM/RPM limits, restricted access flag | none |
| get_model_health | Live health check via LiteLLM: healthy vs unhealthy backend endpoints for a specific model or all models | model (optional; empty checks all) |
| list_users | All users sorted by spend: email, cumulative spend, max budget, budget duration, TPM/RPM limits | none |
| get_spend_summary | Token and dollar spend from LiteLLM logs, grouped by model. Supports date range, model, and user filters | date_from, date_to, model, user_id (all optional) |
| get_daily_stats | Day-by-day totals: requests, tokens, spend, and failures for the past N days | days (default: 7) |
| get_pod_status | Pod status in the litellm-rhpds namespace: phase, readiness, restart count, age in hours | none |
| update_user_budget | Update a user's spending limit and reset period via the LiteLLM API | user_id (required), max_budget (required), budget_duration |
| list_virtual_keys | Virtual API keys: alias, spend, budget, scoped models, expiry date (latest 100 keys) | none |
Type your questions directly in Claude Code; no commands or special syntax are needed.
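Under the hood, a question that needs live data becomes an MCP tool call. MCP uses JSON-RPC 2.0, and a tool is invoked with the tools/call method. The sketch below builds such a request for get_daily_stats without sending it, to show roughly what Claude Code puts on the wire. The helper name build_tool_call is hypothetical, the URL is the placeholder used throughout this page, and a real session starts with an initialize handshake, so the server may well reject this request if it is sent on its own.

```python
import json
import urllib.request


def build_tool_call(url: str, tool: str, arguments: dict, request_id: int = 1) -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC tools/call request for an MCP server."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP clients advertise both response styles.
            "Accept": "application/json, text/event-stream",
        },
    )


request = build_tool_call(
    "https://litemaas-mcp.apps.maas.example.com/mcp?token=abc123xyz456def789",
    "get_daily_stats",
    {"days": 7},
)
print(request.get_method(), json.loads(request.data)["params"])
```

In normal use Claude Code builds and sends these requests for you; the sketch is mainly useful for understanding or debugging what a tool call contains.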
The MCP server is deployed in the litellm-rhpds namespace alongside the main LiteMaaS stack. It is off by default in the Ansible collection and must be enabled explicitly:
# Enable MCP server during deployment
ansible-playbook playbooks/deploy_litemaas_ha.yml \
-e ocp4_workload_litemaas_namespace=litemaas \
-e ocp4_workload_litemaas_deploy_mcp=true
| Variable | Default | Description |
|---|---|---|
| ocp4_workload_litemaas_deploy_mcp | false | Enable MCP server deployment |
| ocp4_workload_litemaas_mcp_image | quay.io/rhpds/litemaas-mcp:latest | Container image |
| ocp4_workload_litemaas_mcp_route_prefix | litemaas-mcp | Route hostname prefix |
The MCP server uses the MCP 2025-03-26 Streamable HTTP transport. It connects to:
http://litellm:4000 for user management and health checks

# Check MCP server is running and connected
curl https://litemaas-mcp.apps.maas.example.com/health

# Expected response
{"status": "healthy", "database": true, "litellm": true}

# Test auth: should return 401
curl -X POST https://litemaas-mcp.apps.maas.example.com/mcp

# Test with token: should NOT return 401
curl -X POST "https://litemaas-mcp.apps.maas.example.com/mcp?token=abc123xyz456def789"
oc logs -n litellm-rhpds -l app=litemaas-mcp --tail=50
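For scripted monitoring, the /health response can be checked field by field rather than by eye. This is a minimal sketch that assumes the response has exactly the shape shown in the expected response above (status, database, litellm). The function name unhealthy_components is hypothetical.

```python
import json


def unhealthy_components(body: str) -> list[str]:
    """List the fields of a /health response that are not reporting healthy."""
    report = json.loads(body)
    problems = []
    if report.get("status") != "healthy":
        problems.append("status")
    # A missing or false dependency flag counts as a problem.
    for dependency in ("database", "litellm"):
        if report.get(dependency) is not True:
            problems.append(dependency)
    return problems


sample = '{"status": "healthy", "database": true, "litellm": true}'
print(unhealthy_components(sample))  # prints []
```

A non-empty result names the field to investigate first, for example by checking the pod logs with the oc logs command.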