RHDP LiteMaaS

Model as a Service for Red Hat Demo Platform

Claude Code MCP Integration

Connect Claude Code directly to LiteMaaS — query models, users, spend, and pod status in plain English without leaving your terminal.

What Is the MCP Server?

The LiteMaaS MCP (Model Context Protocol) server lets Claude Code query and manage your LiteMaaS deployment in plain English. Instead of running oc exec commands or writing SQL, you just ask Claude Code a question and it calls the right tool automatically.

The server runs as a pod in the litellm-rhpds namespace on the MaaS cluster, with direct in-cluster access to PostgreSQL and the LiteLLM API. It exposes an HTTP endpoint that Claude Code connects to.

No local installation needed. The MCP server is already deployed and running. You only need to add one configuration entry to your Claude Code setup.

Setup (2 Steps)

Your LiteMaaS admin will give you the MCP server URL. The API token is required and is embedded in that URL as a query parameter, so the URL and token arrive as a single string.

What to ask your admin: "Can you give me the LiteMaaS MCP URL and token?" They will provide something like https://litemaas-mcp.apps.<cluster>/mcp?token=<your-token> as a single string to copy-paste.

Option A — One Command (Recommended)

Run this once in your terminal, replacing the URL and token with the values your admin gave you:

claude mcp add --transport http --scope user litemaas \
  "https://litemaas-mcp.apps.maas.example.com/mcp?token=abc123xyz456def789"

Then restart Claude Code. Done.

Option B — Edit ~/.claude.json

Open ~/.claude.json and add litemaas to the mcpServers object:

{
  "mcpServers": {
    "litemaas": {
      "type": "http",
      "url": "https://litemaas-mcp.apps.maas.example.com/mcp?token=abc123xyz456def789"
    }
  }
}

Replace maas.example.com and abc123xyz456def789 with the actual values from your admin. Save the file and restart Claude Code.
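
If you would rather script this edit than change the file by hand, the following is a minimal sketch that merges the entry with jq. It assumes jq is installed; the function name add_litemaas_entry and the .bak backup file are illustrative choices, not part of LiteMaaS or Claude Code.

```shell
#!/usr/bin/env bash
# Sketch: add (or overwrite) the "litemaas" entry in a Claude Code config file.
# add_litemaas_entry is a hypothetical helper name; jq is assumed to be installed.

add_litemaas_entry() {
  local config="$1" url="$2" tmp
  tmp="$(mktemp)" || return 1
  # Assign only .mcpServers.litemaas so other configured servers are preserved.
  if jq --arg url "$url" '.mcpServers.litemaas = {type: "http", url: $url}' "$config" > "$tmp"; then
    # Keep a backup of the previous file, then swap in the merged version.
    cp "$config" "$config.bak" && mv "$tmp" "$config"
  else
    # jq failed (for example, the file was not valid JSON): leave it untouched.
    rm -f "$tmp"
    return 1
  fi
}

# Example (not run automatically):
#   add_litemaas_entry ~/.claude.json "https://litemaas-mcp.apps.maas.example.com/mcp?token=abc123xyz456def789"
```

Because jq writes to a temporary file first, a malformed config is never overwritten; the original is only replaced once the merge succeeds.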

Treat the URL as a secret. The token appears in the URL — don't share it publicly or commit it to source control.
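
One way to avoid typing the token inline (where it would land in shell history) is to read it into a variable and assemble the URL at the last moment. This is a sketch under assumptions: build_mcp_url and the LITEMAAS_MCP_TOKEN variable are hypothetical names, and the token is still stored in your Claude Code configuration once the server is added.

```shell
#!/usr/bin/env bash
# Sketch: assemble the MCP URL from a host and a token held in a variable.
# build_mcp_url and LITEMAAS_MCP_TOKEN are hypothetical names.

build_mcp_url() {
  local host="$1" token="$2"
  if [ -z "$host" ] || [ -z "$token" ]; then
    echo "usage: build_mcp_url <host> <token>" >&2
    return 1
  fi
  printf 'https://%s/mcp?token=%s\n' "$host" "$token"
}

# Example (not run automatically):
#   read -rs LITEMAAS_MCP_TOKEN   # prompt for the token without echoing it
#   claude mcp add --transport http --scope user litemaas \
#     "$(build_mcp_url litemaas-mcp.apps.maas.example.com "$LITEMAAS_MCP_TOKEN")"
```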

Verify the Connection

After restarting, run /mcp in Claude Code. You should see litemaas listed as connected. Then open a new conversation and ask:

What models are available in LiteMaaS?

Claude Code will call the list_models tool and return a live response from the server.

Start a new conversation after connecting. MCP tool definitions are injected at the start of each session. If you configure MCP while a conversation is already open, the tools won't appear until you open a fresh one.

Available Tools

Claude Code picks the right tool automatically based on your question. You never call tools directly.

| Tool | What it does | Key parameters |
| --- | --- | --- |
| list_models | All models in LiteMaaS: name, provider, availability, pricing ($/1M tokens), TPM/RPM limits, restricted access flag | none |
| get_model_health | Live health check via LiteLLM: healthy vs unhealthy backend endpoints for a specific model or all models | model (optional; empty checks all) |
| list_users | All users sorted by spend: email, cumulative spend, max budget, budget duration, TPM/RPM limits | none |
| get_spend_summary | Token and dollar spend from LiteLLM logs, grouped by model. Supports date range, model, and user filters | date_from, date_to, model, user_id (all optional) |
| get_daily_stats | Day-by-day totals: requests, tokens, spend, and failures for the past N days | days (default: 7) |
| get_pod_status | Pod status in the litellm-rhpds namespace: phase, readiness, restart count, age in hours | none |
| update_user_budget | Update a user's spending limit and reset period via the LiteLLM API | user_id (required), max_budget (required), budget_duration |
| list_virtual_keys | Virtual API keys: alias, spend, budget, scoped models, expiry date (latest 100 keys) | none |
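
Under the hood, MCP clients invoke a tool by sending a JSON-RPC tools/call request. Claude Code does this for you, so the sketch below only illustrates what such a request body could look like for get_daily_stats. The function name is hypothetical, and it assumes days is passed as an integer argument, which the table does not confirm.

```shell
#!/usr/bin/env bash
# Sketch: build a JSON-RPC 2.0 "tools/call" body for the get_daily_stats tool.
# tools_call_daily_stats is a hypothetical helper; "days" as an integer is an assumption.

tools_call_daily_stats() {
  local days="${1:-7}"   # fall back to the documented default of 7 days
  printf '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"get_daily_stats","arguments":{"days":%d}}}\n' "$days"
}

# Example:
#   tools_call_daily_stats 14
```

A question such as "Show me daily stats for the past 14 days" would plausibly map to a request of this shape, with Claude Code choosing the tool name and arguments.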

Example Queries

Type these directly in Claude Code — no commands or syntax needed.

Models & Health

What models are available in LiteMaaS?
Is llama-scout-17b healthy right now?
Which models have restricted access?
Check the health of all models

Usage & Spend

Who has spent the most this week?
Show me today's spend broken down by model
How many tokens did we serve yesterday?
Show me daily stats for the past 14 days
What was the spend on March 25th?
How many requests failed today?

Users & Keys

List all users and their budgets
Set alopezme's budget to $500 daily
Show all virtual keys and their spend
Which keys expire this month?

Infrastructure

Are all pods running?
Which pods have restarted recently?

For Admins

Deployment

The MCP server is deployed in the litellm-rhpds namespace alongside the main LiteMaaS stack. It is off by default in the Ansible collection and must be enabled explicitly:

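# Enable MCP server during deployment.
# Set ocp4_workload_litemaas_namespace to the namespace of your LiteMaaS stack
# (litellm-rhpds on the MaaS cluster described on this page).
ansible-playbook playbooks/deploy_litemaas_ha.yml \
  -e ocp4_workload_litemaas_namespace=litemaas \
  -e ocp4_workload_litemaas_deploy_mcp=true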

Key defaults

| Variable | Default | Description |
| --- | --- | --- |
| ocp4_workload_litemaas_deploy_mcp | false | Enable MCP server deployment |
| ocp4_workload_litemaas_mcp_image | quay.io/rhpds/litemaas-mcp:latest | Container image |
| ocp4_workload_litemaas_mcp_route_prefix | litemaas-mcp | Route hostname prefix |

Architecture

The MCP server uses the MCP 2025-03-26 Streamable HTTP transport. It connects in-cluster to PostgreSQL and the LiteLLM API.
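
With Streamable HTTP, a client opens a session by POSTing a JSON-RPC initialize message to the MCP endpoint and accepting either a JSON response or an event stream. The sketch below shows that handshake with curl, following the MCP 2025-03-26 specification. The function names and the clientInfo values are placeholders, and how this particular server responds is not documented here.

```shell
#!/usr/bin/env bash
# Sketch: send an MCP "initialize" request over Streamable HTTP.
# Function names and clientInfo values are placeholders, not part of LiteMaaS.

# JSON-RPC "initialize" body using the 2025-03-26 protocol version.
mcp_initialize_body() {
  printf '%s\n' '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-03-26","capabilities":{},"clientInfo":{"name":"curl-probe","version":"0.1.0"}}}'
}

# POST the body to the MCP endpoint. The Accept header lists both response
# types a Streamable HTTP server may choose between.
mcp_initialize() {
  local url="$1"
  curl -sS -X POST "$url" \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json, text/event-stream' \
    --data "$(mcp_initialize_body)"
}

# Example (not run automatically):
#   mcp_initialize "https://litemaas-mcp.apps.maas.example.com/mcp?token=<your-token>"
```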

Health check

# Check MCP server is running and connected
curl https://litemaas-mcp.apps.maas.example.com/health

# Expected response
{"status": "healthy", "database": true, "litellm": true}

# Test auth — should return 401
curl -X POST https://litemaas-mcp.apps.maas.example.com/mcp

# Test with token — should NOT return 401
curl -X POST "https://litemaas-mcp.apps.maas.example.com/mcp?token=abc123xyz456def789"
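
The checks above can be wrapped into small helpers for a smoke test. This is a minimal sketch, assuming jq is installed and that /health returns the JSON shape shown above; the function names and the BASE and TOKEN variables are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: scriptable versions of the manual health and auth checks.
# All function names are hypothetical; jq is assumed for the health check.

# Return 0 only when the /health JSON on stdin reports every dependency as up.
health_ok() {
  jq -e '.status == "healthy" and .database == true and .litellm == true' > /dev/null
}

# Print only the HTTP status code of a bodyless POST to the given URL.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' -X POST "$1"
}

# Auth is enforced when the request without a token gets 401
# and the request with a token gets anything else.
auth_enforced() {
  local without_token="$1" with_token="$2"
  [ "$without_token" = "401" ] && [ "$with_token" != "401" ]
}

# Example (not run automatically; BASE and TOKEN are placeholders):
#   curl -fsS "$BASE/health" | health_ok || echo "health check failed"
#   auth_enforced "$(http_status "$BASE/mcp")" "$(http_status "$BASE/mcp?token=$TOKEN")" \
#     || echo "auth check failed"
```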

Logs

oc logs -n litellm-rhpds -l app=litemaas-mcp --tail=50