# Three-Layer GitOps Architecture for Shared Cluster Labs

*Infra / Platform / Tenant — a general pattern for any lab running on a shared OpenShift cluster*
## Three-Layer Architecture: Infra / Platform / Tenant
Each layer has a clear owner, different run frequency, and different lifecycle:
| Layer | Responsibility | Runs | How triggered | MCP Lab example |
|---|---|---|---|---|
| **Infra** (reusable across labs) | Base cluster capabilities — the foundational layer everything else depends on. Installed once when the cluster is onboarded. | Once per cluster | `cluster-provision.yml` playbook, run by the developer when the cluster is onboarded | RHBK, Gitea operator, OpenShift GitOps (ArgoCD), Tekton, ToolHive, CloudNativePG, ArgoCD AppProjects. These are generic — any lab on this cluster can rely on them. |
| **Platform** (lab-specific) | Cluster-wide resources specific to what this lab needs on top of infra. Not per-user — shared across all tenants on the cluster. The lab developer decides what goes here. | Once per cluster — same provisioning run as infra | ArgoCD syncs the `bootstrap-platform` Application | MCP lab: user workload monitoring ConfigMap. Service mesh lab: a shared service mesh gateway. AI lab: a shared LiteLLM instance. Anything cluster-wide but specific to this lab's needs. |
| **Tenant** (lab-specific) | Per-user isolated environment — created per order, destroyed when done. Same owner as platform, just a different scope (per-user vs cluster-wide). | Every order | AgnosticD runs the AgV `workloads:` list (Sandbox API scheduler pattern) | RHBK user, 6 namespaces, per-tenant Gitea instance, LiteMaaS virtual key. ArgoCD deploys LibreChat, MCP servers, AI agent, Showroom. |
## How OCP Sandbox API and ArgoCD Pick Up Namespaces
One of the most common questions: the GitOps repo contains no namespace-creation manifests, yet everything deploys into the right namespaces. Here is exactly how that works — no magic.
### Two ways namespaces are created
#### Option A — OCP Sandbox API creates them
Each entry in `__meta__.sandboxes` causes the OCP Sandbox API to create one namespace before AgnosticD runs. Each entry gets its own `quota:` and `limit_range:`.
Use this when you want OCP Sandbox API to own the namespace lifecycle and you need per-namespace quota control.
#### Option B — Ansible role creates them
The `ocp4_workload_tenant_namespace` role creates all namespaces listed in `ocp4_workload_tenant_namespace_suffixes`. All get the same `quota` and `limit_range`. For different limits per namespace, call the role multiple times.
Use this when you need full control over naming and per-namespace RBAC before ArgoCD runs.
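As a rough sketch, the Option B variables could look like the following — variable names follow the description above, except `ocp4_workload_tenant_namespace_limit_range`, which is an assumed name, and all quota values are illustrative:

```yaml
# Option B sketch — illustrative values, not the lab's real configuration.
# The role creates one namespace per suffix; all share the same quota/limit_range.
ocp4_workload_tenant_namespace_prefix: "{{ ocp4_workload_tenant_keycloak_username }}"
ocp4_workload_tenant_namespace_suffixes:
  - librechat
  - agent
ocp4_workload_tenant_namespace_quota:
  limits.cpu: "2"
  limits.memory: 4Gi
ocp4_workload_tenant_namespace_limit_range:   # assumed variable name
  default: { cpu: 500m, memory: 512Mi }
  defaultRequest: { cpu: 50m, memory: 128Mi }
```

With a prefix of `mcpuser-drw4x`, this would yield the namespaces `librechat-mcpuser-drw4x` and `agent-mcpuser-drw4x`, both with identical limits.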
### How ArgoCD finds the right namespace — the naming contract
Both options follow the same convention: `{suffix}-{username}`. The bootstrap Helm chart constructs namespace names using this formula. As long as Ansible creates namespaces with the same formula, ArgoCD will find them.
| Layer | Code | Result (username = mcpuser-drw4x) |
|---|---|---|
| Ansible creates namespace | `suffix: librechat` + `prefix: mcpuser-drw4x` | `librechat-mcpuser-drw4x` |
| ArgoCD Helm targets namespace | `printf "librechat-%s" $username` in `applications.yaml` | `librechat-mcpuser-drw4x` |
`CreateNamespace=false` in every ArgoCD Application (see `applications.yaml` line 160) tells ArgoCD to deploy into the existing namespace. If the namespace does not exist, the sync fails — enforcing that Ansible runs first.
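To make the contract concrete, here is a hedged sketch of the shape of a rendered Application. The `project`, `repoURL`, and `path` values are placeholders, not taken from the lab's actual `applications.yaml`; only the namespace formula and `CreateNamespace=false` come from the text above:

```yaml
# Sketch of a rendered ArgoCD Application — placeholder values marked inline.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: librechat-mcpuser-drw4x
  namespace: openshift-gitops
spec:
  project: tenant-mcpuser-drw4x                    # placeholder AppProject name
  source:
    repoURL: https://example.com/gitops-repo.git   # placeholder
    path: charts/librechat                         # placeholder
    targetRevision: main
  destination:
    server: https://kubernetes.default.svc
    namespace: librechat-mcpuser-drw4x   # {suffix}-{username} — must already exist
  syncPolicy:
    automated: {}
    syncOptions:
      - CreateNamespace=false            # ArgoCD will NOT create the namespace
```

If `librechat-mcpuser-drw4x` is missing when this Application syncs, the sync errors out instead of silently creating an unquota'd namespace — which is exactly the enforcement the naming contract relies on.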
### Real example — Summit 2026 / Scheduler-Only: Ansible role creates namespaces (primary pattern)
Ansible creates all namespaces before ArgoCD runs. Role source →
```yaml
# Role: agnosticd.namespaced_workloads.ocp4_workload_tenant_namespace
# Runs BEFORE gitops_bootstrap — namespaces must exist before ArgoCD syncs.
ocp4_workload_tenant_namespace_prefix: "{{ ocp4_workload_tenant_keycloak_username }}"
ocp4_workload_tenant_namespace_namespaces:
  - suffix: agent
    quota: { limits.cpu: "2", limits.memory: 4Gi }
    limit_range:
      default: { cpu: 500m, memory: 512Mi }
      defaultRequest: { cpu: 50m, memory: 128Mi }
  - suffix: librechat
    quota: { limits.cpu: "4", limits.memory: 6Gi } # LibreChat + MongoDB + Meilisearch
    limit_range:
      default: { cpu: 500m, memory: 512Mi }
      defaultRequest: { cpu: 50m, memory: 128Mi }
  - suffix: mcp-gitea
    quota: { limits.cpu: "1", limits.memory: 2Gi } # lightweight MCP proxy
    limit_range:
      default: { cpu: 500m, memory: 512Mi }
      defaultRequest: { cpu: 50m, memory: 128Mi }
```
### Real example — Post-Summit / OCP Sandbox API: platform creates namespaces with per-entry quota
Each sandbox entry creates one namespace with its own quota — no Ansible role needed for namespace creation. Full file →
```yaml
# Each sandboxes: entry creates one namespace with its own quota.
# OCP Sandbox API creates these BEFORE AgnosticD runs.
__meta__:
  sandboxes:
    # Primary — also schedules the cluster
    - kind: OcpSandbox
      namespace_suffix: user
      cloud_selector: { cloud: cnv-dedicated-shared, demo: mcp-with-openshift, purpose: prod }
      quota:
        limits.cpu: "2"
        limits.memory: 4Gi
    # LibreChat needs more memory
    - kind: OcpSandbox
      namespace_suffix: librechat
      cluster_condition: same('primary')
      quota:
        limits.cpu: "4"
        limits.memory: 8Gi
    # MCP servers need less
    - kind: OcpSandbox
      namespace_suffix: mcp-gitea
      cluster_condition: same('primary')
      quota:
        limits.cpu: "1"
        limits.memory: 2Gi
```
### Different limits per namespace — Scheduler-Only pattern
The `ocp4_workload_tenant_namespace` role applies one quota/limit to all namespaces in the list. To give different namespaces different limits, split them into separate calls using `ocp4_workload_tenant_namespace_suffixes` with different quota values. Each role call creates a group of namespaces with shared limits.
```yaml
# First call — heavy namespaces (LibreChat needs more)
workloads:
  - agnosticd.namespaced_workloads.ocp4_workload_tenant_namespace

# In AgV, set vars for the first group:
ocp4_workload_tenant_namespace_suffixes:
  - librechat
ocp4_workload_tenant_namespace_quota:
  limits.cpu: "4"
  limits.memory: 8Gi

# NOTE: The role must support a loop or be called via include_role with vars
# for multiple groups. Alternatively, use the OCP Sandbox API pattern (Option A),
# which natively supports per-namespace quota via separate sandboxes: entries.
```
With the OCP Sandbox API pattern, each `sandboxes:` entry gets its own `quota:` block. This is what the MCP lab does, as shown in the Post-Summit example above.
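If the role does tolerate repeated invocation, the second (lighter) group could be added with a sketch like the following — the `include_role` call, task names, and quota values here are assumptions for illustration, not taken from the lab repo:

```yaml
# Sketch: invoke the role once per quota group via include_role.
# Assumes the role can run multiple times in one play with different vars.
- name: Create lightweight namespaces with a smaller quota
  ansible.builtin.include_role:
    name: agnosticd.namespaced_workloads.ocp4_workload_tenant_namespace
  vars:
    ocp4_workload_tenant_namespace_suffixes:
      - mcp-gitea
      - agent
    ocp4_workload_tenant_namespace_quota:
      limits.cpu: "1"
      limits.memory: 2Gi
```

The trade-off: per-group role calls keep everything in Ansible, while the `sandboxes:` approach moves quota ownership to the platform — pick one owner per lab rather than mixing both.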
## How This Connects to the Sandbox API
The three-layer pattern maps directly onto the two main Sandbox API integration patterns:
- **Scheduler-Only** maps directly to the Tenant layer. The Sandbox API does one thing: schedule a cluster and return a token and domain. Everything in the Tenant layer is done by your AgnosticD roles. See the Scheduler-Only guide.
- **Cluster Provisioner** sets up the Infra and Platform layers once. A `cluster-provision.yml` playbook (separate from AgV) runs once per cluster when it joins the Sandbox API pool, installing all shared services (RHBK, Gitea, ArgoCD). After that, every order only touches the Tenant layer.
Red Hat Demo Platform (RHDP) — Internal developer reference — GitHub