Cluster Provisioning

One-time setup — before any lab orders can run

For: Lab developers onboarding a new shared cluster

On this page
  1. What is cluster provisioning?
  2. Infra layer — installing operators
  3. Platform layer — creating instances
  4. How to run it today
  5. Future: AgV catalog item
  6. What you get after provisioning

What is Cluster Provisioning?

Why Ansible for cluster provisioning? All of this could be done via GitOps, and technically it works. Ansible is used here because well-tested AgnosticD roles already exist for every step (ocp4_workload_authentication, ocp4_workload_gitea_operator, ocp4_workload_openshift_gitops, etc.), and reusing them is faster and more reliable than recreating the same logic in Helm charts. Where no role exists, GitOps is used instead: for example, user workload monitoring is enabled via ArgoCD.

Cluster Provisioning (Once)

  • Install RHBK operator & create Keycloak instance
  • Install Gitea operator & create Gitea instance
  • Install OpenShift GitOps (ArgoCD)
  • Install ToolHive for MCP server management
  • Configure ingress, TLS, OAuth integration
  • Register cluster in Sandbox API pool with tags

User Provisioning (Per Order)

  • Create RHBK user for this attendee
  • Create namespace(s) for this attendee
  • Create Gitea org + mirror repos
  • Bootstrap ArgoCD Application for tenant
  • Generate LiteMaaS API key
  • Deploy Showroom with lab URLs

Think of it like a hotel: cluster provisioning is building the hotel and furnishing the lobby. User provisioning is checking a guest in and handing them a room key. The hotel only needs to be built once; guests check in and out continuously.

Infra Layer — Installing Operators

The infra layer installs cluster-wide operators via OLM, providing the CRDs and controllers the platform layer needs. The playbook is idempotent — safe to re-run.

Operators installed by the infra layer

| Operator | Namespace | Provides | Version |
| --- | --- | --- | --- |
| rhbk-operator | rhbk-operator | Keycloak CRDs — Keycloak, KeycloakRealmImport | RHBK 26.x (from RHBK channel) |
| gitea-operator | gitea-operator | Gitea CRD — Gitea | Latest from community channel |
| openshift-gitops-operator | openshift-operators | ArgoCD CRDs — ArgoCD, Application, AppProject | OpenShift GitOps 1.14+ |
| ToolHive | toolhive-system | MCP server lifecycle management | Latest stable |
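
Under the hood, each of these installs reduces to an OLM Subscription created by the corresponding AgnosticD role. As a sketch only (the channel and source values here are illustrative; the roles pin their own), the OpenShift GitOps install looks roughly like:

```yaml
# Sketch: OLM Subscription for the OpenShift GitOps operator.
# Channel/source values are illustrative; the AgnosticD role pins its own.
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: openshift-gitops-operator
  namespace: openshift-operators   # cluster-scoped install target
spec:
  channel: latest
  name: openshift-gitops-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  installPlanApproval: Automatic
```

OLM resolves the Subscription to a ClusterServiceVersion and installs the controller, which is what makes the CRDs in the table available.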

Infra layer: what the Ansible looks like

The infra layer is implemented as a set of AgnosticD workloads, each installing one operator:

# cluster-infra-playbook.yml
---
- name: Install cluster infra operators
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    ACTION: provision
  tasks:
  - name: Install RHBK operator
    ansible.builtin.include_role:
      name: ocp4_workload_rhbk_operator

  - name: Install Gitea operator
    ansible.builtin.include_role:
      name: ocp4_workload_gitea_operator

  - name: Install OpenShift GitOps operator
    ansible.builtin.include_role:
      name: ocp4_workload_openshift_gitops

  - name: Install ToolHive
    ansible.builtin.include_role:
      name: ocp4_workload_toolhive

Operator installation order matters: install operators before creating instances. The CRDs (Keycloak, Gitea, ArgoCD) are only available once the operators are running, and the platform layer assumes all CRDs exist before it runs.
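
Because the platform layer depends on those CRDs, a guard task between the two layers can make the ordering explicit. A minimal sketch using kubernetes.core.k8s_info (the CRD name shown is an assumption based on the RHBK 26 operator; verify against your installed version):

```yaml
# Sketch: block until the Keycloak CRD is registered before running the platform layer.
- name: Wait for Keycloak CRD to be registered
  kubernetes.core.k8s_info:
    api_version: apiextensions.k8s.io/v1
    kind: CustomResourceDefinition
    name: keycloaks.k8s.keycloak.org   # assumed CRD name for RHBK 26.x
  register: keycloak_crd
  until: keycloak_crd.resources | length > 0
  retries: 30
  delay: 10
```

The same pattern applies for the Gitea and ArgoCD CRDs; repeating the task per CRD keeps re-runs idempotent.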

RHBK Operator configuration

# Subscription for RHBK operator (OLM)
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: rhbk-operator
  namespace: rhbk-operator
spec:
  channel: stable-v26
  name: rhbk-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  installPlanApproval: Automatic
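
The Subscription only installs the operator; the Keycloak instance itself is created later by the platform layer. A minimal sketch of what that instance CR looks like (hostname, secret names, and database settings are placeholders, not values from this lab's repo):

```yaml
# Sketch: minimal Keycloak instance CR created by the platform layer.
# Hostname, TLS secret, and DB settings below are placeholders.
apiVersion: k8s.keycloak.org/v2alpha1
kind: Keycloak
metadata:
  name: keycloak
  namespace: keycloak
spec:
  instances: 1
  hostname:
    hostname: keycloak.apps.cluster-xyz.example.com
  http:
    tlsSecret: keycloak-tls
  db:
    vendor: postgres
    host: postgres-db
    usernameSecret:
      name: keycloak-db-secret
      key: username
    passwordSecret:
      name: keycloak-db-secret
      key: password
```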

Platform Layer — Lab-Specific Cluster-Wide Resources

The platform layer is owned by the lab owner, not ops. It contains cluster-wide resources that are specific to this catalog item — things that do not belong in infra (which is generic) and are not per-user (which belongs in tenant). It is managed by ArgoCD via the bootstrap-platform Application.

Examples across different labs: an MCP lab needs a user workload monitoring ConfigMap; a service mesh lab, a shared mesh gateway; an AI lab, a shared LiteLLM instance. The platform layer is different for every lab: it is whatever cluster-wide setup your specific lab needs before per-user provisioning can work.
What belongs in platform? Anything that must be configured once at the cluster level so that per-order provisioning works correctly. In the MCP lab this is just one thing. Other labs may need more.

What the MCP lab platform layer does

| Resource | Namespace | What it does |
| --- | --- | --- |
| cluster-monitoring-config ConfigMap | openshift-monitoring | Enables user workload monitoring, allowing tenant namespace pods to expose Prometheus metrics. Without this, tenant monitoring routes fail. |
# platform/monitoring/monitoring.yaml
# Source: https://github.com/rhpds/ocpsandbox-mcp-with-openshift-gitops/blob/main/platform/monitoring/monitoring.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true

Note on Gitea and per-tenant instances: in the MCP lab, each tenant order gets its own Gitea instance, deployed by ocp4_workload_tenant_gitea in the tenant layer, not a shared instance at the platform level. Only the Gitea operator is installed cluster-wide (in the infra layer); instances are per-tenant by design.

How to Run It Today

Run manually against the cluster using an Ansible playbook. No AgV catalog item exists yet (see future state). You need the cluster API URL and a cluster-admin token.

Prerequisites before running

1. Set your environment variables

# Set cluster credentials (NEVER commit these to git)
export CLUSTER_API_URL="https://api.cluster-xyz.example.com:6443"
export CLUSTER_ADMIN_TOKEN="eyJ..."   # cluster-admin SA token
export LITEMAAS_API_KEY="sk-..."      # LiteMaaS admin key
export GITEA_ADMIN_PASSWORD="generated-password"

2. Create your vars file

# cluster-provision-vars.yml
cluster_admin_agnosticd_sa_token: "{{ lookup('env', 'CLUSTER_ADMIN_TOKEN') }}"
sandbox_openshift_api_url: "{{ lookup('env', 'CLUSTER_API_URL') }}"
sandbox_openshift_ingress_domain: "apps.cluster-xyz.example.com"
sandbox_openshift_console_url: "https://console-openshift-console.apps.cluster-xyz.example.com"

# Keycloak realm configuration
rhbk_realm_name: mcp-lab
rhbk_namespace: keycloak

# Gitea configuration
gitea_namespace: gitea
gitea_admin_user: gitea-admin
gitea_admin_password: "{{ lookup('env', 'GITEA_ADMIN_PASSWORD') }}"

# LiteMaaS
litemaas_api_url: "https://litemaas.example.com"
litemaas_api_key: "{{ lookup('env', 'LITEMAAS_API_KEY') }}"

3. Run the infra playbook

# From agnosticd/ansible/ directory
ansible-playbook cluster-infra.yml \
  -e @cluster-provision-vars.yml \
  -e ACTION=provision

# Wait for all operator pods to be Running before proceeding.
# oc get pods takes a single -n flag, so check each namespace in turn:
for ns in rhbk-operator gitea-operator openshift-operators toolhive-system; do
  oc get pods -n "$ns"
done

4. Run the platform playbook

# From agnosticd/ansible/ directory
ansible-playbook cluster-platform.yml \
  -e @cluster-provision-vars.yml \
  -e ACTION=provision

# Verify Keycloak is up
oc get keycloak -n keycloak
# Verify Gitea is up
oc get gitea -n gitea
# Verify ArgoCD is up
oc get argocd -n openshift-gitops

5. Register the cluster in Sandbox API

Register in the Sandbox API pool with tags matching your lab's cloud_selector:

# Register cluster with tags that match your lab's cloud_selector
sandbox-ctl cluster register \
  --api-url "${CLUSTER_API_URL}" \
  --token "${CLUSTER_ADMIN_TOKEN}" \
  --tag cloud=cnv-dedicated-shared \
  --tag demo=mcp-with-openshift \
  --tag purpose=prod \
  --tag keycloak=yes \
  --capacity 50   # max concurrent tenants

Cluster is ready: once registered, the cluster appears in the Sandbox API pool. Orders with matching cloud_selector tags will land on this cluster, and user provisioning can begin for each new order.
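
Tag matching is worth making concrete. A hypothetical sketch of the subset-match logic the Sandbox API applies when placing an order (the function and its semantics are our illustration, not the API's actual code):

```python
# Hypothetical sketch of cloud_selector matching: an order lands on a cluster
# when every selector key/value pair appears in the cluster's registered tags.
def selector_matches(cloud_selector: dict, cluster_tags: dict) -> bool:
    return all(cluster_tags.get(k) == v for k, v in cloud_selector.items())

# Tags from the registration command above.
cluster_tags = {
    "cloud": "cnv-dedicated-shared",
    "demo": "mcp-with-openshift",
    "purpose": "prod",
    "keycloak": "yes",
}

# An order selecting a subset of the tags matches...
print(selector_matches({"cloud": "cnv-dedicated-shared", "purpose": "prod"}, cluster_tags))  # True
# ...but any mismatched value rejects the cluster.
print(selector_matches({"cloud": "cnv-dedicated-shared", "purpose": "dev"}, cluster_tags))   # False
```

Under this model, a cluster may carry more tags than an order selects; only the keys named in the order's cloud_selector must agree.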

Future: AgV Catalog Item

The goal is a proper AgV catalog item — orderable from the RHDP catalog with full audit trail, lifecycle management, and destroy support.

Current State

  • Manual Ansible playbook run by a developer
  • Credentials passed via environment variables
  • No RHDP catalog item — not orderable by non-developers
  • No automated destroy / deprovision
  • No lifecycle management or renewal

Target State (AgV Catalog Item)

  • Orderable from RHDP catalog by authorized platform engineers
  • Sandbox API provides a "bootstrap" cluster token
  • AgnosticD roles are the same — just invoked via AgV
  • Destroy playbook deprovisions all platform resources
  • Lifecycle managed by RHDP (renewal, expiry, quota)

Same AgV structure as user provisioning — only the workloads: list differs. Roles match exactly what cluster-provision.yml runs today.

---
# ============================================================
# FUTURE: Cluster Provisioner — AgV common.yaml (sketch)
# Ordered ONCE when a new cluster joins the shared pool.
# Same structure as user provisioning — only workloads differ.
# ============================================================

#include /includes/agd-v2-mapping.yaml
#include /includes/sandbox-api.yaml
#include /includes/catalog-icon-openshift.yaml
#include /includes/terms-of-service.yaml
#include /includes/parameters/purpose.yaml

cloud_provider: none
config: openshift-workloads   # runs against an existing cluster

clusters:
  - default:
      api_url: "{{ sandbox_openshift_api_url }}"
      api_token: "{{ cluster_admin_agnosticd_sa_token }}"

requirements_content:
  collections:
  - name: https://github.com/agnosticd/core_workloads.git
    type: git
    version: main
  - name: https://github.com/agnosticd/ai_workloads.git
    type: git
    version: main
  - name: https://github.com/agnosticd/showroom.git
    type: git
    version: v1.5.1

# Cluster-wide operators and shared services — runs ONCE per cluster
workloads:
- agnosticd.core_workloads.ocp4_workload_authentication     # RHBK operator + realm
- agnosticd.core_workloads.ocp4_workload_gitea_operator     # Gitea operator
- agnosticd.core_workloads.ocp4_workload_pipelines          # Tekton
- agnosticd.core_workloads.ocp4_workload_openshift_gitops   # ArgoCD
- agnosticd.core_workloads.ocp4_workload_gitops_bootstrap   # infra + platform AppProjects
- agnosticd.ai_workloads.ocp4_workload_toolhive             # ToolHive
- agnosticd.showroom.ocp4_workload_ocp_console_embed        # Showroom iframe (once)

# No remove_workloads — clusters are retired, not unprovisioned

__meta__:
  deployer:
    scm_url: https://github.com/agnosticd/agnosticd-v2
    scm_ref: main
    execution_environment:
      image: quay.io/agnosticd/ee-multicloud:chained-2025-12-17
  catalog:
    namespace: babylon-catalog-{{ stage | default('?') }}
    display_name: "MCP Shared Cluster Provisioner"
    category: Open_Environments
  sandbox_api:
    actions:
      destroy: {}
  sandboxes:
  - kind: OcpSandbox
    alias: cluster
    namespace_suffix: bootstrap   # scheduling artifact only
    cloud_selector:
      cloud: cnv-dedicated-shared
      demo: mcp-with-openshift
      purpose: prod
    quota:
      limits.cpu: "2"
      requests.cpu: "2"
      limits.memory: 4Gi
      requests.memory: 4Gi

What You Get After Provisioning

Services running after provisioning and their URLs:

| Service | URL pattern | Credentials |
| --- | --- | --- |
| Keycloak admin console | https://keycloak.apps.CLUSTER-DOMAIN/admin | Stored in secret rhbk-initial-admin in the keycloak namespace |
| Keycloak realm (mcp-lab) | https://keycloak.apps.CLUSTER-DOMAIN/realms/mcp-lab | N/A — this is the OIDC issuer endpoint |
| Gitea | https://gitea.apps.CLUSTER-DOMAIN | Gitea admin credentials from platform vars |
| ArgoCD UI | https://openshift-gitops-server-openshift-gitops.apps.CLUSTER-DOMAIN | Integrated with OCP OAuth (cluster-admin = ArgoCD admin) |
| ToolHive | Internal only — no public UI | Managed via thv CLI or Kubernetes CRs |
| OpenShift console | https://console-openshift-console.apps.CLUSTER-DOMAIN | Login via RHBK (configured by OAuth patch) |
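
Every URL pattern above derives from the cluster ingress domain. A small hypothetical helper (the names are ours, not part of any RHDP tooling) that reconstructs them for scripting against a provisioned cluster:

```python
# Hypothetical helper: derive post-provisioning service URLs from the
# ingress domain (sandbox_openshift_ingress_domain in the vars file).
def service_urls(ingress_domain: str, realm: str = "mcp-lab") -> dict:
    keycloak = f"https://keycloak.{ingress_domain}"
    return {
        "keycloak_admin": f"{keycloak}/admin",
        "oidc_issuer": f"{keycloak}/realms/{realm}",
        "gitea": f"https://gitea.{ingress_domain}",
        "argocd": f"https://openshift-gitops-server-openshift-gitops.{ingress_domain}",
        "console": f"https://console-openshift-console.{ingress_domain}",
    }

urls = service_urls("apps.cluster-xyz.example.com")
print(urls["oidc_issuer"])
# https://keycloak.apps.cluster-xyz.example.com/realms/mcp-lab
```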

Next step: user provisioning. With the cluster provisioned, you can now order the lab from the RHDP catalog. Each order runs the user provisioning playbook, creating a tenant user, namespaces, Gitea repos, and an ArgoCD Application on this cluster. See User Provisioning →

Red Hat Demo Platform (RHDP) — Internal developer reference
