Installation & Verification

Duration: 25 minutes
Format: Hands-on verification

What You’re Checking

These are the same checks you’d run after an installation, after an upgrade, or when something feels off - ClusterOperators, node config, networking, storage, and etcd health.

Your cluster was provisioned using IPI (Installer-Provisioned Infrastructure), which automates compute, network, storage, and DNS. OpenShift also supports UPI, Agent-based/Assisted Installer for bare metal, and managed offerings (ROSA, ARO). See the Installation Documentation for details.

Two things that cannot be changed after installation: the pod network CIDR and the service network CIDR. If they overlap with your corporate network, you’re rebuilding the cluster.
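Both ranges are fixed in install-config.yaml before the installer runs. A sketch of the relevant stanza, using the OpenShift default values:

```yaml
networking:
  networkType: OVNKubernetes
  clusterNetwork:            # pod network - cannot be changed after install
  - cidr: 10.128.0.0/14
    hostPrefix: 23           # each node gets a /23 (~510 pod IPs)
  serviceNetwork:            # service network - cannot be changed after install
  - 172.30.0.0/16
```

Check these defaults against your corporate IP plan before installing, not after.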

Cluster Health

Step 1: Verify Cluster Operators (Primary Health Check)

ClusterOperators are the single source of truth for cluster health. Each operator manages a specific cluster component and reports its status.

View all cluster operators:

oc get clusteroperators

You’ll see ~34 operators. Example output:

NAME                         VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication               4.20.x    True        False         False      3h
cloud-credential             4.20.x    True        False         False      4h
console                      4.20.x    True        False         False      3h
dns                          4.20.x    True        False         False      4h
etcd                         4.20.x    True        False         False      4h
ingress                      4.20.x    True        False         False      3h
kube-apiserver               4.20.x    True        False         False      4h
kube-controller-manager      4.20.x    True        False         False      4h
kube-scheduler               4.20.x    True        False         False      4h
machine-api                  4.20.x    True        False         False      4h
machine-config               4.20.x    True        False         False      4h
monitoring                   4.20.x    True        False         False      3h
network                      4.20.x    True        False         False      4h
openshift-apiserver          4.20.x    True        False         False      3h
storage                      4.20.x    True        False         False      4h

What to check:

  • AVAILABLE = True - Component is functional

  • PROGRESSING = False - No upgrade/rollout in progress

  • DEGRADED = False - Component is healthy

Healthy cluster: All operators show True False False

If an operator is degraded, investigate further:

oc describe clusteroperator <operator-name>

Check the Conditions section for details about why it’s degraded.

Common issues:

  • machine-config progressing - Nodes are being updated (normal during updates)

  • monitoring degraded - Storage or resource issues

  • authentication degraded - OAuth or LDAP configuration issues
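The three-column check above can be scripted. A small awk filter - a sketch, shown here against sample data so it runs anywhere - prints only the operators that are unavailable, progressing, or degraded. On a live cluster, pipe `oc get clusteroperators --no-headers` into the same awk command:

```shell
# Sample `oc get clusteroperators --no-headers` rows: one progressing, one degraded
sample='authentication   4.20.1   True   False   False   3h
console          4.20.1   True   True    False   3h
dns              4.20.1   True   False   False   4h
monitoring       4.20.1   True   False   True    3h'

# Columns: NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE
# Print any row that deviates from the healthy pattern True/False/False
echo "$sample" | awk '$3 != "True" || $4 != "False" || $5 != "False"'
```

An empty result means every operator is healthy, which makes this useful as a one-line gate in a verification script.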

Console view: Switch to the OCP Console tab and navigate to Administration → Cluster Settings → ClusterOperators. The console shows each operator with green/red status icons - degraded operators are immediately visible without scanning a 34-row table. This is faster than CLI for a quick health check.

ClusterOperators in web console

Step 2: Check Cluster Version and Update Status

View the current cluster version:

oc get clusterversion

Your output will look similar to:

NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.20.x    True        False         3h      Cluster version is 4.20.x

Check which update channel you’re subscribed to:

oc get clusterversion version -o jsonpath='{.spec.channel}'

Common channels:

  • stable-4.20 - Production-ready releases (recommended)

  • fast-4.20 - General-availability releases delivered ahead of the stable channel

  • eus-4.20 - Extended Update Support (14-month lifecycle)

  • candidate-4.20 - Pre-release testing

View available updates:

oc adm upgrade

If updates are available, you’ll see a list of recommended versions. If your cluster is already up to date, you’ll see:

No updates available. You may still upgrade to a specific release image
with --to-image or wait for new updates to be available.
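In a verification script you can key off that message. A sketch, shown against sample output - grepping human-readable CLI output is brittle, so treat this as a convenience check rather than a stable API:

```shell
# Sample `oc adm upgrade` output for an up-to-date cluster (version is illustrative)
sample='Cluster version is 4.20.3

No updates available. You may still upgrade to a specific release image
with --to-image or wait for new updates to be available.'

# Exit status 0 from grep means no pending updates
if echo "$sample" | grep -q 'No updates available'; then
  echo "cluster is up to date"
fi
```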

In the OCP Console, navigate to Administration → Cluster Settings. The Details tab shows the current version, update channel, and - when updates are available - a visual upgrade path graph showing the route from your current version to available targets. This is far more informative than the CLI text output for planning upgrades.

Cluster Settings Details tab showing version and update channel

Key takeaway: Operators manage cluster updates. When updates are available, you can upgrade with a single command: oc adm upgrade --to=<version>

Step 3: Verify Node Configuration

Beyond the node health you explored in the Platform Overview, there’s an important verification step specific to post-installation: MachineConfigPools.

Check node configuration status:

oc get machineconfigpool

Your output will look similar to:

NAME     CONFIG             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT
master   rendered-master-…  True      False      False      3              3
worker   rendered-worker-…  True      False      False      0              0

On this compact cluster the worker pool shows 0 machines because all nodes serve both control-plane and worker roles under the master pool.

What this shows:

  • UPDATED = True - All nodes have latest configuration

  • UPDATING = True - Nodes are being rolled out with new config (normal during changes)

  • DEGRADED = True - Configuration failed on some nodes (investigate)

MachineConfigPools manage node-level configuration (kernel arguments, systemd units, etc.) and coordinate rolling updates.
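As a sketch of what such configuration looks like, here is a hypothetical MachineConfig that adds a kernel argument to every node in the worker pool. On this compact cluster the worker pool is empty, so a real change would target the master role - and note that applying a MachineConfig triggers a coordinated rolling reboot of the pool:

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-worker-example-kargs                      # hypothetical name
  labels:
    machineconfiguration.openshift.io/role: worker   # selects the target pool
spec:
  kernelArguments:
  - quiet                                            # example kernel argument only
```

After applying, `oc get machineconfigpool` would show the targeted pool with UPDATING = True until every node has rebooted onto the new rendered config.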

You can also verify this in the console. Navigate to Compute → MachineConfigPools:

MachineConfigPools showing master and worker pools with Up to date status

The green Up to date indicators and Degraded: False confirm that all node configurations have been applied successfully. During an upgrade or configuration change, you would see the Update status change to Updating - this view is a quick way to monitor node rollout progress without polling the CLI.

Step 4: Test Networking (Hands-on)

Verify networking works by creating a test application and route:

oc new-project install-test 2>/dev/null || oc project install-test
oc new-app --name=test-app --image=registry.access.redhat.com/ubi9/httpd-24 -n install-test

Wait for the pod to be running, then create a route:

oc rollout status deployment/test-app -n install-test --timeout=60s
oc create route edge test-app --service=test-app -n install-test

This may take a minute to complete. Only press Ctrl+C if it has been running for more than 5 minutes.

Test the route:

curl -sk https://$(oc get route test-app -n install-test -o jsonpath='{.spec.host}') | head -3

What this proves: DNS resolution works, the router is functional, a TLS certificate was auto-generated for the edge route, and the application is reachable from outside the cluster.

You can also verify the route in the console at Networking → Routes (project: install-test) - click the route URL to test it directly from the browser.
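A newly created route can take a few seconds before the router starts serving it, so a one-shot curl may fail spuriously. A generic retry helper - a sketch, demonstrated with a stand-in command so it runs anywhere - handles that:

```shell
# retry: run a command every 2s until it succeeds or the timeout (seconds) expires
retry() {
  timeout=$1; shift
  elapsed=0
  until "$@"; do
    sleep 2
    elapsed=$((elapsed + 2))
    if [ "$elapsed" -ge "$timeout" ]; then
      return 1
    fi
  done
}

# Against the live cluster (-f makes curl treat non-2xx responses as failure):
#   retry 60 curl -skf -o /dev/null https://$(oc get route test-app -n install-test -o jsonpath='{.spec.host}')
retry 10 true && echo "succeeded"
```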

Clean up:

oc delete project install-test &>/dev/null &

Storage & etcd

Step 5: Test Storage Provisioning

Verify dynamic storage works by creating a PVC and a pod that mounts it:

oc new-project storage-test 2>/dev/null || oc project storage-test
cat <<EOF | oc apply -n storage-test -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: test-storage
spec:
  containers:
  - name: test
    image: registry.access.redhat.com/ubi9/ubi-minimal
    command: ["/bin/sh", "-c", "echo 'Storage works!' > /data/test.txt && cat /data/test.txt && sleep 30"]
    volumeMounts:
    - name: data
      mountPath: /data
    securityContext:
      allowPrivilegeEscalation: false
      runAsNonRoot: true
      capabilities:
        drop: ["ALL"]
      seccompProfile:
        type: RuntimeDefault
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: test-pvc
  restartPolicy: Never
EOF

Wait for the PVC to bind and the pod to start:

echo "Waiting for storage test to complete..."
ELAPSED=0
until oc get pod test-storage -n storage-test --no-headers 2>/dev/null | grep -qE 'Running|Completed'; do
  sleep 5; ELAPSED=$((ELAPSED+5))
  [ $ELAPSED -ge 120 ] && echo "ERROR: Timed out - check 'oc get pods -n storage-test'" && break
done
oc get pvc test-pvc -n storage-test && oc get pod test-storage -n storage-test

The PVC should show Bound and the pod should be Running or Completed:

NAME       STATUS   VOLUME        CAPACITY   ACCESS MODES   STORAGECLASS                           AGE
test-pvc   Bound    pvc-xxxxxxx   1Gi        RWO            ocs-external-storagecluster-ceph-rbd   15s

NAME           READY   STATUS    RESTARTS   AGE
test-storage   1/1     Running   0          15s

In the console, navigate to Storage → PersistentVolumeClaims (project: storage-test) to see the PVC status and its bound PersistentVolume.

Check the pod wrote to the volume successfully:

oc logs test-storage -n storage-test

You should see Storage works! - confirming that dynamic provisioning, volume attachment, and read/write all function correctly.
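The test PVC above omits storageClassName, so the cluster's default StorageClass was used. To target a specific class explicitly, a sketch - the class name below is taken from the example output and will differ on other clusters:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc-explicit
spec:
  storageClassName: ocs-external-storagecluster-ceph-rbd  # from the output above; cluster-specific
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```

You can list the available classes with `oc get storageclass`; the default is marked `(default)` in the NAME column.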

Clean up:

oc delete project storage-test &>/dev/null &

Step 6: Verify etcd Health

etcd is the distributed key-value store that holds all cluster state. If etcd is unhealthy, everything is unhealthy. Two key metrics to check after installation:

  • Leader elections - should be 0 or very low. Frequent leader elections indicate disk I/O problems or network latency between control plane nodes.

  • Disk sync duration - if fsync times are consistently above 10ms, etcd performance will suffer and oc commands will start timing out.

Check both via the OCP Console at Observe → Metrics. Enter this query and click Run Queries:

increase(etcd_server_leader_changes_seen_total[24h])

You should see 0 elections across all etcd members - that’s healthy.

Now check disk sync times:

histogram_quantile(0.99, rate(etcd_disk_wal_fsync_duration_seconds_bucket[5m]))

This shows the 99th percentile WAL fsync latency per etcd member. Values under 10ms are ideal. Workshop clusters may show higher values due to shared storage - that’s expected.
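As a quick intuition for the quantile: the p99 is the value below which 99% of samples fall, so a single slow outlier dominates it. A toy nearest-rank calculation over raw samples (Prometheus instead estimates the quantile by interpolating within histogram buckets):

```shell
# Ten fsync samples in milliseconds; one slow outlier
samples='2 3 3 4 4 5 5 6 7 50'

# Nearest-rank p99: sort numerically, then take element ceil(0.99 * N)
echo "$samples" | tr ' ' '\n' | sort -n | awk '
  { a[NR] = $1 }
  END {
    idx = int(0.99 * NR)
    if (idx < 0.99 * NR) idx++   # ceiling
    print a[idx]                 # the outlier: 50
  }'
```

Nine of ten samples are under 10ms, yet the p99 is 50ms - which is exactly why the fsync query above surfaces intermittent disk stalls that an average would hide.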

Also verify the monitoring operator is healthy:

oc get clusteroperator monitoring

NAME         VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
monitoring   4.20.x    True        False         False      3h

Summary

Installation: You learned how OpenShift can be installed using full stack automation (IPI, Agent-based, Assisted Installer) or pre-existing infrastructure (UPI) across public cloud, private cloud, bare metal, and specialized platforms.

Verification: You verified this cluster is production-ready by checking ClusterOperators, update channels, MachineConfigPools, networking, storage provisioning, and etcd health.

Key takeaway: oc get clusteroperators is your primary health check. All operators showing True False False means a healthy cluster ready for workloads.

Additional Resources