/health:deployment-validator

🏥 Deployment Health Validation

Create Ansible validation roles that verify every component of your RHDP deployment is healthy: pods running, routes accessible, operators installed, and per-user resources correctly provisioned.


What You'll Need Before Starting

deployment-validator workflow diagram


Prerequisites

✓

Know What to Validate

  • List of packages to verify
  • Services that should be running
  • Configuration files to check
  • Expected OpenShift resources
  • API endpoints to test
🚀

Have Workload Deployed

# Know your deployment details:
- OpenShift namespace
- Deployed applications
- Required resources
📁

AgnosticD Repository Access

cd ~/work/code/agnosticd

What You'll Need

🏷️

Workload Name

Must match the name of the workload you deployed

📋

Validation Checks

List of checks to perform

✅

Expected State

Expected state for each check

❌

Failure Conditions

What counts as a failure for each check, and the error message to show
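
A check, its expected state, and its failure message can be written together in a single task. The following is a minimal sketch, not output from the validator: the variables `expected_replicas` and `ready_replicas` are hypothetical, and in a real role the observed value would come from a registered result rather than a literal.

```yaml
# Sketch only: variable names are hypothetical, not taken from the generated role.
- name: Assert the deployment reached its expected state
  ansible.builtin.assert:
    that:
      - ready_replicas | int == expected_replicas | int
    fail_msg: "Expected {{ expected_replicas }} ready replicas, found {{ ready_replicas }}."
    success_msg: "Deployment has {{ ready_replicas }} ready replicas as expected."
  vars:
    expected_replicas: 1   # hypothetical expected state
    ready_replicas: 1      # hypothetical observed value; normally read from a registered result
```

Stating both the expected and the observed value in `fail_msg` lets whoever reads the failure see the gap without rerunning the check.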


Quick Start

  1. Navigate to Repository

    Open your AgnosticD repository directory

  2. Run Validator

    /health:deployment-validator

  3. Answer Questions

    Provide validation requirements

  4. Review & Test

    Review generated role and test it


What It Creates

Generated role in your Ansible collection:

{collection}/roles/ocp4_workload_{workshop}_validation/
├── defaults/main.yml              # Component toggles + settings
├── tasks/
│   ├── main.yml                   # Orchestrates all checks
│   ├── check_keycloak.yml         # Shared Keycloak namespace
│   ├── check_aap_instances.yml    # Per-user loop
│   ├── check_single_aap_instance.yml
│   ├── check_showroom_instances.yml
│   ├── check_single_showroom.yml
│   └── generate_report.yml        # Results to agnosticd_user_info
└── playbooks/
    └── validate_{workshop}.yml    # Bastion test playbook
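
To illustrate how these files might fit together, here is a rough sketch of component toggles, an orchestrating `main.yml`, and a per-user loop. The file names come from the tree above; every variable name (`validate_keycloak`, `validation_num_users`, `user_index`, and so on) is an assumption made for illustration, and the role the validator generates may look different.

```yaml
# defaults/main.yml (sketch): one toggle per component; all names are hypothetical
---
validate_keycloak: true
validate_aap_instances: true
validate_showroom_instances: true
validation_num_users: 2

# tasks/main.yml (sketch): run each check file only when its toggle is on
---
- name: Check shared Keycloak namespace
  ansible.builtin.include_tasks: check_keycloak.yml
  when: validate_keycloak | bool

- name: Check AAP instances
  ansible.builtin.include_tasks: check_aap_instances.yml
  when: validate_aap_instances | bool

- name: Check Showroom instances
  ansible.builtin.include_tasks: check_showroom_instances.yml
  when: validate_showroom_instances | bool

- name: Generate report
  ansible.builtin.include_tasks: generate_report.yml

# tasks/check_aap_instances.yml (sketch): per-user loop over user numbers 1..N
---
- name: Check the AAP instance of each user
  ansible.builtin.include_tasks: check_single_aap_instance.yml
  loop: "{{ range(1, validation_num_users | int + 1) | list }}"
  loop_control:
    loop_var: user_index   # hypothetical; avoids clashing with "item" inside the included file
```

Splitting the per-user work into a `check_single_*` file keeps each included task list focused on one instance, so a failure can be attributed to a specific user.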

Common Validation Types

Package Validation

Verify RPM packages are installed:

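- name: Verify package is installed
  ansible.builtin.package:
    name: "{{ package_name }}"   # placeholder variable; the original value is missing from this page
    state: present
  check_mode: true               # report only, never install
  register: package_check
  failed_when: package_check is changed   # "changed" in check mode means the package is absent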

Service Validation

Check systemd services are running:

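- name: Verify service is running
  ansible.builtin.systemd:
    name: "{{ service_name }}"   # placeholder variable; the original value is missing from this page
    state: started
    enabled: true
  check_mode: true               # report only, never start or enable the service
  register: service_check
  failed_when: service_check is changed   # "changed" means it is not started or not enabled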

OpenShift Resource Validation

Verify pods, deployments, routes:

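- name: Verify deployment is ready
  kubernetes.core.k8s_info:
    api_version: apps/v1
    kind: Deployment
    name: "{{ deployment_name }}"             # placeholder variable
    namespace: "{{ deployment_namespace }}"   # placeholder variable
  register: deployment_info
  failed_when: >-
    deployment_info.resources | length == 0 or
    (deployment_info.resources[0].status.readyReplicas | default(0)) < 1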
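
Routes can be checked in a similar way. The sketch below looks up a route and then requests its host; `route_name` and `route_namespace` are hypothetical placeholders, and it assumes the route is exposed over HTTPS and answers with status 200 at its root path.

```yaml
# Sketch only: variable names are hypothetical placeholders.
- name: Look up the route
  kubernetes.core.k8s_info:
    api_version: route.openshift.io/v1
    kind: Route
    name: "{{ route_name }}"
    namespace: "{{ route_namespace }}"
  register: route_info
  failed_when: route_info.resources | length == 0

- name: Verify the route answers over HTTPS
  ansible.builtin.uri:
    url: "https://{{ route_info.resources[0].spec.host }}"
    method: GET
    status_code: 200
    validate_certs: false   # assumption: lab clusters often use self-signed certificates
  register: route_response
```

Both tasks only read state, so they are safe to run repeatedly against a live deployment.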

Tips & Best Practices

🎯 Start Simple

Begin with basic checks

💬 Clear Messages

Use clear error messages

🧪 Test Thoroughly

Test on a clean deployment

📝 Document Checks

Document what each check verifies

🔒 Read-Only

Validation should not modify state

⏱️ Add Retries

Resources take time to become ready, so retry checks with a delay
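
As a rough illustration of the retry advice, the task below polls pods until none are left in a phase other than Running or Succeeded. The variable `target_namespace` is a hypothetical placeholder, and the retry count and delay are arbitrary values to tune for your workload.

```yaml
# Sketch only: target_namespace is a hypothetical placeholder.
- name: Wait for all pods in the namespace to be Running or Succeeded
  kubernetes.core.k8s_info:
    api_version: v1
    kind: Pod
    namespace: "{{ target_namespace }}"
  register: pod_info
  until: >-
    pod_info.resources | length > 0 and
    pod_info.resources
    | rejectattr('status.phase', 'in', ['Running', 'Succeeded'])
    | list | length == 0
  retries: 30   # up to 30 attempts
  delay: 10     # seconds between attempts
```

The task fails only after the last attempt, which keeps slow-starting workloads from being reported as broken.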


Troubleshooting

Skill not found?
  • Restart Claude Code or VS Code
  • Verify installation: ls ~/.claude/skills/deployment-health-checker
  • Check the Troubleshooting Guide
Validation fails on a working deployment?
  • Check timing: resources take time to become ready
  • Add retries with delays
  • Verify variable values are correct
  • Use debug mode to inspect actual vs expected state
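
One possible way to inspect actual versus expected state is a `debug` task that prints only at higher verbosity. This sketch assumes a previously registered `k8s_info` result named `pod_info` (a hypothetical name) and takes "debug mode" to mean running `ansible-playbook` with `-v`.

```yaml
# Sketch only: pod_info is a hypothetical registered result from an earlier k8s_info task.
- name: Show actual versus expected pod state (printed only with -v or higher)
  ansible.builtin.debug:
    msg:
      expected_phase: Running
      actual_phases: "{{ pod_info.resources | map(attribute='status.phase') | list }}"
    verbosity: 1
```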