Module 3 Lab 3: Events - Troubleshooting Event-Driven Ansible
In the agentless url_check lab, the most common issues stem from network visibility or rulebook syntax. Use the following steps to debug the environment with Ansible ad-hoc commands and platform logs.
1. Verify Network Reachability
If the EDA Rulebook never reports a "down" or "up" status, the EDA Controller container may be unable to reach the Windows host. Use an ad-hoc command to confirm that the automation environment can connect to the host and that a listener is bound to port 80.
ansible --module-name ansible.windows.win_shell windows \
--args "Get-NetTCPConnection -LocalPort 80" \
--extra-vars @/projects/env/secrets.yml
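The ad-hoc command above shows that Ansible can reach the host and that port 80 is bound, but it does not exercise the HTTP request that url_check makes. The following is a minimal sketch of that request, assuming curl is available on a machine that shares the EDA Controller's network view. The function names are hypothetical, the URL comes from the sample event in this lab, and treating only HTTP 200 as "up" is an assumption about how the status is derived.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of the lab files.

# Map an HTTP status code to an up/down label.
# Assumption: only 200 counts as "up"; every other code counts as "down".
classify_status() {
    if [ "$1" = "200" ]; then
        echo "up"
    else
        echo "down"
    fi
}

# Request the URL once and print the status code with its label.
# curl reports the code as 000 when it gets no HTTP response at all.
check_url() {
    code=$(curl --silent --output /dev/null --max-time 5 \
        --write-out '%{http_code}' "$1")
    echo "$1 status_code=$code status=$(classify_status "$code")"
}

# Usage (adjust the URL for your environment):
#   check_url "http://windows.aap.svc.cluster.local"
```

If this reports 000 while the ad-hoc command succeeds, the problem is more likely name resolution or routing between the two networks than the IIS service itself.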
2. Inspect EDA Activation Logs
The Rulebook Activation logs provide a real-time stream of the polling attempts.
- In EDA Controller, navigate to Rulebook Activations.
- Select your running activation and click the History tab.
- Look for the url_check output. A successful poll will look like this:
{ "url_check": { "url": "http://windows.aap.svc.cluster.local", "status": "up", "status_code": 200 } }
- If the output does not show enough detail, edit the Rulebook Activation, change the Log level to Debug, and restart the Rulebook Activation.
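When the log is long, it can help to pull the status value out of each event rather than reading the raw lines. This is a minimal sketch, assuming you have copied log lines into a file or pipe and that each event sits on one line in the same shape as the sample above. The function name is hypothetical, and the match is plain text, not a real JSON parser.

```shell
#!/bin/sh
# Hypothetical helper: print the "status" value of each url_check event on stdin.
# Assumption: one event per line, formatted like the sample in this lab.
# The pattern requires a colon right after "status", so "status_code" is not matched.
event_status() {
    sed -n 's/.*"status": *"\([^"]*\)".*/\1/p'
}

# Usage (activation.log is a placeholder for wherever you saved the log lines):
#   event_status < activation.log | sort | uniq -c
```

Counting the values this way shows at a glance whether the source ever emitted a "down" event or whether no events arrived at all.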
3. Validate Service State via Ad-Hoc
If the remediation job runs but the site remains "down", use an ad-hoc command to check if the IIS service is actually in a Running state or if it is stuck in StartPending.
ansible --module-name ansible.windows.win_service windows \
--args "name=W3SVC" \
--extra-vars @/projects/env/secrets.yml
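To turn the service check into a pass or fail result, the module output can be piped through a small filter. This is a sketch under assumptions: the function names are hypothetical, and it assumes the ad-hoc output contains a line such as "state": "running". The exact spelling of pending states may differ in your version, so the helper lowercases the value and drops underscores before comparing.

```shell
#!/bin/sh
# Hypothetical helpers: read ansible ad-hoc output on stdin.

# Print the first "state" value, normalized to lowercase without underscores.
# Assumption: the output contains a line like  "state": "running"
service_state() {
    sed -n 's/.*"state": *"\([^"]*\)".*/\1/p' \
        | head -n 1 \
        | tr '[:upper:]' '[:lower:]' \
        | tr -d '_'
}

# Succeed only when the reported state is "running".
is_running() {
    [ "$(service_state)" = "running" ]
}

# Usage:
#   ansible --module-name ansible.windows.win_service windows \
#     --args "name=W3SVC" \
#     --extra-vars @/projects/env/secrets.yml | is_running || echo "W3SVC is not running"
```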
4. Common Resolution Steps
- EDA Activation is stuck: Restart the Rulebook Activation to clear the internal event queue.
- Remediation Job Fails: Check the Jobs tab in Automation Controller. Ensure the Windows Admin Credential has not expired.
- URL Check doesn’t trigger "down": Ensure the delay in the rulebook isn’t set too high (e.g., 60 seconds), which might cause a lag in detection.
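For the last item, a quick way to confirm the configured polling interval is to read it straight from the rulebook file. This is a minimal sketch: the function name is hypothetical, and the rulebook fragment is illustrative only (its rule name, hosts value, and delay are assumptions, not the lab's actual rulebook).

```shell
#!/bin/sh
# Hypothetical helper: print every numeric "delay:" value found in a rulebook on stdin.
rulebook_delay() {
    sed -n 's/^[[:space:]]*delay:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Usage with an illustrative fragment; replace the here-document with your rulebook file.
rulebook_delay <<'EOF'
- name: Monitor IIS
  hosts: all
  sources:
    - ansible.eda.url_check:
        urls:
          - http://windows.aap.svc.cluster.local
        delay: 10
EOF
```

Because the source polls once per interval, an outage that begins just after a poll is not seen until the next one, so the delay value is roughly the worst-case detection lag in seconds.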