/aiops-skill:logs-fetcher

📋 Log Retrieval

Fetch Ansible/AAP job logs from a remote server via SSH using either a time-range window or specific job numbers, then transfer them locally for analysis.


When to Use

💡
Invoke this skill when you want to:
  • Download logs for a known set of job IDs before root cause analysis
  • Fetch all failed jobs from a specific incident time window
  • Retrieve recent processed or ignored job logs for investigation
  • Pull logs with minute- or second-level precision when troubleshooting a narrow time window

Example invocations:

"Fetch logs from the last 2 hours"
"Get logs for jobs 1234567, 1234568, and 1234569"
"Download all processed logs from 2025-12-10 08:00 to 2025-12-10 17:00"
"Fetch the 20 most recent failed job logs"

Prerequisites

🔑

SSH Access Required

A working SSH profile to the remote log server. The skill uses rsync over SSH, so passwordless key-based auth is expected.

🖥️

REMOTE_HOST Required

The SSH host alias as defined in ~/.ssh/config. The skill connects to this host to discover and transfer log files.

📂

REMOTE_DIR Required

Directory on the remote server where job log files reside (e.g., /var/log/aap/jobs).

ℹ️
SSH config setup: Add your remote host to `~/.ssh/config` for passwordless access:

```
Host log-server
    HostName logs.example.com
    User your-username
    Port 22
    IdentityFile ~/.ssh/id_rsa
```

Test with `ssh log-server`; it should connect without a password prompt.

Two Fetch Modes

Option A: Fetch by Time Range

Use when you know the incident window

  • Filter by start and end time with minute or second precision
  • Combine with --limit to cap the number of files
  • Use --order desc to get newest logs first
  • Supports processed, ignored, or all log types
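
The precision rules above can be sketched in Python. This is a minimal illustration, not the skill's actual implementation; `parse_time` is a hypothetical helper whose behavior is inferred from the flags described here:

```python
from datetime import datetime, timedelta

# Formats tried in order: second, minute, then day precision.
_FORMATS = ["%Y-%m-%d %H:%M:%S", "%Y-%m-%d %H:%M", "%Y-%m-%d"]

def parse_time(value: str, end: bool = False) -> datetime:
    """Parse a --start-time/--end-time value at second, minute, or day precision.

    When `end` is True, a day-only value expands to the end of that day, so
    --start-time "2025-12-10" --end-time "2025-12-10" covers the whole day.
    """
    for fmt in _FORMATS:
        try:
            ts = datetime.strptime(value, fmt)
        except ValueError:
            continue
        if end and fmt == "%Y-%m-%d":
            ts += timedelta(days=1) - timedelta(seconds=1)
        return ts
    raise ValueError(f"unrecognized time format: {value!r}")
```

Trying formats from most to least specific keeps a single flag usable for all three precisions without extra mode switches.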
Option B: Fetch by Job Number

Use when you know specific job IDs

  • Pass one or more job numbers directly
  • Works with or without the job_ prefix
  • Fetches all transform statuses for each job
  • Automatically locates matching files on the remote server
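
The prefix handling could work along these lines. A hedged sketch: the function names are illustrative, and only the filename pattern (`job_<id>.json.gz.transform-<status>`) is taken from this document:

```python
def normalize_job_id(arg: str) -> str:
    """Accept '1234567' or 'job_1234567' and return the canonical 'job_' form."""
    digits = arg[len("job_"):] if arg.startswith("job_") else arg
    if not digits.isdigit():
        raise ValueError(f"not a job number: {arg!r}")
    return f"job_{digits}"

def remote_globs(args: list[str]) -> list[str]:
    # One glob per job so every transform status (processed, ignored, ...) matches.
    return [f"{normalize_job_id(a)}.json.gz.transform-*" for a in args]
```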

Workflow

  1. Option A: Fetch by Time/Mode

    ```bash
    # Fetch recent logs (newest first, limit 20)
    python -m scripts.fetch_logs_ssh \
      --mode processed \
      --order desc \
      --limit 20 \
      --local-dir .incidents//raw_logs

    # Fetch logs in a specific time range
    python -m scripts.fetch_logs_ssh \
      --mode processed \
      --start-time "2025-12-09 08:00:00" \
      --end-time "2025-12-10 17:00:00" \
      --local-dir .incidents//raw_logs

    # Fetch all logs from a single day
    python -m scripts.fetch_logs_ssh \
      --mode all \
      --start-time "2025-12-10" \
      --end-time "2025-12-10" \
      --local-dir .incidents//raw_logs
    ```
  2. Option B: Fetch by Job Number

    ```bash
    # With job_ prefix
    python -m scripts.fetch_logs_by_job \
      job_1234567 job_1234568 job_1234569 \
      --local-dir .incidents//raw_logs

    # Without prefix (both forms work)
    python -m scripts.fetch_logs_by_job \
      1234567 1234568 1234569 \
      --local-dir .incidents//raw_logs
    ```
  3. Verify the Transfer

    Check that files were transferred. Note job IDs from filenames (e.g., job_1234567.json.gz.transform-processed) and confirm the time range matches the incident window.
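
    A quick local check can group the fetched files by job ID. This is a sketch, not part of the skill: `summarize_fetched` is a hypothetical helper, and the filename pattern is the one shown above:

    ```python
    import re
    from pathlib import Path

    def summarize_fetched(local_dir: str) -> dict[str, list[str]]:
        """Group fetched log files by job ID, e.g. {'1234567': ['transform-processed']}."""
        pattern = re.compile(r"^job_(\d+)\.json\.gz\.(transform-\w+)$")
        summary: dict[str, list[str]] = {}
        for path in sorted(Path(local_dir).iterdir()):
            m = pattern.match(path.name)
            if m:
                summary.setdefault(m.group(1), []).append(m.group(2))
        return summary
    ```

    Run it against the directory you passed as `--local-dir` to confirm every expected job ID is present.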

---

## Time Filter Format

| Format | Example | Precision |
|---|---|---|
| Full timestamp | `"2025-12-10 14:30:45"` | Second |
| Minute precision | `"2025-12-10 14:30"` | Minute |
| Day only | `"2025-12-10"` | Day |

Time filters can be combined with `--limit` and `--order`:

```bash
--start-time "2025-12-10 00:00" --limit 10 --order desc
```

---

## Configuration

| Variable | Purpose | Example |
|---|---|---|
| `REMOTE_HOST` | SSH host alias | `log-server` |
| `REMOTE_DIR` | Remote log directory | `/var/log/aap/jobs` |
| `DEFAULT_LOCAL_DIR` | Default local output directory | `~/aiops_extracted_logs` |

---

## Next Steps

Once logs are fetched locally, run root cause analysis:

```
"Analyze job 1234567 for root cause"
```

The root-cause-analysis skill automatically finds logs in `JOB_LOGS_DIR` and proceeds through the 5-step pipeline.

---