Agent Architecture

The RHDP Skills Marketplace follows the skill-as-orchestrator, agent-as-worker pattern β€” the same design used by the FTL plugin.


Core Principle

Skills handle user interaction and coordination. Agents do the actual work in parallel, each with a fresh context window.

RHDP Skills Marketplace agent architecture overview

Why agents instead of inline checks?

Old approach Agent approach
All checks in one context window Each agent gets a fresh context β€” no saturation
Sequential β€” each check waits for the previous Parallel β€” N modules reviewed simultaneously
8 minutes for a 6-module lab ~90 seconds (6Γ— faster)
Model sees everything β€” can’t score dimensions Each agent returns structured JSON β€” eval-ready

All Agents

Showroom Plugin

Agent Model Purpose Called by
showroom:scaffold-checker Haiku Checks site.yml, ui-config.yml, antora.yml, gh-pages.yml verify-content
showroom:module-reviewer Sonnet Reviews one .adoc module β€” B/C/D/E/F checks + dimension scores verify-content, create-lab, create-demo
showroom:file-generator Sonnet Generates one AsciiDoc file (or Markdown blog) from spec create-lab, create-demo, blog-generate
showroom:score-aggregator Haiku Aggregates dimension scores, detects regressions vs baseline showroom:eval (future)
showroom:doc-writer Sonnet Generates/updates GitHub Pages docs from SKILL.md any skill after changes

FTL Plugin (reference pattern)

Agent Model Purpose
ftl:content-reader Sonnet Reads .adoc, extracts tasks and classifies steps
ftl:solve-writer Sonnet Writes solve.yml from content-reader output
ftl:validate-writer Sonnet Writes validate.yml playbooks
ftl:env-connector Sonnet Pushes to live showroom, runs test cycle

AgnosticV Plugin

Agent Model Purpose
agnosticv:workflow-reviewer Sonnet Checks builder/validator skill consistency

How Skills Spawn Agents

Skills use the Task tool with subagent_type:

Task tool:
  subagent_type: showroom:module-reviewer
  prompt: |
    MODULE_FILE: /path/to/03-module-01.adoc
    CONTENT_TYPE: workshop
    LAB_TYPE: ocp
    SHARED_CONTEXT: {"module_order": [...], "defined_attributes": {...}}
    REPO_PATH: /path/to/repo

All agents return structured JSON only β€” no prose, no tables. The orchestrating skill handles presentation.


verify-content Orchestration

verify-content orchestration diagram

create-lab Orchestration

create-lab orchestration diagram


Model Assignments

Tier Model Used for
Orchestrators Sonnet 4.6 All skills β€” user interaction + coordination
Generation/review agents Sonnet 4.6 module-reviewer, file-generator, doc-writer, workflow-reviewer
Reading/computation agents Haiku 4.5 scaffold-checker, score-aggregator
FTL agents Sonnet 4.6 content-reader, solve-writer, validate-writer, env-connector

Skill Evaluation Framework

One of the most important aspects of this architecture is that agent outputs are structured JSON with dimension scores β€” making the skills measurable and comparable across versions.

The Challenge

Evaluating content creation skills is hard:

The agnosticv:validator is the easy case β€” formal rules, deterministic output, pass/fail clear. Evaluating create-lab is genuinely hard.

What the Score-Aggregator Enables

The showroom:score-aggregator (Haiku) receives module-reviewer JSON outputs and produces:

{
  "dimensions": {
    "structure":      {"current": 0.88, "baseline": 0.85, "delta": "+0.03"},
    "pedagogy":       {"current": 0.72, "baseline": 0.78, "delta": "-0.06 ⚠️"},
    "style":          {"current": 0.94, "baseline": 0.93, "delta": "+0.01"},
    "intro_quality":  {"current": 0.91, "baseline": 0.84, "delta": "+0.07"}
  },
  "verdict": "REGRESSION β€” pedagogy degraded by 0.06 on OCP labs",
  "lab_type": "ocp"
}

This catches the case where β€œintro_quality improved but pedagogy regressed on OCP labs” β€” per-dimension, per-lab-type regression detection that no single score could surface.

The Pioneer Opportunity

Skill evals for content generation skills don’t exist in the industry at this level. The patterns here β€” structured JSON agents, dimension scoring, golden datasets, lab_type tagging β€” could be applied beyond RHDP:

The agnosticv:validator provides the immediate proof: formal output schema, automated scoring, no ambiguity. The generative skills (create-lab, create-demo) are harder but the architecture makes it possible for the first time.

See Writing Style Guide for how personal style profiles interact with the eval framework.