Workshop details
Timing and schedule
Full workshop (1 hour 30 minutes)
- Module 1: The Agentic App and Why Observability Matters (15 minutes)
- Module 2: Observability Pillars, Concepts, and Personas (10 minutes)
- Module 3: Metrics and Logs for Agentic Applications (15 minutes)
- Module 4: Tracing and MLflow (20 minutes)
- Module 5: LLM Evaluations (15 minutes)
- Module 6: From Development to Production (15 minutes)
Technical requirements
Software versions
- Red Hat OpenShift Container Platform 4.20
- Red Hat OpenShift AI 3.4
- MLflow 3.10.1
- LangGraph/LangChain (latest stable)
- Grafana/Perses for dashboards
- Web browser (Chrome, Firefox, Safari, Edge)
Environment setup
Pre-workshop checklist
□ OpenShift access confirmed - Test login credentials at https://console-openshift-console.apps.cluster.example.com (Module 1 includes a guided walkthrough)
□ CLI tools installed - Verify oc client installation
□ Python environment ready - Python 3.11+ with pip
□ Workshop repository cloned - Clone the multi-agent loan app repository
□ Network connectivity verified - Test access to required URLs
Setup validation
Participants should run these commands to verify setup:
# Verify OpenShift CLI
oc version
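# Log in to the OpenShift cluster.
# Note: `oc whoami --show-server` reads the API URL from an existing kubeconfig;
# on a fresh machine, pass the API server URL provided by the facilitator instead.
# --insecure-skip-tls-verify is acceptable for a lab cluster only.
oc login --insecure-skip-tls-verify $(oc whoami --show-server) -u user1 -p openshift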
# Verify project access
oc project
# Test connectivity to MLflow
curl -s https://mlflow-redhat-ods-applications.apps.cluster.example.com/health | head -5
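The local items on the checklist (Python version, CLI tools on the PATH) can also be checked in one pass. The following is a minimal sketch, not part of the workshop repository: the function names are hypothetical, and the tool list (`oc`, `curl`) is assumed from the commands above. It does not test cluster login or network access.

```python
import shutil
import sys

MIN_PYTHON = (3, 11)  # minimum version from the pre-workshop checklist


def check_python(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


def check_tool(name):
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


def run_checks(tools=("oc", "curl")):
    """Return a {check name: passed} mapping; never raises on a failed check."""
    results = {f"python>={MIN_PYTHON[0]}.{MIN_PYTHON[1]}": check_python()}
    for tool in tools:
        results[f"{tool} on PATH"] = check_tool(tool)
    return results


# Print one line per check so missing prerequisites are easy to spot.
for check_name, passed in run_checks().items():
    print(f"[{'ok' if passed else 'MISSING'}] {check_name}")
```

Any line marked `MISSING` corresponds to an entry in the troubleshooting guide below, such as installing the OpenShift CLI.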
Troubleshooting guide
Common setup issues
Problem: "error: You must be logged in to the server (Unauthorized)" → Solution: Re-run the oc login command and verify the username and password.
Problem: "oc: command not found" → Solution: Download the OpenShift CLI from https://mirror.openshift.com/pub/openshift-v4/clients/ocp/stable/openshift-client-linux.tar.gz (Linux build), extract it, and add the oc binary to your PATH.
Problem: "Unable to connect to the server: dial tcp: lookup api.cluster…" → Solution: Verify network connectivity and check whether a VPN is required. If the problem persists, contact the workshop facilitator.
Problem: "Permission denied" when accessing the MLflow UI → Solution: Make sure you are logged in to OpenShift. MLflow uses OpenShift OAuth for authentication.
During workshop support
- Encourage participants to help each other
- Use the bastion host for CLI access if local setup fails:
  ssh lab-user@bastion.example.com
- Have backup environments ready for technical difficulties
- Use screen sharing for complex troubleshooting
Follow-up resources
Next steps for participants
- Red Hat OpenShift AI Documentation: Official product documentation
- MLflow Tracing Documentation: Deep dive into GenAI tracing capabilities
- LangGraph Documentation: Build stateful multi-agent applications
- Model Context Protocol: MCP specification and tools
Glossary
Key terms used in this workshop
| Term | Definition |
|---|---|
| AgentOps | Agent Operations, the discipline of monitoring, tracing, evaluating, and maintaining AI agent systems in production |
| AI Agent | A system that uses an LLM to reason about tasks, decide which tools to call, and take autonomous actions |
| LLM | Large Language Model, an AI model trained on large text datasets that can generate and understand natural language |
| MCP | Model Context Protocol, a standard for connecting AI agents to external tools and data sources |
| LangGraph | A framework for building stateful, multi-agent AI workflows, built on top of LangChain |
| RAG | Retrieval-Augmented Generation, a pattern that enhances LLM responses by retrieving relevant documents before generating answers |
| Trace | A complete record of a request’s journey through a distributed system, composed of spans |
| Span | A single operation within a trace (e.g., 1 LLM call, 1 tool invocation) |
| Scorer | A function that evaluates the quality of an agent’s response (deterministic or LLM-powered) |
| Inner Loop | Manual, developer-driven evaluation workflow (e.g., running evaluations from a Jupyter notebook) |
| Outer Loop | Automated, platform-driven evaluation workflow (e.g., scheduled AI Pipelines) |
| RBAC | Role-Based Access Control, restricting system access based on user roles |
| pgvector | A PostgreSQL extension that enables vector similarity search for embeddings |
| PromQL | Prometheus Query Language, used to query metrics in Grafana dashboards |
| LogQL | Log Query Language, used to query logs in LokiStack/Grafana Loki |
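
To make the Trace, Span, and Scorer entries concrete, here is a small illustrative sketch in plain Python. It is not the MLflow data model or API: the class names, fields, and the keyword-based scorer are all hypothetical, and the total duration assumes spans run one after another without nesting.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """One operation inside a trace, e.g. a single LLM call or tool invocation."""
    name: str
    duration_ms: float


@dataclass
class Trace:
    """A request's full journey, recorded as an ordered list of spans."""
    request_id: str
    spans: list[Span] = field(default_factory=list)

    def total_duration_ms(self) -> float:
        # Assumption: spans are sequential and not nested, so durations add up.
        return sum(span.duration_ms for span in self.spans)


def contains_decision(response: str) -> bool:
    """Deterministic scorer: passes if the response states an explicit decision.

    The keywords are hypothetical examples for a loan-style agent.
    """
    text = response.lower()
    return "approved" in text or "denied" in text


# Example: one request that made an LLM call and then a tool invocation.
trace = Trace("req-1", [Span("llm_call", 120.0), Span("tool:credit_check", 35.5)])
print(trace.total_duration_ms())                  # 155.5
print(contains_decision("The loan is approved."))  # True
```

A deterministic scorer like this is cheap and repeatable; an LLM-powered scorer would replace the keyword check with a model call that judges the response.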