Conclusion
What You’ve Accomplished
You’ve deployed and operated a production-ready AI agent system on Red Hat AI. Through this hands-on workshop, you:
- Learned how Llama Stack works, including how to add new providers and extend your agent's functionality through the framework
- Deployed an agentic application using Llama Stack with multi-tool capabilities
- Configured observability with OpenTelemetry distributed tracing
- Analyzed performance using trace data to identify bottlenecks and optimize costs
Key Technical Takeaways
Production AI Requires Much More Than Models and Agent Apps
The agent itself was pre-built, but making it production-ready required:
- Tool integration for real-world capabilities (OpenShift API, web search, GitHub)
- Safety controls to prevent unintended actions
- Observability infrastructure to understand what's happening in production
- GitOps workflows for repeatable, version-controlled deployments
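To make the safety-controls point concrete, here is a minimal sketch of a tool-call guard. It is not the workshop's actual implementation: the tool names and the policy sets are hypothetical, and a real deployment would load its policy from configuration rather than hard-coding it.

```python
# Hypothetical safety policy: tool names below are illustrative, not
# the workshop's actual tools.
ALLOWED_ACTIONS = {"get_pods", "web_search", "list_issues"}
DESTRUCTIVE_ACTIONS = {"delete_pod", "scale_deployment"}

def guarded_tool_call(action: str, execute, *args, **kwargs):
    """Run a tool call only if the action passes the safety policy.

    Destructive or unknown actions are rejected before execution, so the
    agent cannot take unintended actions even if the model requests them.
    """
    if action in DESTRUCTIVE_ACTIONS:
        raise PermissionError(f"Blocked destructive tool call: {action}")
    if action not in ALLOWED_ACTIONS:
        raise PermissionError(f"Unknown tool call rejected: {action}")
    return execute(*args, **kwargs)
```

Placing the check in front of the executor, rather than inside each tool, keeps the policy in one place and makes it auditable.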
Observability is Critical
Without proper observability and tracing, you’re flying blind. The traces revealed:
- Where time is actually spent (tool execution vs. inference vs. overhead)
- Token consumption patterns that drive costs
- Potential security issues (sensitive data in traces)
- Opportunities for optimization
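The kind of analysis described above can be sketched over exported span data. The records and field names below are hypothetical stand-ins for attributes you might export via OpenTelemetry; your traces will use different names, but the aggregation idea is the same.

```python
from collections import defaultdict

# Hypothetical span records, mimicking fields exported from traces;
# the actual attribute names in your pipeline may differ.
spans = [
    {"name": "llm.inference", "kind": "inference", "duration_ms": 1800, "tokens": 950},
    {"name": "tool.web_search", "kind": "tool", "duration_ms": 1200, "tokens": 0},
    {"name": "tool.openshift_api", "kind": "tool", "duration_ms": 400, "tokens": 0},
    {"name": "llm.inference", "kind": "inference", "duration_ms": 2100, "tokens": 1300},
]

def summarize(spans):
    """Aggregate time per span kind and total token consumption."""
    time_by_kind = defaultdict(int)
    total_tokens = 0
    for s in spans:
        time_by_kind[s["kind"]] += s["duration_ms"]
        total_tokens += s["tokens"]
    return dict(time_by_kind), total_tokens

time_by_kind, tokens = summarize(spans)
# With the sample data, inference accounts for 3900 ms vs. 1600 ms for
# tools, and 2250 tokens were consumed across the request.
```

Even this small aggregation answers the first two questions in the list: where the time goes, and how token consumption accumulates across a request.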
Resources
- Red Hat AI: https://docs.redhat.com/en/documentation/red_hat_ai/3
- Llama Stack: https://github.com/llamastack/llama-stack
- Model Context Protocol: https://modelcontextprotocol.io
- OpenTelemetry: https://opentelemetry.io