Before Agent Playground existed, configuring an AI agent at an enterprise meant filing a request and waiting for an engineer. I built the product from zero — a complete agent assembly and testing environment that lets both technical and non-technical users build production-ready agents themselves.
Cortx had AI agents. Enterprise clients wanted to use them. The gap was everything in between — no environment to interact with an agent, understand its behaviour, adjust its configuration, or validate it before production.
Two completely different users. The same root problem: no place to build, test, and trust an agent before it goes live.
An agent's behaviour is a function of everything that goes into it. A playground that only shows chat output isn't a playground; it's a demo. I defined Agent Playground as a five-layer assembly environment.
Users select and switch between the underlying AI models powering the agent. Different models have different strengths — response quality, speed, cost, context window. The playground makes that comparison possible without an engineering ticket.
Most enterprise tools are built for one type of user and quietly ignore the other. I defined Agent Playground to serve two distinct users on the same underlying framework — different entry points, not different products.
Non-technical users know what they want the agent to do but have never thought about models, tools, or knowledge sources as separate configurable layers. They needed progressive disclosure: the most important controls first, with advanced configuration available but not imposed.
Technical users know what they're doing. They don't need guidance through every layer; they want direct access to configuration, fast. They needed an advanced mode that bypasses the guided flow and lets them work directly on the assembly environment.
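The guided and advanced entry points described above can be sketched as one control list filtered two ways. This is a minimal illustration, not the product's implementation: the `Control` type, the control names, and the `visible_controls` function are all hypothetical, and which controls count as "advanced" is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Control:
    name: str
    advanced: bool  # True if hidden in the guided flow until the user expands it


# Hypothetical controls; the text does not list what each layer actually exposes.
CONTROLS = [
    Control("model", advanced=False),
    Control("knowledge_sources", advanced=False),
    Control("tools", advanced=True),
]


def visible_controls(mode: str, expanded: bool = False) -> list[str]:
    """Return the control names shown for 'guided' or 'advanced' mode."""
    if mode not in ("guided", "advanced"):
        raise ValueError(f"unknown mode: {mode}")
    # Advanced mode shows everything; guided mode shows everything only on request.
    show_all = mode == "advanced" or expanded
    return [c.name for c in CONTROLS if show_all or not c.advanced]
```

Both modes read from the same list, which mirrors the idea of different entry points onto one framework rather than two separate products.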
The hardest product decision I made touched every other decision: how much control do you give users, and where do you put the limits? Neither extreme was right.
The minimal version was a chat window: simple, shippable, defensible. But a chat window shows you the output of an agent, not the system producing it. If a user sees something they don't like, they have no way to understand why it happened or what to change.
I could have made version saving an explicit action — a button the user clicks. Simpler to implement, cleaner interface. But non-technical users don't think in versions. They configure, test, adjust, configure again. If saving requires a deliberate action, most users never do it.
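The alternative to an explicit save button is to snapshot the configuration whenever it changes. The sketch below assumes that approach; `VersionHistory`, its methods, and the config keys are hypothetical names used only for illustration, not the product's actual design.

```python
from copy import deepcopy
from dataclasses import dataclass, field
from typing import Any


@dataclass
class VersionHistory:
    """Hypothetical store that snapshots an agent config on every change."""
    versions: list[dict[str, Any]] = field(default_factory=list)

    def record(self, config: dict[str, Any]) -> int:
        """Save a snapshot unless it matches the latest one; return the version number."""
        if self.versions and self.versions[-1] == config:
            return len(self.versions)  # nothing changed, so no new version
        # Deep copy so later edits to the live config cannot alter saved versions.
        self.versions.append(deepcopy(config))
        return len(self.versions)

    def restore(self, version: int) -> dict[str, Any]:
        """Return a copy of the config as it was at a 1-based version number."""
        return deepcopy(self.versions[version - 1])


history = VersionHistory()
config = {"model": "model-a", "tools": []}  # illustrative values only
history.record(config)                      # first snapshot
config["tools"].append("search")
history.record(config)                      # second snapshot
history.record(config)                      # unchanged, so no third snapshot
```

Because the user never triggers a save, every configure-test-adjust cycle leaves a version they can return to with `restore`.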
When an agent produces a bad output, the system could respond two ways: show an error, or show an explanation. An error tells the user something went wrong. An explanation tells them what went wrong and what to change.
Good product decisions aren't just about what you ship. They're about what you choose not to — and being clear-eyed about why.
Agent Playground didn't produce a single headline metric. What it produced was a shift in how enterprise clients related to their AI agents — from passive consumers of engineering output to active builders of their own tools.