Why traces matter
When an agent gives a wrong answer or makes a bad decision, traces tell you exactly why. You can see:
- Which agent was active at each point.
- What tools were called and what they returned.
- What the LLM was thinking (reasoning summaries).
- When and why handoffs occurred.
- Token usage and response times per step.
- The full input/output of every LLM call.
Viewing traces
- Go to Traces in the sidebar.
- Browse the list of recent traces, or filter by:
  - Workflow type (Storefront MAS, Email MAS, etc.)
  - Date range
  - Group ID or Trace ID
- Click a trace to open the detail view.
Trace list
Each trace in the list shows:

| Column | Description |
|---|---|
| Workflow | The MAS type that generated this trace (Storefront, Email, Instagram, Facebook). |
| Flow | The chain of agents involved (e.g., Triage > Orders > Refund). |
| Execution time | Total time from start to finish. |
| Tokens | Total tokens consumed across all LLM calls. |
| Group ID | Links traces to their conversation/session. All traces from the same conversation share a Group ID. |
| Created at | When the execution started. |
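The Group ID column above can be illustrated with a small sketch: traces from the same conversation share a `group_id`, so grouping by it reassembles a session. The record shape and field names here are assumptions for illustration, not the product's actual export format.

```python
from collections import defaultdict

# Hypothetical trace-list rows; only the fields relevant to grouping
# are shown (trace_id, group_id, workflow).
traces = [
    {"trace_id": "t1", "group_id": "conv_42", "workflow": "Storefront"},
    {"trace_id": "t2", "group_id": "conv_42", "workflow": "Storefront"},
    {"trace_id": "t3", "group_id": "conv_99", "workflow": "Email"},
]

# Collect trace IDs per conversation: all traces sharing a Group ID
# belong to the same conversation/session.
by_group = defaultdict(list)
for t in traces:
    by_group[t["group_id"]].append(t["trace_id"])

print(dict(by_group))  # {'conv_42': ['t1', 't2'], 'conv_99': ['t3']}
```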
Trace detail
The detail view shows a hierarchical span tree: a visual representation of everything that happened during the execution.

Span types
| Span type | Icon | What it represents |
|---|---|---|
| Agent | Component icon | An agent’s turn in the execution. Shows which agent was active, what tools it had, and what handoffs were available. |
| Function | Wrench icon | A tool call. Shows the input parameters and output result. For MCP tools, also shows the server name. |
| Generation | Brain icon | An LLM call. Shows the input messages, output, model used, token usage, and configuration (tool choice, reasoning effort, verbosity). |
| Response | Message icon | The API response from the LLM provider. Contains the raw request and response data, including the response ID. |
| Handoff | Arrow icon | A handoff from one agent to another. Shows the source and target agents. |
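The span types in the table above form a nested tree. The following is a minimal sketch of how such a tree might be modeled; the class and field names are assumptions for illustration, not the product's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class Span:
    # One of: "agent", "function", "generation", "response", "handoff"
    span_type: str
    # Agent name, tool name, model name, or handoff description
    name: str
    duration_ms: float = 0.0
    children: list["Span"] = field(default_factory=list)

# A minimal example trace: a Triage agent makes an LLM call, hands off
# to an Orders agent, and the Orders agent calls a lookup tool.
trace = Span("agent", "Triage", 1200.0, [
    Span("generation", "gpt-4o", 800.0),
    Span("handoff", "Triage -> Orders", 5.0),
    Span("agent", "Orders", 395.0, [
        Span("function", "lookup_order", 250.0),
    ]),
])
```

Note how the nested Agent span mirrors what the detail view renders: the child agent's execution appears inside the parent, with its own generations and function calls.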
Reading a trace
Traces are displayed as a tree. The root is usually an Agent span (your entry agent). Inside it, you’ll see:
- Generation spans: The LLM calls the agent made.
- Function spans: Tools the agent called (with inputs and outputs).
- Handoff spans: If the agent handed off, you’ll see the target agent.
- Nested agent spans: The child agent’s execution, with its own generations, functions, and potential handoffs.
Span details
Click any span to see its full details:
- Agent spans: Name, available tools, handoffs, output type.
- Function spans: Tool name, input parameters, output data, MCP server info (if applicable).
- Generation spans: Full input/output messages, model name, model settings (tool choice, reasoning effort, reasoning summary, verbosity), token usage breakdown, and the response format type.
- Response spans: Response ID (copyable), model, token usage with detailed breakdown, functions called, web search calls, and configuration details.
- Handoff spans: From agent to target agent.
- Error info: If a span failed, the error message and detailed error data.
What you see in Generation spans
Generation spans are the most information-rich. They show:
- Input: All messages sent to the LLM, including function call results, web search results, and reasoning blocks.
- Output: The LLM’s response, including any function calls it decided to make, reasoning blocks, and text output.
- Configuration: Model name, tool choice setting, reasoning effort, reasoning summary level, and verbosity.
- Token usage: Input tokens, output tokens, and total, with a detailed breakdown popup.
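The arithmetic behind the token numbers is simple: a trace's total is the sum of input and output tokens across all its Generation spans. A sketch, assuming a hypothetical record shape:

```python
# Hypothetical per-Generation token usage records; the real UI shows
# the same breakdown in a popup on each Generation span.
generations = [
    {"input_tokens": 1500, "output_tokens": 300},   # first LLM call
    {"input_tokens": 2200, "output_tokens": 450},   # second LLM call
]

# Trace-level total = sum of (input + output) over all Generation spans.
total = sum(g["input_tokens"] + g["output_tokens"] for g in generations)
print(total)  # 4450
```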
Using traces for debugging
Agent gave the wrong answer?
- Find the trace for that conversation.
- Look at the Generation spans to see what the LLM received as input and what it produced.
- Check if the instructions are clear enough, or if the agent needs more context.

Agent didn’t use a tool it should have?
- Check the Agent span to confirm the tool was assigned.
- Look at the Generation span to see if the LLM considered using it.
- Improve the tool description or add explicit instructions about when to use it.

Handoff went to the wrong agent?
- Find the Handoff span to see which agent was chosen.
- Review the handoff descriptions of all candidate agents.
- Make descriptions more specific to help the LLM distinguish between them.

Execution too slow?
- Check the timeline bars to see which spans took the longest.
- Look at Function spans for slow tool calls (especially external API tools).
- Check Generation spans for slow LLM calls (reasoning models are slower).
- Use MAS Stats for aggregate performance data.
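The "which spans took the longest" check amounts to flattening the span tree and sorting by duration. A minimal sketch, assuming a hypothetical dict-based span shape (not the product's export format):

```python
def flatten(span):
    """Yield a span and all of its descendants, depth-first."""
    yield span
    for child in span.get("children", []):
        yield from flatten(child)

# Hypothetical trace: an agent turn containing one LLM call and one
# tool call. duration_ms fields are illustrative.
trace = {
    "type": "agent", "name": "Triage", "duration_ms": 1200,
    "children": [
        {"type": "generation", "name": "gpt-4o", "duration_ms": 800, "children": []},
        {"type": "function", "name": "lookup_order", "duration_ms": 250, "children": []},
    ],
}

# Rank all spans by duration, slowest first.
slowest = sorted(flatten(trace), key=lambda s: s["duration_ms"], reverse=True)
for s in slowest:
    print(s["type"], s["name"], s["duration_ms"])
# agent Triage 1200
# generation gpt-4o 800
# function lookup_order 250
```

Note that a parent span's duration includes its children, so a slow Agent span usually points to a slow Generation or Function span nested inside it.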
AI Generated badge in the Inbox
Every AI-generated message in the Inbox is tagged with an “AI Generated” badge (shown as dotted-underlined text next to the timestamp). Hover over it to see a trace hover card: a compact summary of the MAS execution that produced that message, including:
- The MAS name and execution time.
- A mini span tree showing agents, tool calls, handoffs, and response times.
- Timeline bars for each span so you can spot bottlenecks at a glance.
- A View link that opens the full trace detail page.

