# Token & Cost Intel
The Token & Cost Intel panel provides visibility into token usage and cost across your fleet.
## What You See
- Total tokens per model — breakdown of input and output tokens by model
- Cost by provider — aggregated cost data across providers
- Agent-level consumption — which agents are using the most tokens
- Trends over time — token and cost trends with time-series visualization
- Snapshots — periodic cost snapshots for tracking spend over time
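All of these views are aggregations over raw traffic entries. As a rough sketch of how the rollups work (the entry fields, agent names, and model names below are illustrative, not the panel's actual schema):

```python
from collections import defaultdict

# Hypothetical traffic entries; field names are illustrative only.
entries = [
    {"agent": "researcher", "provider": "openai", "model": "gpt-4o", "input": 1200, "output": 300},
    {"agent": "coder", "provider": "anthropic", "model": "claude-sonnet", "input": 8000, "output": 2500},
    {"agent": "coder", "provider": "anthropic", "model": "claude-sonnet", "input": 5000, "output": 1800},
]

# Total tokens per model: input/output breakdown.
tokens_by_model = defaultdict(lambda: {"input": 0, "output": 0})
# Agent-level consumption: total tokens per agent.
tokens_by_agent = defaultdict(int)

for e in entries:
    tokens_by_model[e["model"]]["input"] += e["input"]
    tokens_by_model[e["model"]]["output"] += e["output"]
    tokens_by_agent[e["agent"]] += e["input"] + e["output"]

print(tokens_by_model["claude-sonnet"])  # {'input': 13000, 'output': 4300}
print(max(tokens_by_agent, key=tokens_by_agent.get))  # coder
```

The same grouping keyed on `provider` (and bucketed by timestamp) yields the cost-by-provider and trend views.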
## Understanding Token Usage
Each traffic entry records:
- Input tokens — tokens in the prompt
- Output tokens — tokens in the response
- Total tokens — input + output
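Cost follows directly from these fields: input and output tokens are each billed at their own per-million-token rate. A minimal sketch, using a placeholder model name and made-up prices (not real provider rates):

```python
# Placeholder prices in USD per million tokens (not real provider rates).
PRICES = {"example-model": {"input": 3.00, "output": 15.00}}

def entry_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Bill input and output tokens separately at their per-million rates."""
    p = PRICES[model]
    return (input_tokens / 1_000_000) * p["input"] + (output_tokens / 1_000_000) * p["output"]

total_tokens = 1_200 + 300                      # total = input + output
cost = entry_cost("example-model", 1_200, 300)  # 0.0036 + 0.0045
print(total_tokens, round(cost, 6))             # 1500 0.0081
```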
Common cost drivers:
- Long system prompts (SOUL.md, MEMORY.md loaded into every request)
- Multi-turn conversations (previous messages re-sent each turn)
- Large code outputs from code generation tasks
- Reasoning models (internal thinking tokens count toward output)
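The multi-turn driver is easy to underestimate: because every prior message is re-sent on each turn, cumulative input tokens grow roughly quadratically with conversation length. A toy illustration with fixed per-message and system-prompt sizes (a simplification; real messages vary):

```python
def cumulative_input_tokens(turns: int, msg_tokens: int = 500, system_tokens: int = 2000) -> int:
    """Sum each turn's prompt size: system prompt plus all messages re-sent so far."""
    total = 0
    for t in range(1, turns + 1):
        # Turn t sends the system prompt plus (2t - 1) messages:
        # t user messages and t - 1 assistant replies.
        total += system_tokens + (2 * t - 1) * msg_tokens
    return total

print(cumulative_input_tokens(5))   # 22500
print(cumulative_input_tokens(20))  # 240000 — 4x the turns, ~11x the input tokens
```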
## Optimization Tips
- Right-size models: use lighter models for simple tasks, larger models for complex reasoning
- Monitor reasoning tokens on models with `reasoning: true`
- Watch for runaway agents stuck in loops burning tokens rapidly
- Set context windows appropriately per model
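One simple way to catch a runaway agent is to alert when its token usage over a short sliding window exceeds a threshold. A minimal sketch (the window size and limit are arbitrary examples, not panel defaults):

```python
from collections import deque

class TokenRateMonitor:
    """Flags an agent whose token usage within a sliding time window exceeds a limit."""

    def __init__(self, window_seconds: float = 60.0, max_tokens: int = 50_000):
        self.window = window_seconds
        self.max_tokens = max_tokens
        self.events = deque()  # (timestamp, tokens) pairs, oldest first
        self.total = 0

    def record(self, timestamp: float, tokens: int) -> bool:
        """Record a traffic entry; return True if the window limit is now exceeded."""
        self.events.append((timestamp, tokens))
        self.total += tokens
        # Evict entries that have fallen out of the window.
        while self.events and self.events[0][0] < timestamp - self.window:
            _, old = self.events.popleft()
            self.total -= old
        return self.total > self.max_tokens

monitor = TokenRateMonitor(window_seconds=60, max_tokens=10_000)
print(monitor.record(0.0, 4_000))   # False
print(monitor.record(10.0, 4_000))  # False
print(monitor.record(20.0, 4_000))  # True: 12000 tokens in the last 60s
```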