Token & Cost Intel

The Token & Cost Intel panel shows how many tokens your fleet consumes and what that usage costs.

What You See

Total tokens per model — breakdown of input and output tokens by model
Cost by provider — aggregated cost data across providers
Agent-level consumption — which agents are using the most tokens
Trends over time — token and cost trends with time-series visualization
Snapshots — periodic cost snapshots for tracking spend over time

Understanding Token Usage

Each traffic entry records:

Input tokens — tokens in the prompt
Output tokens — tokens in the response
Total tokens — input + output
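
As a minimal sketch of how these three fields combine, the Python below sums input, output, and total tokens per model. The `TrafficEntry` type, its field names, and the sample values are hypothetical; the actual record schema is not described here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TrafficEntry:
    # Hypothetical record shape; field names are assumptions for illustration.
    agent: str
    model: str
    input_tokens: int
    output_tokens: int

    @property
    def total_tokens(self) -> int:
        # Total tokens = input + output
        return self.input_tokens + self.output_tokens


def tokens_by_model(entries):
    """Sum input, output, and total tokens for each model."""
    totals = defaultdict(lambda: {"input": 0, "output": 0, "total": 0})
    for entry in entries:
        bucket = totals[entry.model]
        bucket["input"] += entry.input_tokens
        bucket["output"] += entry.output_tokens
        bucket["total"] += entry.total_tokens
    return dict(totals)


# Illustrative data only; model and agent names are made up.
entries = [
    TrafficEntry("agent-a", "model-small", 1200, 300),
    TrafficEntry("agent-b", "model-large", 4000, 900),
    TrafficEntry("agent-a", "model-small", 800, 200),
]
summary = tokens_by_model(entries)
```

Grouping by `entry.agent` instead of `entry.model` would give the agent-level view in the same way.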

Common cost drivers:

Long system prompts (SOUL.md, MEMORY.md loaded into every request)
Multi-turn conversations (previous messages re-sent each turn)
Large code outputs from code generation tasks
Reasoning models (internal thinking tokens count toward output)
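
To see why long system prompts and multi-turn conversations dominate cost, the sketch below counts the input tokens sent over a conversation when the system prompt and full history are re-sent on every turn. It assumes no prompt caching and no history truncation, and the token counts are illustrative rather than measured.

```python
def cumulative_input_tokens(system_tokens, turns):
    """Total input tokens across a conversation with full history re-sent.

    turns is a list of (user_tokens, assistant_tokens) pairs, one per turn.
    """
    history = 0  # tokens from earlier user and assistant messages
    sent = 0     # running total of input tokens across all requests
    for user_tokens, assistant_tokens in turns:
        # Each request carries the system prompt, prior history, and the new message.
        sent += system_tokens + history + user_tokens
        history += user_tokens + assistant_tokens
    return sent


# A 2,000-token system prompt and three turns of 100 user / 400 assistant tokens.
total_input = cumulative_input_tokens(2000, [(100, 400)] * 3)
new_user_tokens = 3 * 100
```

Under these assumptions the three requests send 2,100, 2,600, and 3,100 input tokens, 7,800 in total, even though only 300 of those tokens are new user text.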

Optimization Tips

Right-size models: use lighter models for simple tasks, larger models for complex reasoning
Monitor reasoning tokens on models configured with reasoning: true, since internal thinking tokens count toward output
Watch for runaway agents stuck in loops burning tokens rapidly
Set each model's context window to fit its tasks rather than leaving every model at the maximum
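
One way to watch for runaway agents is to flag any agent whose token use within a recent time window exceeds a budget. The following is a rough sketch of that idea, not a description of how the panel works: the event tuple shape, the `find_runaway_agents` helper, and the threshold values are all assumptions.

```python
from collections import defaultdict


def find_runaway_agents(events, window_seconds, max_tokens, now):
    """Return agents whose token use inside the recent window exceeds max_tokens.

    events is an iterable of (timestamp_seconds, agent, total_tokens) tuples.
    """
    recent = defaultdict(int)
    for timestamp, agent, tokens in events:
        # Only count traffic that falls inside the window ending at `now`.
        if now - window_seconds <= timestamp <= now:
            recent[agent] += tokens
    return sorted(agent for agent, used in recent.items() if used > max_tokens)


# Illustrative events; agent names and numbers are made up.
events = [
    (0, "agent-a", 500),
    (100, "agent-b", 40000),
    (110, "agent-b", 45000),
    (115, "agent-a", 700),
]
flagged = find_runaway_agents(events, window_seconds=60, max_tokens=50000, now=120)
```

A short window with a high budget catches loops quickly without flagging agents that are simply busy; suitable values depend on your normal traffic.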

Agent Workspace Tools & Access