Token & Cost Intel

The Token & Cost Intel panel shows token usage and cost across your fleet, broken down by model, provider, and agent.

What You See

  • Total tokens per model — breakdown of input and output tokens by model
  • Cost by provider — aggregated cost data across providers
  • Agent-level consumption — which agents are using the most tokens
  • Trends over time — token and cost trends with time-series visualization
  • Snapshots — periodic cost snapshots for tracking spend over time

Understanding Token Usage

Each traffic entry records:

  • Input tokens — tokens in the prompt
  • Output tokens — tokens in the response
  • Total tokens — input + output
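As a rough illustration of how these three fields relate and roll up into a per-model breakdown, here is a minimal Python sketch. The `TrafficEntry` class, its field names, and the sample values are hypothetical and are not the dashboard's actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TrafficEntry:
    """Hypothetical shape of one traffic entry; field names are assumptions."""
    agent: str
    model: str
    input_tokens: int   # tokens in the prompt
    output_tokens: int  # tokens in the response

    @property
    def total_tokens(self) -> int:
        # Total is derived rather than stored: input + output.
        return self.input_tokens + self.output_tokens


def totals_by_model(entries):
    """Sum input, output, and total tokens for each model."""
    totals = defaultdict(lambda: {"input": 0, "output": 0, "total": 0})
    for entry in entries:
        bucket = totals[entry.model]
        bucket["input"] += entry.input_tokens
        bucket["output"] += entry.output_tokens
        bucket["total"] += entry.total_tokens
    return dict(totals)


# Made-up sample data for illustration only.
entries = [
    TrafficEntry("agent-a", "model-x", 1200, 300),
    TrafficEntry("agent-b", "model-x", 800, 200),
    TrafficEntry("agent-a", "model-y", 500, 1500),
]
print(totals_by_model(entries))
```

Grouping on `entry.agent` instead of `entry.model` would give the agent-level view in the same way.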

Common cost drivers:

  • Long system prompts (SOUL.md, MEMORY.md loaded into every request)
  • Multi-turn conversations (previous messages re-sent each turn)
  • Large code outputs from code generation tasks
  • Reasoning models (internal thinking tokens count toward output)
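To see why long system prompts and multi-turn conversations compound, the following sketch estimates cumulative input tokens when every request re-sends the system prompt and all earlier messages. The function name and the token counts are illustrative assumptions, not measurements from any real agent.

```python
def cumulative_input_tokens(system_tokens, user_tokens, reply_tokens, turns):
    """Total input tokens over a conversation in which each request
    re-sends the system prompt and all previous messages."""
    total = 0
    history = 0  # tokens from earlier user messages and replies
    for _ in range(turns):
        # This turn's prompt: system prompt + prior history + new user message.
        total += system_tokens + history + user_tokens
        # After the turn, the new message and its reply join the history.
        history += user_tokens + reply_tokens
    return total


# Assumed sizes: 2000-token system prompt, 100-token messages, 400-token replies.
for turns in (1, 5, 10, 20):
    print(turns, cumulative_input_tokens(2000, 100, 400, turns))
```

Under these assumptions the history term grows quadratically with the number of turns, so going from 10 to 20 turns roughly triples cumulative input tokens rather than doubling them.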

Optimization Tips

  • Right-size models: use lighter models for simple tasks, larger models for complex reasoning
  • Monitor reasoning tokens on models configured with reasoning: true, since internal thinking tokens count toward output
  • Watch for runaway agents stuck in loops burning tokens rapidly
  • Set each model's context window to fit its workload, since an oversized window lets more history accumulate in every request
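One way to watch for runaway agents is to flag any agent whose token use inside a short time window exceeds a limit. The sketch below assumes you can export usage as `(timestamp_seconds, agent, total_tokens)` tuples; that export format, the function name, and the thresholds are all hypothetical rather than features of the dashboard.

```python
from collections import defaultdict


def find_runaway_agents(events, window_seconds, token_limit):
    """Return agents whose token use within any sliding window exceeds the limit.

    events: iterable of (timestamp_seconds, agent, total_tokens) tuples.
    window_seconds must be positive.
    """
    by_agent = defaultdict(list)
    for ts, agent, tokens in sorted(events):
        by_agent[agent].append((ts, tokens))

    runaway = set()
    for agent, items in by_agent.items():
        start = 0    # index of the oldest entry still inside the window
        running = 0  # tokens used inside the current window
        for ts, tokens in items:
            running += tokens
            # Drop entries that fall outside the window ending at ts.
            while items[start][0] <= ts - window_seconds:
                running -= items[start][1]
                start += 1
            if running > token_limit:
                runaway.add(agent)
                break
    return sorted(runaway)


# Made-up sample: agent-b uses 12,000 tokens within 20 seconds.
events = [
    (0, "agent-a", 500), (30, "agent-a", 600), (400, "agent-a", 500),
    (0, "agent-b", 4000), (10, "agent-b", 4000), (20, "agent-b", 4000),
]
print(find_runaway_agents(events, window_seconds=60, token_limit=10000))
```

A sliding window is used here instead of a lifetime total so that a busy but healthy long-running agent is not flagged; the window and limit would need tuning to your own traffic.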