Observability

The observability stack the analyzers ride on

TokenJam isn't only a cost-optimization tool: it's a full local-first observability stack for AI agents, with traces, drift detection, alerts, budgets, schema validation, and a web UI. The four analyzers ride on top of this substrate. Free, open-source, MIT-licensed.

What's included

Six capabilities. All shipping in the open-source CLI today.

  • Real-time cost tracking

    Every LLM call is priced as it happens using TokenJam's local pricing table. Spend breaks down by agent, model, session, and tool. Budget alerts fire before you hit the limit, not after.

    tj cost --since 7d
  • Trace waterfalls

    Every session is captured as a full OpenTelemetry span tree. See which tools ran, in what order, with what arguments, and how long each step took, in the local web UI or the CLI.

    tj traces
  • Sensitive-action alerts

    Configure any tool call as a sensitive action (send_email, delete_file, submit_form) and get notified instantly. 13 alert types, 6 channels: ntfy push (free, phone-friendly), Discord, Telegram, webhook, file, stdout.

    tj alerts
  • Behavioral drift detection

    TokenJam builds a Z-score baseline from your agent's real behavior (token counts, tool sequences, output shapes). When something drifts (a prompt tweak, a model update, a dependency bump), you get a drift_detected alert at session end. No LLM-on-LLM evaluation required.

    tj drift
  • Schema validation

    Declare a JSON Schema for any tool's output or let TokenJam infer one from a few sessions. Schema violations are caught at ingest and surface as schema_violation alerts.

    tj tools
  • Local web UI + REST API

    `tj serve` runs a local dashboard at 127.0.0.1:7391 with status, traces, cost breakdown, alerts, budget, and drift. Prometheus metrics at /metrics. No cloud, no signup, runs entirely on your machine.

    tj serve
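To make the Z-score baseline from the drift-detection item above concrete, here is a minimal sketch of the general technique. It is not TokenJam's implementation: the function names, the single token-count metric, and the threshold of 3 standard deviations are all assumptions chosen for illustration.

```python
import math
from statistics import mean, stdev


def z_score(baseline, observed):
    """Return how many sample standard deviations `observed` is from the baseline mean."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline sessions")
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any deviation at all is infinitely unusual.
        if observed == mu:
            return 0.0
        return math.copysign(math.inf, observed - mu)
    return (observed - mu) / sigma


def is_drift(baseline, observed, threshold=3.0):
    """Flag a session whose metric is more than `threshold` deviations from baseline.

    The default threshold is an illustrative assumption, not a TokenJam setting.
    """
    return abs(z_score(baseline, observed)) > threshold


# Hypothetical per-session token counts collected before a prompt change.
baseline_tokens = [1000, 1100, 900, 1050, 950]

normal_session = is_drift(baseline_tokens, 1020)   # close to the baseline mean
drifted_session = is_drift(baseline_tokens, 1600)  # far above the baseline mean
```

Because the check is plain arithmetic over recorded sessions, it needs no second model to judge the output, which is the point of the "no LLM-on-LLM evaluation" claim.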

It's also what powers the analyzers

Each analyzer needs a particular shape of telemetry. The observability stack collects it, so the analyzers can run against your real usage instead of a synthetic benchmark.

Local-first, by design.

Your spans contain prompts, completions, tool inputs, and customer data. Shipping that to a SaaS observability vendor is a data-egress decision most teams aren't ready to make. TokenJam captures, stores, and analyzes everything on your machine: DuckDB on disk, a REST API on 127.0.0.1, and no telemetry leaving your machine by default.

When you do want to forward telemetry (to Grafana, Datadog, or an OTLP collector), `tj export` ships it on your terms.
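For a quick local check that forwards nothing, the Prometheus endpoint that `tj serve` exposes at 127.0.0.1:7391/metrics can be read by a short script. The sketch below assumes the endpoint returns the standard Prometheus text exposition format. The helper names are hypothetical, and the metric names in the sample are placeholders, not TokenJam's actual metric names.

```python
import urllib.request

# Address and path taken from the dashboard description above.
METRICS_URL = "http://127.0.0.1:7391/metrics"


def parse_prometheus_text(text):
    """Parse Prometheus text-format samples into {series: value}.

    Comment lines (# HELP, # TYPE) are skipped and optional timestamps ignored.
    """
    samples = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        close = line.rfind("}")
        if close != -1:
            # Labelled series: keep everything through the closing brace as the key.
            series, rest = line[: close + 1], line[close + 1 :].split()
        else:
            parts = line.split()
            series, rest = parts[0], parts[1:]
        if rest:
            samples[series] = float(rest[0])
    return samples


def fetch_metrics(url=METRICS_URL, timeout=2.0):
    """Fetch and parse metrics from the local server (requires `tj serve` to be running)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_prometheus_text(resp.read().decode("utf-8"))


# Placeholder payload with made-up metric names, used to exercise the parser offline.
sample = (
    "# HELP example_cost_usd_total Placeholder metric\n"
    "# TYPE example_cost_usd_total counter\n"
    'example_cost_usd_total{agent="support bot"} 1.25\n'
    "example_sessions_total 7 1700000000000\n"
)
parsed = parse_prometheus_text(sample)
```

With the dashboard running, calling `fetch_metrics()` returns the same kind of dictionary for whatever series the server actually publishes. The request never leaves the loopback interface.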