TokenJam Reuse

Stop paying to re-plan work your agent already figured out.

Find the planning your agent keeps redoing, and that you keep paying for.

The problem

Most agent work has a shape. "Cut a patch release" runs the same five steps every time; only the version number changes. The first time, the agent earns its keep by reasoning out the plan. Every time after that, you pay full price to regenerate a plan that did not change.

Across a month of real traffic, that repetition adds up. The published research on agent plan caching reports that reusing plans on repetitive workloads recovers roughly half of agent cost while holding task success near baseline. Reuse is how you find that money in your own traces.

How it differs

Reuse is not Cache.

Cache finds prompt prefixes the provider can serve cheaper; the call still happens, you just pay less for the input. Reuse finds whole plans you can skip regenerating; the planning call does not happen at all.

Reuse is not Script.

Script finds work the agent should not be doing: fixed sequences with no decisions, better off as a shell script. Reuse finds work the agent should be doing but keeps re-planning from scratch: same skeleton, different details each time. Script removes the agent. Reuse keeps it and stops it from re-planning.

How it works

Cluster by plan shape. Reuse groups your completed sessions by the structure of the work (the tool sequence and the plan skeleton), using the same machinery Script uses to find deterministic clusters.
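
As a rough illustration of clustering by plan shape, the sketch below groups sessions whose tool calls occur in the same order. The session format (`id`, `tools`) and the function name `cluster_by_tool_sequence` are assumptions made for this example, not TokenJam's actual data model, and exact-match grouping stands in for whatever similarity measure the real clustering uses.

```python
from collections import defaultdict


def cluster_by_tool_sequence(sessions):
    """Group session ids by their exact, ordered tool sequence."""
    clusters = defaultdict(list)
    for session in sessions:
        key = tuple(session["tools"])  # tuples are hashable, lists are not
        clusters[key].append(session["id"])
    return dict(clusters)


# Hypothetical sessions: two patch releases and one unrelated bug fix.
sessions = [
    {"id": "s1", "tools": ["read_changelog", "bump_version", "run_tests", "tag", "push"]},
    {"id": "s2", "tools": ["read_changelog", "bump_version", "run_tests", "tag", "push"]},
    {"id": "s3", "tools": ["grep", "edit_file", "run_tests"]},
]
clusters = cluster_by_tool_sequence(sessions)
```

Here `s1` and `s2` land in the same cluster because their tool sequences match, while `s3` forms a cluster of its own.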

Find the repeats. Within each cluster it isolates the planning portion and measures what stays the same across runs versus what varies. A stable skeleton with changing parameters is a reuse candidate.
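
One simple way to picture "what stays the same versus what varies" is a token-by-token comparison of the plans in a cluster. The sketch below is a naive version under stated assumptions: plans are plain strings, they align only when they have the same number of whitespace-separated tokens, and the `{paramN}` placeholder syntax is invented for this example.

```python
def extract_skeleton(plans):
    """Return (skeleton_tokens, stable_fraction), or None if the plans do not align.

    Assumption: plans are compared position by position, so this naive
    version only handles plans with the same number of tokens.
    """
    rows = [plan.split() for plan in plans]
    if not rows or not rows[0] or len({len(row) for row in rows}) != 1:
        return None

    skeleton = []
    stable_count = 0
    slot = 0
    for column in zip(*rows):
        if len(set(column)) == 1:
            skeleton.append(column[0])  # same token in every run
            stable_count += 1
        else:
            skeleton.append("{param%d}" % slot)  # token varies across runs
            slot += 1
    return skeleton, stable_count / len(skeleton)


# Hypothetical planning text from two runs in the same cluster.
plans = [
    "bump version to 1.4.2 then tag v1.4.2",
    "bump version to 1.4.3 then tag v1.4.3",
]
result = extract_skeleton(plans)
```

A high stable fraction with a handful of parameter slots is the pattern described above: a fixed skeleton whose details change from run to run.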

Quantify and export. It prices what you spent regenerating each repeated plan and shows the recoverable figure, then exports the skeletons as templates you can review and reuse however you like.
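
The pricing step comes down to simple arithmetic. The sketch below assumes a flat per-token price and treats every run after the first as repeat spend; the function name, the result keys, and the price used in the example are placeholders, not TokenJam's pricing model or any provider's real rate. The repeat figure is an upper bound, since serving a cached plan is unlikely to be free.

```python
def regeneration_cost(sessions, planning_tokens, usd_per_million_tokens):
    """Price the planning spend for one cluster.

    total_usd:  what all runs spent on planning
    repeat_usd: what the runs after the first one spent (upper bound on savings)
    """
    per_session = planning_tokens * usd_per_million_tokens / 1_000_000
    return {
        "total_usd": sessions * per_session,
        "repeat_usd": max(sessions - 1, 0) * per_session,
    }


# Placeholder inputs: 10 sessions, 2,000 planning tokens each, $10 per million tokens.
cost = regeneration_cost(sessions=10, planning_tokens=2_000, usd_per_million_tokens=10.0)
```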

Confidence levels

Every finding carries an explicit confidence level. TokenJam never claims a smaller model would have produced an identical answer; it shows the candidates with evidence, and you decide what to apply.

Level 1 (OSS): Structural

Ships today. Clusters your sessions by plan shape, isolates the planning portion, and quantifies what you spent regenerating each repeated skeleton. Exports the templates via `tj report --reuse` so you can review and reuse them by hand.

Level 2 (Pro): Replay-validated

Samples a flagged cluster and replays the planning step through a lightweight adapter to confirm the cached skeleton still produces an equivalent tool-call sequence. Recoverable estimates now carry evidence from your own data, not just a clustering heuristic.
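
To make the replay check concrete, the sketch below samples sessions from a cluster, replays each one through a stand-in function, and reports how often the replayed tool sequence matches the original. Everything here is hypothetical: the session format, the `replay_planning` callback, and the choice to define "equivalent" as the same tool names in the same order with arguments ignored.

```python
import random


def tool_names(calls):
    """Reduce a list of tool calls to the ordered list of tool names."""
    return [call["tool"] for call in calls]


def replay_agreement(cluster, replay_planning, sample_size=5, seed=0):
    """Fraction of sampled sessions whose replayed tool sequence matches the original."""
    rng = random.Random(seed)  # seeded so the sample is reproducible
    sample = rng.sample(cluster, min(sample_size, len(cluster)))
    if not sample:
        return 0.0
    matches = sum(
        1
        for session in sample
        if tool_names(replay_planning(session)) == tool_names(session["calls"])
    )
    return matches / len(sample)


def fake_replay(session):
    """Stand-in for a replay adapter: same tools, fresh arguments."""
    return [{"tool": call["tool"], "args": {}} for call in session["calls"]]


# Hypothetical cluster of eight sessions that differ only in the version argument.
cluster = [
    {
        "id": f"s{i}",
        "calls": [
            {"tool": "bump_version", "args": {"version": f"1.4.{i}"}},
            {"tool": "run_tests", "args": {}},
        ],
    }
    for i in range(8)
]
agreement = replay_agreement(cluster, fake_replay, sample_size=5)
```

An agreement rate near 1.0 on a sample supports the structural finding; a low rate suggests the skeleton only looked stable.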

Level 3 (Pro): User-validated

Tracks which served plans you accepted and which you let regenerate. Promotes high-confidence repeats to default-served, with a full audit log and an instant kill switch. Powers "Reuse Live": a local plan cache that recognizes a repeated task and serves the adapted plan automatically, scoped to your codebase.
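
The promotion logic can be pictured as a counter plus a threshold. The sketch below is illustrative only: the `PlanStats` class, the `serve_by_default` function, and the thresholds (at least 10 served plans, at least 90% accepted) are assumptions chosen for the example, not documented TokenJam behavior.

```python
from dataclasses import dataclass


@dataclass
class PlanStats:
    accepted: int = 0     # served plan was kept by the user
    regenerated: int = 0  # user let the agent re-plan instead

    def record(self, was_accepted):
        if was_accepted:
            self.accepted += 1
        else:
            self.regenerated += 1

    @property
    def served(self):
        return self.accepted + self.regenerated

    @property
    def acceptance_rate(self):
        return self.accepted / self.served if self.served else 0.0


def serve_by_default(stats, kill_switch_on, min_served=10, min_rate=0.9):
    """Promote a plan only with enough evidence, and never while the kill switch is on."""
    if kill_switch_on:
        return False
    return stats.served >= min_served and stats.acceptance_rate >= min_rate


# Hypothetical history: 19 accepted plans and 1 regeneration.
stats = PlanStats()
for outcome in [True] * 19 + [False]:
    stats.record(outcome)
promoted = serve_by_default(stats, kill_switch_on=False)
```

Checking the kill switch first means a single flag overrides any amount of accumulated evidence.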

Example output

Verbatim from a real run against a real Claude Code project. No screenshots, no cherry-picks.

tj optimize — reuse

Repeated planning detected (last 30d):
  • Cluster "patch-release": 31 sessions share one plan skeleton
    (read changelog → bump version → run tests → tag → push)
  • Planning portion: ~2,100 tokens/session on claude-opus-4-7
  • The skeleton is identical across all 31 runs; only the version
    string and date change.
  • You paid to generate this plan 31 times.
  • Estimated recoverable: ~$54/mo if served from a plan cache
    (or ~$61/mo if converted to a slash command)

  ⚠ Structural analysis only. A matching skeleton does not guarantee
    the plans were interchangeable. Review the exported templates
    in `tj report --reuse` before reusing them.

What you do with it

Recommendations land in your existing tools: the terminal, an MCP-capable agent, or an exportable config.

CLI: `tj optimize reuse`
Report: `tj report --reuse` (exports each skeleton as a reviewable template)
MCP: `find_reuse_candidates` (query from inside any MCP-capable agent)

The research behind it

Agentic Plan Caching (NeurIPS 2025)

Extracting reusable plan templates from completed agent runs and adapting them with a small model recovers roughly half of agent cost while holding task success near baseline. Reuse brings that finding to your own traces.
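
To show what adapting a template means in the simplest case, the sketch below fills a stored skeleton with new parameters. The research describes adaptation with a small model; this example swaps in plain string substitution from Python's standard library as a stand-in. The template wording and parameter names are hypothetical, loosely based on a five-step patch release.

```python
from string import Template

# Hypothetical exported skeleton; $version and $date are the parts that vary.
PATCH_RELEASE = Template(
    "1. Read the changelog entries since the last tag\n"
    "2. Bump the version to $version\n"
    "3. Run the test suite\n"
    "4. Tag the release as v$version dated $date\n"
    "5. Push the tag"
)


def adapt_plan(template, params):
    """Fill a plan skeleton; raises KeyError if a parameter is missing."""
    return template.substitute(params)


plan = adapt_plan(PATCH_RELEASE, {"version": "1.4.3", "date": "2025-01-15"})
```

Using `substitute` rather than `safe_substitute` makes a missing parameter fail loudly instead of silently leaving a placeholder in the served plan.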

Reuse is in the open-source CLI. Install once, analyze everything.
