TokenJam Trim

Your system prompt grew over six months. Half of it isn't doing work.

Identify token waste in your system prompts.

The problem

Prompts accumulate. Every new edge case adds an instruction. Every project picks up a CLAUDE.md that gets longer. Tool schemas repeat across calls.

The actual signal in a 4,000-token system prompt might be 800 tokens of real instructions and 3,200 tokens of historical scar tissue. You pay for the whole thing on every call. Trim runs significance analysis on your captured prompts and shows you which sections carry the load and which are dead weight.

How it works

Trim runs LLMLingua-2's token-classification model (BERT-class, MIT-licensed, runs locally on CPU) over your captured prompts. Each token gets a significance score: the classifier's estimate of how likely that token is to be worth keeping if the prompt were compressed.

Sections of consistently low-significance tokens get flagged as bloat candidates. The output is a highlighted view of your prompt with high-significance regions in bold and low-significance regions dimmed. You decide what to remove; Trim never edits your prompts at runtime.
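As a rough illustration of the flagging step, the sketch below groups per-token scores by section and flags sections where nearly every token scores low. It is not Trim's implementation: the `Section` type, the `bloat_candidates` function, the 0.3 threshold, the 0.8 fraction, and the scores themselves are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Section:
    title: str
    scores: list[float]  # hypothetical per-token significance scores in [0, 1]


def bloat_candidates(sections, threshold=0.3, min_low_fraction=0.8):
    """Return titles of sections where most tokens score below `threshold`.

    Both cutoffs are illustrative assumptions, not values Trim is known to use.
    """
    flagged = []
    for section in sections:
        if not section.scores:
            continue  # nothing to judge in an empty section
        low = sum(1 for score in section.scores if score < threshold)
        if low / len(section.scores) >= min_low_fraction:
            flagged.append(section.title)
    return flagged


# Made-up scores for three sections of a system prompt.
sections = [
    Section("Role", [0.9, 0.8, 0.7, 0.85]),
    Section("Legacy notes", [0.1, 0.2, 0.05, 0.15, 0.25]),
    Section("Error handling", [0.2, 0.9, 0.1, 0.8]),
]

print(bloat_candidates(sections))  # ['Legacy notes']
```

A section with a mix of high and low scores, like "Error handling" here, is not flagged as a whole; only a consistently low-scoring section is treated as a candidate.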

Confidence levels

Every finding carries an explicit confidence level. TokenJam never claims a smaller model would have produced an identical answer; it shows the candidates with evidence, and you decide what to apply.

Level 1 OSS

Structural

Token significance is mathematical, not a quality judgment. We recommend; you trim by hand. We never auto-compress prompts at runtime.

Example output

Verbatim from a real run against a real Claude Code project. No screenshots, no cherry-picks.

tj optimize --include-bloat
Prompt bloat detected in claude-code-myproj:
  • Your CLAUDE.md is 4,213 tokens (up 38% in 30 days)
  • Section "Coding conventions > Error handling" appears identically
    in 91 of 247 sessions (1,108 tokens × 91 = ~100K repeated tokens)
  • Significance analysis suggests ~340 of those 1,108 tokens carry
    the signal; the rest could be trimmed
  • Estimated cost: ~$8.50/mo at current usage on Sonnet

  Detail: open `tj report --bloat claude-code-myproj` to see the
  highlighted prompt with high-significance tokens bold,
  low-significance dimmed.
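The token counts in the report above can be checked by hand. The short sketch below reproduces the arithmetic using only the figures printed in the report; the variable names are made up for the example, and the dollar estimate is left out because it depends on pricing and usage details the report does not state.

```python
# Figures taken from the report; ~340 is the report's own approximation.
section_tokens = 1_108
sessions_with_section = 91
signal_tokens = 340

# Tokens spent re-sending the identical section across sessions.
repeated = section_tokens * sessions_with_section

# Tokens that could be trimmed if only the signal-carrying tokens were kept.
trimmable_per_session = section_tokens - signal_tokens
trimmable_total = trimmable_per_session * sessions_with_section

print(repeated)               # 100828, the "~100K repeated tokens"
print(trimmable_per_session)  # 768
print(trimmable_total)        # 69888
```

On these numbers, roughly 70K of the ~100K repeated tokens are trim candidates, subject to your own review of the highlighted prompt.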

What you do with it

Recommendations land in your existing tools: the terminal, an MCP-capable agent, or an exportable config.

  • CLI
    tj optimize --include-bloat
  • Report
    tj report --bloat <agent_id>

    opens a local HTML file with the highlighted prompt

  • MCP
    surfaces in get_optimize_report when content capture is enabled

The research behind it

  • LLMLingua-2

    Microsoft Research — ACL 2024

    Token classification via GPT-4 distillation. 3–6× faster than LLMLingua-1. We use the same scoring mechanism for detection only; the editing decision stays with you.

Trim is in the open-source CLI. Install once, analyze everything.