Python SDK
Instrument any Python agent with provider patches, framework patches, and the @watch decorator.
The Python SDK works in two ways. Provider patches intercept LLM API calls directly (framework-agnostic). Framework patches instrument higher-level abstractions in LangChain, CrewAI, AutoGen, and similar tools. Both can be combined.
Install
pip install tokenjam
tj onboard # creates config, generates ingest secret
tj doctor # verifies your setup
The @watch decorator
Wrap your agent entry point so every call within it becomes a span tree:
from tokenjam.sdk import watch
from tokenjam.sdk.integrations.anthropic import patch_anthropic
patch_anthropic() # auto-intercepts all Anthropic API calls
@watch(agent_id="my-agent")
def run(task: str) -> str:
    # your agent code, nothing else to change
    ...
@watch opens a session, attributes every nested LLM/tool call to it, and closes the session on return. Combined with a provider patch, you get cost + tokens + tool calls per session with no instrumentation code in the body.
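Put together, a minimal agent looks like this (a sketch: the agent id, model name, and prompt are placeholders; apart from the two tokenjam imports, everything is standard anthropic client code):

import anthropic

from tokenjam.sdk import watch
from tokenjam.sdk.integrations.anthropic import patch_anthropic

patch_anthropic()  # apply before the client makes any calls
client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

@watch(agent_id="summarizer")  # agent id is a placeholder
def run(task: str) -> str:
    # Intercepted by the patch and attributed to this run()'s session.
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=[{"role": "user", "content": task}],
    )
    return response.content[0].text

run("Summarize the quarterly report.")  # one session, one LLM span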
Provider patches
Intercept at the API level. Framework-agnostic.
from tokenjam.sdk.integrations.anthropic import patch_anthropic # Anthropic
from tokenjam.sdk.integrations.openai import patch_openai # OpenAI
from tokenjam.sdk.integrations.gemini import patch_gemini # Google Gemini
from tokenjam.sdk.integrations.bedrock import patch_bedrock # AWS Bedrock
from tokenjam.sdk.integrations.litellm import patch_litellm # LiteLLM (100+ providers)
patch_litellm() covers all providers LiteLLM routes to (OpenAI, Anthropic, Bedrock, Vertex, Cohere, Mistral, Ollama, etc.). If you use LiteLLM, you don’t need individual patches.
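To illustrate (a sketch: the model identifiers are standard LiteLLM strings and the agent id is a placeholder):

import litellm

from tokenjam.sdk import watch
from tokenjam.sdk.integrations.litellm import patch_litellm

patch_litellm()  # one patch covers every provider LiteLLM routes to

@watch(agent_id="two-provider-demo")
def run(task: str) -> str:
    # Both calls are recorded, each priced for its own provider.
    draft = litellm.completion(
        model="gpt-4o-mini",  # routed to OpenAI
        messages=[{"role": "user", "content": task}],
    )
    review = litellm.completion(
        model="anthropic/claude-sonnet-4-20250514",  # routed to Anthropic
        messages=[{"role": "user", "content": draft.choices[0].message.content}],
    )
    return review.choices[0].message.content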
OpenAI-compatible providers (Groq, Together, Fireworks, xAI, Azure OpenAI) work via patch_openai(base_url=...).
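For example, with Groq (a sketch: the base URL is Groq's published OpenAI-compatible endpoint and the model name is a current Groq model; substitute your provider's values):

from openai import OpenAI

from tokenjam.sdk.integrations.openai import patch_openai

patch_openai(base_url="https://api.groq.com/openai/v1")

client = OpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key="...",  # your provider's API key
)
resp = client.chat.completions.create(
    model="llama-3.3-70b-versatile",
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)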
Framework patches
Instrument the framework’s own abstractions:
from tokenjam.sdk.integrations.langchain import patch_langchain # BaseLLM + BaseTool
from tokenjam.sdk.integrations.langgraph import patch_langgraph # CompiledGraph
from tokenjam.sdk.integrations.crewai import patch_crewai # Task + Agent
from tokenjam.sdk.integrations.autogen import patch_autogen # ConversableAgent
from tokenjam.sdk.integrations.llamaindex import patch_llamaindex # Native OTel
from tokenjam.sdk.integrations.openai_agents_sdk import patch_openai_agents # Native OTel
from tokenjam.sdk.integrations.nemoclaw import watch_nemoclaw # NemoClaw Gateway
See the full supported frameworks list for status and notes per integration.
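Framework and provider patches can be combined: the framework patch captures the chain and tool structure, while the provider patch intercepts the underlying API calls. A sketch with LangChain (assumes langchain-openai is installed; the agent id and prompt are placeholders):

from langchain_openai import ChatOpenAI

from tokenjam.sdk import watch
from tokenjam.sdk.integrations.langchain import patch_langchain
from tokenjam.sdk.integrations.openai import patch_openai

patch_langchain()  # instruments BaseLLM + BaseTool
patch_openai()     # intercepts the underlying OpenAI API calls

@watch(agent_id="langchain-demo")
def run(question: str) -> str:
    llm = ChatOpenAI(model="gpt-4o-mini")
    return llm.invoke(question).content

run("What does this SDK record per session?")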
What gets captured
Every patched call records:
- Provider, model, and model version
- Input/output tokens
- Cost (priced at call time using pricing.toml)
- Latency
- Tool calls invoked during the response
- Errors and retries
By default, prompt and completion content are not captured. Set [capture] flags in your config if you want them stored locally. See Configuration.
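For example (a hypothetical sketch only; the actual key names live in the Configuration reference):

[capture]
# Hypothetical keys for illustration — check Configuration for the real names.
prompts = true        # store prompt content locally
completions = true    # store completion content locally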