#observability

7 posts

Jun 1, 2026 · 8 min read

Reddit is 40% of your agent's retrieval surface

What 150K LLM citations tell builders about prompt-time grounding, eval coverage, and the source biases their agents inherit by default.
May 25, 2026 · 14 min read

Watching Claude Code with OTel: what Cursor and /cost won't show you

Claude Code ships a real OpenTelemetry pipe. Cursor doesn't. /cost is per-session and read-only. Here's what you can do with the wire, what each surface actually emits, and the failure modes none of the built-in views catch.
May 21, 2026 · 13 min read

LangSmith costs $39/seat. And 10.7x that in real TCO. What self-hosted alternatives actually cost in 2026.

A pricing teardown of LangSmith (and the new SmithDB / LangSmith Engine launch), Langfuse self-host, and a local-first DuckDB alternative. Real numbers, real config, real cost-of-running.
May 18, 2026 · 13 min read

How to monitor Claude Code: a practical guide for indie devs running it unsupervised

A step-by-step guide to monitoring Claude Code on your own laptop. Turn on Anthropic's OTel telemetry, route the spans somewhere useful, and wire up alerts that fire while the agent is still running.
May 17, 2026 · 15 min read

Behavioral drift detection for AI agents

A technical deep-dive on detecting when an agent's behavior wanders off its baseline, using Z-scores on token / duration / tool-count distributions and Jaccard similarity on tool sequences, all run locally over your own session history.
May 10, 2026 · 7 min read

What is OpenTelemetry, and why does it matter for AI agents?

OpenTelemetry, OTLP, and the GenAI semantic conventions: how the CNCF observability standard is becoming the lingua franca for AI agent telemetry.
May 9, 2026 · 8 min read

What is agent observability?

How AI agent observability works: capturing tool calls, token costs, traces, and behavioral patterns at production scale.