Welcome to TraceLM

TraceLM gives you full visibility into agent behavior, not just prompt/response logs. It is built for reliability-first observability:
  • Full task timelines across traces and tool calls
  • Tool-call failure visibility (explicit, semantic, silent)
  • Deadlock/loop detection with severity
  • Context-loss detection across conversations
  • Unnecessary tool-call detection via feedback loops

Quick Start

Get to your first reliability signal in under 5 minutes.

Python SDK

OpenAI-compatible client with task and conversation tracking.

TypeScript SDK

Type-safe SDK with the same reliability workflows.

API Reference

Integrate directly over HTTP with clear endpoint contracts.

Reliability-First Workflow

1. Instrument

Route model traffic through TraceLM with an SDK or direct API calls.

2. Group Execution

Use task and conversation IDs to capture complete execution cycles end to end.

3. Run Detection

Trigger detection with task.complete() or the /api/v1/tasks/{task_id}/detect endpoint.

4. Review Reliability Signals

Review loops, failures, context-loss, and unnecessary-tool patterns.
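
For the direct-HTTP route to the detection step, the request might be assembled roughly as follows. This is a minimal sketch, not the documented client: only the path template comes from the workflow above, while the base URL, the POST method, and the Bearer-token header are assumptions that should be checked against the API Reference. The request is built but never sent.

```python
from urllib.parse import quote
from urllib.request import Request

# Path template taken from the "Run Detection" step.
DETECT_PATH = "/api/v1/tasks/{task_id}/detect"


def build_detect_request(base_url: str, task_id: str, api_key: str) -> Request:
    """Build (but do not send) a detection request for one task.

    The HTTP method and auth scheme are assumptions, not documented facts.
    """
    # Percent-encode the task ID so it stays a single path segment.
    path = DETECT_PATH.format(task_id=quote(task_id, safe=""))
    return Request(
        url=base_url.rstrip("/") + path,
        method="POST",  # assumed method
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
    )


# Hypothetical host, task ID, and key, for illustration only.
req = build_detect_request("https://tracelm.example", "task_123", "YOUR_API_KEY")
print(req.get_method(), req.full_url)
```

Keeping URL construction in one helper means the task ID is always encoded the same way, whichever HTTP library ends up sending the request.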

What TraceLM Adds

Execution Timeline

See ordered trace/tool/loop events via /api/v1/tasks/{task_id}/timeline.
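
Once a timeline has been fetched, a quick per-kind tally can show where a task spent its steps. The sketch below assumes the response decodes to a list of dictionaries with a "type" field holding values such as trace, tool, or loop; that payload shape and the field names are hypothetical and are not specified on this page.

```python
from collections import Counter
from typing import Iterable, Mapping


def summarize_timeline(events: Iterable[Mapping[str, object]]) -> Counter:
    """Count timeline events per kind; events without a type count as 'unknown'."""
    return Counter(str(event.get("type", "unknown")) for event in events)


# Hypothetical sample payload; the real response shape may differ.
sample_events = [
    {"type": "trace", "seq": 1},
    {"type": "tool", "seq": 2},
    {"type": "tool", "seq": 3},
    {"type": "loop", "seq": 4},
]

summary = summarize_timeline(sample_events)
print(dict(summary))
```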

Failure Analytics

Track explicit, semantic, and silent tool failures per task.

Context Health

Track conversation-level context failures and health scores.

Feedback Loop

Label tool calls as unnecessary and monitor pattern confidence.