TraceLM includes a powerful verification engine that checks LLM outputs against multiple sources to detect potential hallucinations and ensure accuracy.

How It Works

1. Extract Claims: TraceLM parses LLM responses and extracts individual factual claims that can be verified.

2. Run Verifiers: Each claim is checked against enabled verifiers (knowledge base, web search, etc.).

3. Gather Evidence: Evidence supporting or contradicting each claim is collected.

4. Calculate Trust Score: A trust score (0-100) is calculated based on verification results.
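
TraceLM computes the score server-side, but it can help to see how verdicts roll up into a single number. A minimal sketch, assuming a confidence-weighted aggregation where supported claims raise the score, contradicted claims lower it, and unverifiable claims stay neutral; the weighting here is illustrative, not TraceLM's actual formula:

# Illustrative only: a toy roll-up of claim verdicts into a 0-100 score.
# TraceLM's real scoring may weight verifiers and claims differently.
def toy_trust_score(claims: list[dict]) -> float:
    if not claims:
        return 50.0  # no verifiable claims: treat as neutral (assumption)
    total = 0.0
    for claim in claims:
        if claim["verdict"] == "supported":
            total += claim["confidence"]   # supported claims push the score up
        elif claim["verdict"] == "contradicted":
            total -= claim["confidence"]   # contradicted claims pull it down
        # "unverifiable" claims are treated as neutral in this sketch
    # Map the average from [-1, 1] onto the 0-100 trust-score range
    return round(50 * (1 + total / len(claims)), 1)

print(toy_trust_score([
    {"verdict": "supported", "confidence": 0.98},
    {"verdict": "contradicted", "confidence": 0.88},
    {"verdict": "unverifiable", "confidence": 0.50},
]))  # ~51.7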

Verifiers

Knowledge Base

Check claims against your uploaded facts and domain knowledge.
  • Best for: Domain-specific verification
  • Speed: Fast
  • Accuracy: High (when facts are available)

Web Search

Search the web for evidence supporting or contradicting claims.
  • Best for: Current events, general facts
  • Speed: Medium
  • Accuracy: Good

Multi-Model

Cross-reference claims with other LLMs for consistency checking.
  • Best for: Complex claims, reasoning validation
  • Speed: Slow
  • Accuracy: Good

Citation Validator

Validate cited sources by fetching and checking them.
  • Best for: Responses with citations
  • Speed: Medium
  • Accuracy: High
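
The verifiers are complementary, so it is common to combine a fast verifier for routine checks with a slower one for high-stakes traces. A short sketch using the verify() call from the Quick Start below; "knowledge_base" and "web_search" are the identifiers shown there, while "multi_model" and "citation_validator" are assumed names, so confirm the exact identifiers in your verification settings:

# `tracelm` and `trace_id` come from the Quick Start examples below.
# NOTE: the "multi_model" and "citation_validator" identifiers are assumptions.
result = tracelm.verify(
    trace_id=trace_id,
    verifiers=["knowledge_base", "web_search", "multi_model", "citation_validator"],
)
print(result.trust_score)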

Trust Score

The trust score is a 0-100 value indicating confidence in the LLM output’s accuracy.
Score Range | Level    | Interpretation
80-100      | High     | Response is well-supported by evidence
60-79       | Medium   | Most claims verified, some uncertain
40-59       | Low      | Significant unverified or contradicted claims
0-39        | Very Low | Response likely contains hallucinations
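
These bands are guidance rather than hard rules; the right cutoff is an application decision (see Best Practices below). A minimal sketch of gating a response on its trust score, assuming the response object mirrors the OpenAI client's shape; the threshold and fallback message are application choices, not part of TraceLM:

TRUST_THRESHOLD = 80  # example cutoff for a high-stakes assistant (application choice)

# `response` is a chat completion returned with auto_verify=True (see Quick Start below).
verification = response.tracelm_verification
if verification and verification.trust_score < TRUST_THRESHOLD:
    # Fall back instead of showing a response that may contain unverified claims.
    answer = "I couldn't verify that answer confidently enough to share it."
else:
    answer = response.choices[0].message.content  # assumes an OpenAI-compatible response shape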

Claim Verdicts

Each extracted claim receives one of three verdicts:
Supported

Evidence was found that supports the claim. The claim is likely accurate.
{
  "claim_text": "Paris is the capital of France",
  "verdict": "supported",
  "confidence": 0.98
}

Contradicted

Evidence was found that contradicts the claim. The claim is likely a hallucination.
{
  "claim_text": "Elon Musk founded Tesla",
  "verdict": "contradicted",
  "confidence": 0.88
}

Unverifiable

Not enough evidence was found to either verify or contradict the claim.
{
  "claim_text": "The meeting is scheduled for Tuesday",
  "verdict": "unverifiable",
  "confidence": 0.50
}
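
In practice the contradicted claims are the ones to act on, since those are the likely hallucinations. A short sketch that pulls them out of a verification result, using the claim fields shown in the Quick Start below; the confidence attribute is assumed from the JSON examples above:

# `response` comes from a request made with auto_verify=True (see Quick Start below).
verification = response.tracelm_verification
if verification:
    flagged = [c for c in verification.claims if c.verdict == "contradicted"]
    for claim in flagged:
        # Surface likely hallucinations for review or automatic correction.
        print(f"Possible hallucination ({claim.confidence:.0%}): {claim.claim_text}")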

Quick Start

Automatic Verification

Enable verification for all requests:
from tracelm import TraceLM

tracelm = TraceLM(
    api_key="lt_your-key",
    openai_api_key="sk-your-key",
    auto_verify=True  # Enable for all requests
)

response = tracelm.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Tell me about Tesla's history"}]
)

# Access verification results
if response.tracelm_verification:
    print(f"Trust Score: {response.tracelm_verification.trust_score}")
    for claim in response.tracelm_verification.claims:
        print(f"  {claim.verdict}: {claim.claim_text}")

Manual Verification

Verify specific traces on demand:
# Get a trace ID from a previous request
trace_id = response.tracelm_trace_id

# Run verification
result = tracelm.verify(
    trace_id=trace_id,
    verifiers=["knowledge_base", "web_search"]
)

print(f"Trust Score: {result.trust_score}")

Knowledge Base

The knowledge base verifier is most effective when you provide domain-specific facts. You can manage your knowledge base from the TraceLM dashboard:
  • Create and manage facts
  • Bulk import from files
  • Organize by categories and sources
  • Configure verification settings
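
What makes a useful fact depends on your domain. A sketch of the kind of records worth curating, written as plain Python data; the field names (fact, category, source) are illustrative, and the actual bulk import format is whatever the dashboard expects:

# Hypothetical fact records for a support-bot knowledge base.
# Field names are assumptions, not TraceLM's import schema.
facts = [
    {
        "fact": "The premium plan includes 24/7 phone support.",
        "category": "pricing",
        "source": "pricing-page",
    },
    {
        "fact": "Refunds are available within 30 days of purchase.",
        "category": "policy",
        "source": "terms-of-service",
    },
]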

Best Practices

  • Use knowledge base for domain-specific applications
  • Use web search for current events and general knowledge
  • Use multi-model for complex reasoning tasks
  • Use citation validator when responses include references

The knowledge base verifier is only as good as your facts. Invest in building a comprehensive set of verified facts for your domain.

Verification can add latency. Consider running it asynchronously and displaying results when ready, rather than blocking the user experience (see the sketch below).

Define trust score thresholds for your use case. A customer support bot may need 80+ while a creative writing assistant may accept 60+.
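
One way to keep verification off the critical path is to return the model's answer immediately and run verify() in the background. A minimal sketch using a thread pool, reusing the client and verify() call from the Quick Start above; the low-score handling and the OpenAI-style response access are application assumptions:

from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)

def verify_in_background(trace_id):
    # Runs after the answer has already been shown to the user.
    result = tracelm.verify(trace_id=trace_id, verifiers=["knowledge_base", "web_search"])
    if result.trust_score < 60:
        # Application-specific handling: log, alert, or update the UI.
        print(f"Low trust score ({result.trust_score}) for trace {trace_id}")

# `tracelm` is the client constructed in the Quick Start above.
response = tracelm.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Tell me about Tesla's history"}],
)
answer = response.choices[0].message.content  # show this right away (OpenAI-compatible shape assumed)
executor.submit(verify_in_background, response.tracelm_trace_id)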

API Reference