Track LLM API calls, token usage, costs, and latency across OpenAI, Anthropic, and custom providers.

AI Monitoring

The AI monitoring module wraps your LLM clients to automatically track every API call — token usage, estimated costs, latency, and errors. It also provides session tracing to correlate multiple calls into a single agent timeline for replay and analysis.

npm install @codmir/sdk

Quick Start

import * as CodmirAI from '@codmir/sdk/ai';
import OpenAI from 'openai';
import Anthropic from '@anthropic-ai/sdk';

CodmirAI.init({
  dsn: process.env.CODMIR_DSN,
  trackTokenUsage: true,
  trackCosts: true,
  trackLatency: true,
});

// Wrap clients — all calls are now tracked automatically
const openai = CodmirAI.wrapOpenAI(new OpenAI());
const anthropic = CodmirAI.wrapAnthropic(new Anthropic());

// Use them as normal
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello' }],
});

// Check accumulated stats
const summary = CodmirAI.getAIUsageSummary();
console.log(`Total cost: $${summary.totalCost.toFixed(4)}`);
console.log(`Avg latency: ${summary.avgLatencyMs}ms`);

Wrapping Providers

OpenAI

import { wrapOpenAI } from '@codmir/sdk/ai';
import OpenAI from 'openai';

const openai = wrapOpenAI(new OpenAI());

// Every chat.completions.create call is now instrumented
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Explain closures in JavaScript' }],
});
// Breadcrumb recorded: openai/gpt-4o — 142 tokens

Anthropic

import { wrapAnthropic } from '@codmir/sdk/ai';
import Anthropic from '@anthropic-ai/sdk';

const anthropic = wrapAnthropic(new Anthropic());

const response = await anthropic.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Hello' }],
});
// Breadcrumb recorded: anthropic/claude-sonnet-4-20250514 — 89 tokens

Custom Providers

Use trackAICall for any provider not covered by the built-in wrappers.

import { trackAICall } from '@codmir/sdk/ai';

const result = await trackAICall(
  'ollama',
  { model: 'llama-3', messageCount: 1 },
  async () => {
    // myOllamaClient is a placeholder for your own client instance
    return await myOllamaClient.chat({ prompt: 'Hello' });
  }
);

Usage Statistics

Get Summary

import { getAIUsageSummary } from '@codmir/sdk/ai';

const summary = getAIUsageSummary();

console.log(`Total calls: ${summary.totalCalls}`);
console.log(`Input tokens: ${summary.totalInputTokens}`);
console.log(`Output tokens: ${summary.totalOutputTokens}`);
console.log(`Total cost: $${summary.totalCost.toFixed(4)}`);
console.log(`Avg latency: ${summary.avgLatencyMs}ms`);

// Breakdown by provider
console.log(`OpenAI calls: ${summary.byProvider.openai?.calls ?? 0}`);
console.log(`Anthropic errors: ${summary.byProvider.anthropic?.errors ?? 0}`);

// Breakdown by model
console.log(`GPT-4o cost: $${(summary.byModel['gpt-4o']?.cost ?? 0).toFixed(4)}`);
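
To make the shape of the summary concrete, here is a minimal sketch of how per-call records could be rolled up into totals and per-provider and per-model breakdowns. The `CallRecord` type and the `bump` and `summarize` helpers are hypothetical and not part of `@codmir/sdk`; only the summary field names follow the example above, and the SDK's actual aggregation may differ.

```typescript
// Hypothetical per-call record; the SDK's internal representation is not documented here.
interface CallRecord {
  provider: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
  cost: number;
  latencyMs: number;
  error: boolean;
}

interface GroupStats {
  calls: number;
  errors: number;
  cost: number;
}

interface UsageSummary {
  totalCalls: number;
  totalInputTokens: number;
  totalOutputTokens: number;
  totalCost: number;
  avgLatencyMs: number;
  byProvider: Record<string, GroupStats>;
  byModel: Record<string, GroupStats>;
}

// Add one call to a per-key bucket, creating the bucket on first use.
function bump(groups: Record<string, GroupStats>, key: string, call: CallRecord): void {
  const stats = groups[key] ?? { calls: 0, errors: 0, cost: 0 };
  stats.calls += 1;
  stats.errors += call.error ? 1 : 0;
  stats.cost += call.cost;
  groups[key] = stats;
}

function summarize(calls: CallRecord[]): UsageSummary {
  const summary: UsageSummary = {
    totalCalls: 0,
    totalInputTokens: 0,
    totalOutputTokens: 0,
    totalCost: 0,
    avgLatencyMs: 0,
    byProvider: {},
    byModel: {},
  };
  let totalLatencyMs = 0;
  for (const call of calls) {
    summary.totalCalls += 1;
    summary.totalInputTokens += call.inputTokens;
    summary.totalOutputTokens += call.outputTokens;
    summary.totalCost += call.cost;
    totalLatencyMs += call.latencyMs;
    bump(summary.byProvider, call.provider, call);
    bump(summary.byModel, call.model, call);
  }
  // Guard against division by zero when nothing has been recorded yet.
  summary.avgLatencyMs = calls.length > 0 ? totalLatencyMs / calls.length : 0;
  return summary;
}

const demo = summarize([
  { provider: 'openai', model: 'gpt-4o', inputTokens: 100, outputTokens: 50, cost: 0.00125, latencyMs: 400, error: false },
  { provider: 'openai', model: 'gpt-4o', inputTokens: 0, outputTokens: 0, cost: 0, latencyMs: 200, error: true },
  { provider: 'anthropic', model: 'claude-sonnet-4-20250514', inputTokens: 200, outputTokens: 100, cost: 0.0021, latencyMs: 600, error: false },
]);
console.log(`OpenAI calls: ${demo.byProvider['openai']?.calls ?? 0}`);
console.log(`Avg latency: ${demo.avgLatencyMs}ms`);
```

Because a provider or model with no recorded calls has no entry at all, reading a breakdown with `?.` and a `??` fallback, as the example above does, avoids errors on missing keys.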

Reset Statistics

import { resetAIUsageStats } from '@codmir/sdk/ai';

// Start a fresh tracking period
resetAIUsageStats();

Session Tracing

Sessions correlate multiple LLM calls into a single timeline. Every call made while a session is active is automatically attached. Use sessions to trace agent behavior end-to-end.

Start and End a Session

import * as CodmirAI from '@codmir/sdk/ai';

const session = CodmirAI.startSession({
  agent: 'support-bot',
  metadata: { userId: 'u_123', ticket: 'T-456' },
  tags: { team: 'support', env: 'production' },
});

// Calls made through the wrapped clients (see Quick Start) are now correlated under this session
await openai.chat.completions.create({ /* ... */ });
await anthropic.messages.create({ /* ... */ });

// End the session
const completed = CodmirAI.endSession();
console.log(`Session made ${completed.usage.totalCalls} calls`);
console.log(`Total cost: $${completed.usage.totalCost.toFixed(4)}`);

Decision Tracing

Record why the agent chose a specific action. These events appear in the session timeline alongside LLM calls.

CodmirAI.addSessionEvent('decision', {
  action: 'escalate_to_human',
  reason: 'confidence_below_threshold',
  confidence: 0.42,
});

CodmirAI.addSessionEvent('tool_call', {
  tool: 'search_tickets',
  query: 'login failed',
  results: 12,
});

Spans

Group related events within a session using spans (e.g., "handle_user_message", "run_tool_chain").

CodmirAI.startSessionSpan('handle_user_message', { messageId: 'msg_789' });

await openai.chat.completions.create({ /* ... */ });
CodmirAI.addSessionEvent('decision', { action: 'respond_directly' });

CodmirAI.endSessionSpan();
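
As a mental model for how events and spans relate, the following sketch keeps a flat event list plus a stack of open spans, and tags each event with the innermost span that was open when it was recorded. `SessionTimeline` and its methods are hypothetical names for illustration only; they are not the SDK's API, and the real session structure may be organized differently.

```typescript
interface TimelineEvent {
  type: string;
  data: Record<string, unknown>;
  span: string | null; // innermost open span when the event was recorded
  timestamp: number;
}

class SessionTimeline {
  private readonly events: TimelineEvent[] = [];
  private readonly openSpans: string[] = [];

  startSpan(name: string): void {
    this.openSpans.push(name);
  }

  // Closes the most recently opened span and returns its name, if any.
  endSpan(): string | undefined {
    return this.openSpans.pop();
  }

  addEvent(type: string, data: Record<string, unknown> = {}): void {
    // With no open span the lookup yields undefined, which becomes null.
    const span = this.openSpans[this.openSpans.length - 1] ?? null;
    this.events.push({ type, data, span, timestamp: Date.now() });
  }

  // Events recorded under the given span, in insertion order.
  eventsIn(span: string): TimelineEvent[] {
    return this.events.filter((event) => event.span === span);
  }

  all(): readonly TimelineEvent[] {
    return this.events;
  }
}

const timeline = new SessionTimeline();
timeline.addEvent('tool_call', { tool: 'search_tickets' });
timeline.startSpan('handle_user_message');
timeline.addEvent('llm_call', { model: 'gpt-4o' });
timeline.addEvent('decision', { action: 'respond_directly' });
timeline.endSpan();
console.log(timeline.eventsIn('handle_user_message').map((event) => event.type));
```

Using a stack means nested spans resolve naturally: an event always belongs to the most recently opened span that has not yet been closed.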

Flush to Server

After ending a session, flush it to your Codmir instance for persistent storage and replay.

const session = CodmirAI.endSession();
const success = await CodmirAI.flushSession(session);
if (success) {
  console.log('Session persisted for replay');
}

Configuration

import { init } from '@codmir/sdk/ai';

init({
  // Core overseer config
  dsn: process.env.CODMIR_DSN,
  environment: 'production',

  // Token and cost tracking (all default to true)
  trackTokenUsage: true,
  trackCosts: true,
  trackLatency: true,

  // Privacy-sensitive capture (both default to false)
  capturePrompts: false,
  captureResponses: false,

  // Redact patterns from captured content
  redactPatterns: [
    /sk-[a-zA-Z0-9]+/g,                // API keys
    /\b\d{4}-\d{4}-\d{4}-\d{4}\b/g,    // 16-digit card numbers, hyphen-separated
  ],

  // Custom cost table (per 1K tokens)
  costPerModel: {
    'my-fine-tuned-model': { input: 0.006, output: 0.012 },
  },
});
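
For intuition about what `redactPatterns` does to captured content, here is a minimal sketch that applies a list of regular expressions to a string and replaces every match with a placeholder. The `redact` helper and the `[REDACTED]` placeholder are assumptions for illustration; the SDK's actual replacement text and processing order are not specified here. The card pattern in this sketch assumes four hyphen-separated groups of four digits.

```typescript
// Replace every match of every pattern with a fixed placeholder.
// Patterns should carry the `g` flag; without it, String.prototype.replace
// only replaces the first occurrence.
function redact(text: string, patterns: RegExp[], placeholder = '[REDACTED]'): string {
  return patterns.reduce(
    (current, pattern) => current.replace(pattern, placeholder),
    text,
  );
}

const patterns = [
  /sk-[a-zA-Z0-9]+/g,             // API keys
  /\b\d{4}-\d{4}-\d{4}-\d{4}\b/g, // 16-digit card numbers, hyphen-separated
];

const redacted = redact('key sk-abc123 card 4111-1111-1111-1111', patterns);
console.log(redacted); // key [REDACTED] card [REDACTED]
```

Redaction only matters when `capturePrompts` or `captureResponses` is enabled, since those are the options that cause content to be logged in the first place.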

Config Reference

Option	Type	Default	Description
`trackTokenUsage`	`boolean`	`true`	Count input/output tokens per call
`trackCosts`	`boolean`	`true`	Estimate cost based on model pricing
`trackLatency`	`boolean`	`true`	Measure response time per call
`capturePrompts`	`boolean`	`false`	Log prompts sent to providers
`captureResponses`	`boolean`	`false`	Log responses from providers
`redactPatterns`	`RegExp[]`	`[]`	Patterns to redact from captured content
`costPerModel`	`Record<string, { input: number; output: number }>`	Built-in table	Per-model cost overrides, in USD per 1K tokens

Built-in Cost Table

Model	Input (per 1K)	Output (per 1K)
`claude-sonnet-4-20250514`	$0.003	$0.015
`claude-haiku-4-5-20251001`	$0.001	$0.005
`gpt-4o`	$0.005	$0.015
`gpt-4o-mini`	$0.00015	$0.0006
`gpt-4.1`	$0.002	$0.008

Override or extend this table via the costPerModel config option.
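
As a worked example of per-1K pricing, the sketch below estimates the cost of a single call from its token counts. The `estimateCost` helper and the `rates` constant are hypothetical; the two rates are copied from the table above, and the formula (tokens divided by 1,000, multiplied by the per-1K rate, summed for input and output) is the straightforward reading of "per 1K" rather than a confirmed description of the SDK's internals.

```typescript
interface ModelRates {
  input: number;  // USD per 1K input tokens
  output: number; // USD per 1K output tokens
}

// Rates copied from the built-in cost table.
const rates: Record<string, ModelRates> = {
  'gpt-4o': { input: 0.005, output: 0.015 },
  'gpt-4o-mini': { input: 0.00015, output: 0.0006 },
};

// Returns the estimated cost in USD, or null when the model has no known rates.
function estimateCost(
  model: string,
  inputTokens: number,
  outputTokens: number,
  table: Record<string, ModelRates> = rates,
): number | null {
  const entry = table[model];
  if (entry === undefined) return null;
  return (inputTokens / 1000) * entry.input + (outputTokens / 1000) * entry.output;
}

// 1,200 input tokens and 350 output tokens on gpt-4o:
// 1.2 * 0.005 + 0.35 * 0.015 = 0.006 + 0.00525 = 0.01125
const cost = estimateCost('gpt-4o', 1200, 350);
console.log(cost === null ? 'unknown model' : `$${cost.toFixed(5)}`);
```

A model that appears in neither the built-in table nor `costPerModel` has no rate to apply, which is why fine-tuned or self-hosted models need an explicit entry to get a cost estimate.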

Error Handling

Errors from wrapped clients are automatically captured with full AI context — provider, model, token count, and latency — then re-thrown so your application's error handling still works.

try {
  await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: 'Hello' }],
  });
} catch (error) {
  // Error already captured by Codmir with AI context
  // Your own error handling still runs
  console.error('OpenAI call failed:', error);
}
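
The capture-then-re-throw behavior can be pictured as a small wrapper around the call. This is a sketch under stated assumptions: `withErrorCapture`, `AICallContext`, and `CapturedError` are hypothetical names, the `report` callback stands in for whatever the SDK does to record the error, and token counts are left out for brevity.

```typescript
// Hypothetical context shape, mirroring some of the fields listed above.
interface AICallContext {
  provider: string;
  model: string;
}

interface CapturedError extends AICallContext {
  latencyMs: number;
  message: string;
}

// Run `fn`, report any failure together with its context, then re-throw
// the original error so the caller's own try/catch still sees it.
async function withErrorCapture<T>(
  context: AICallContext,
  fn: () => Promise<T>,
  report: (captured: CapturedError) => void,
): Promise<T> {
  const startedAt = Date.now();
  try {
    return await fn();
  } catch (error) {
    report({
      ...context,
      latencyMs: Date.now() - startedAt,
      message: error instanceof Error ? error.message : String(error),
    });
    throw error; // re-thrown unchanged
  }
}

// Usage: the error is reported once and still reaches the caller.
const captured: CapturedError[] = [];
withErrorCapture(
  { provider: 'openai', model: 'gpt-4o' },
  async () => {
    throw new Error('rate limited');
  },
  (entry) => {
    captured.push(entry);
  },
).catch((error) => {
  console.error('Caller still sees the error:', (error as Error).message);
});
```

Re-throwing the same error object, rather than wrapping it in a new one, keeps `instanceof` checks and provider-specific error fields intact for the application's own handlers.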

Re-exported Overseer Functions

The AI module re-exports core overseer functions for convenience. You don't need to import from both modules.

import {
  captureException,
  captureMessage,
  setUser,
  setTag,
  setTags,
  addBreadcrumb,
  flush,
  close,
} from '@codmir/sdk/ai';

TypeScript Support

import type {
  AIMonitorConfig,
  AICallParams,
  AICallResult,
  AIUsageSummary,
  ProviderStats,
  ModelStats,
  SessionConfig,
  SessionEvent,
  SessionSpan,
  AgentSession,
} from '@codmir/sdk/ai';
