AI Monitoring
Track LLM API calls, token usage, costs, and latency across OpenAI, Anthropic, and custom providers.
The AI monitoring module wraps your LLM clients to automatically track every API call — token usage, estimated costs, latency, and errors. It also provides session tracing to correlate multiple calls into a single agent timeline for replay and analysis.
npm install @codmir/sdk
Quick Start
import * as CodmirAI from '@codmir/sdk/ai';
import OpenAI from 'openai';
import Anthropic from '@anthropic-ai/sdk';
CodmirAI.init({
  dsn: process.env.CODMIR_DSN,
  trackTokenUsage: true,
  trackCosts: true,
  trackLatency: true,
});
// Wrap clients — all calls are now tracked automatically
const openai = CodmirAI.wrapOpenAI(new OpenAI());
const anthropic = CodmirAI.wrapAnthropic(new Anthropic());
// Use them as normal
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello' }],
});
// Check accumulated stats
const summary = CodmirAI.getAIUsageSummary();
console.log(`Total cost: $${summary.totalCost.toFixed(4)}`);
console.log(`Avg latency: ${summary.avgLatencyMs}ms`);
Wrapping Providers
OpenAI
import { wrapOpenAI } from '@codmir/sdk/ai';
import OpenAI from 'openai';
const openai = wrapOpenAI(new OpenAI());
// Every chat.completions.create call is now instrumented
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Explain closures in JavaScript' }],
});
// Breadcrumb recorded: openai/gpt-4o — 142 tokens
Anthropic
import { wrapAnthropic } from '@codmir/sdk/ai';
import Anthropic from '@anthropic-ai/sdk';
const anthropic = wrapAnthropic(new Anthropic());
const response = await anthropic.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Hello' }],
});
// Breadcrumb recorded: anthropic/claude-sonnet-4-20250514 — 89 tokens
Custom Providers
Use trackAICall for any provider not covered by the built-in wrappers.
import { trackAICall } from '@codmir/sdk/ai';
const result = await trackAICall(
  'ollama',
  { model: 'llama-3', messageCount: 1 },
  async () => {
    return await myOllamaClient.chat({ prompt: 'Hello' });
  }
);
Usage Statistics
Get Summary
import { getAIUsageSummary } from '@codmir/sdk/ai';
const summary = getAIUsageSummary();
console.log(`Total calls: ${summary.totalCalls}`);
console.log(`Input tokens: ${summary.totalInputTokens}`);
console.log(`Output tokens: ${summary.totalOutputTokens}`);
console.log(`Total cost: $${summary.totalCost.toFixed(4)}`);
console.log(`Avg latency: ${summary.avgLatencyMs}ms`);
// Breakdown by provider
console.log(`OpenAI calls: ${summary.byProvider.openai?.calls ?? 0}`);
console.log(`Anthropic errors: ${summary.byProvider.anthropic?.errors ?? 0}`);
// Breakdown by model
console.log(`GPT-4o cost: $${summary.byModel['gpt-4o']?.cost.toFixed(4) ?? '0'}`);
Reset Statistics
import { resetAIUsageStats } from '@codmir/sdk/ai';
// Start a fresh tracking period
resetAIUsageStats();
Session Tracing
Sessions correlate multiple LLM calls into a single timeline. Every call made while a session is active is automatically attached. Use sessions to trace agent behavior end-to-end.
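As a mental model for "automatically attached", the sketch below keeps one active session in module state and appends every recorded call to it. This is an illustration only, not the SDK's implementation: `openSession`, `recordCall`, `closeSession`, and the record shapes are hypothetical names chosen for this example.

```typescript
// Hypothetical shapes for illustration; the SDK's real session types may differ.
type SketchCall = { provider: string; model: string; cost: number };
type SketchSession = { agent: string; calls: SketchCall[] };

// At most one session is active at a time in this sketch.
let activeSession: SketchSession | null = null;

function openSession(agent: string): SketchSession {
  activeSession = { agent, calls: [] };
  return activeSession;
}

// A wrapped client would invoke something like this after each LLM request.
function recordCall(call: SketchCall): void {
  // With no active session, the call is simply not attached to any timeline.
  activeSession?.calls.push(call);
}

function closeSession(): SketchSession | null {
  const finished = activeSession;
  activeSession = null;
  return finished;
}

openSession('support-bot');
recordCall({ provider: 'openai', model: 'gpt-4o', cost: 0.0105 });
recordCall({ provider: 'anthropic', model: 'claude-sonnet-4-20250514', cost: 0.0042 });
const finished = closeSession();
console.log(finished?.calls.length); // 2
```

A single module-level session like this one does not isolate concurrent requests in a server process. If overlapping sessions matter for your workload, check how the SDK scopes the active session before relying on it.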
Start and End a Session
import * as CodmirAI from '@codmir/sdk/ai';
const session = CodmirAI.startSession({
  agent: 'support-bot',
  metadata: { userId: 'u_123', ticket: 'T-456' },
  tags: { team: 'support', env: 'production' },
});
// All LLM calls are now correlated under this session
await openai.chat.completions.create({ /* ... */ });
await anthropic.messages.create({ /* ... */ });
// End the session
const completed = CodmirAI.endSession();
console.log(`Session made ${completed.usage.totalCalls} calls`);
console.log(`Total cost: $${completed.usage.totalCost.toFixed(4)}`);
Decision Tracing
Record why the agent chose a specific action. These events appear in the session timeline alongside LLM calls.
CodmirAI.addSessionEvent('decision', {
  action: 'escalate_to_human',
  reason: 'confidence_below_threshold',
  confidence: 0.42,
});
CodmirAI.addSessionEvent('tool_call', {
  tool: 'search_tickets',
  query: 'login failed',
  results: 12,
});
Spans
Group related events within a session using spans (e.g., "handle_user_message", "run_tool_chain").
CodmirAI.startSessionSpan('handle_user_message', { messageId: 'msg_789' });
await openai.chat.completions.create({ /* ... */ });
CodmirAI.addSessionEvent('decision', { action: 'respond_directly' });
CodmirAI.endSessionSpan();
Flush to Server
After ending a session, flush it to your Codmir instance for persistent storage and replay.
const session = CodmirAI.endSession();
const success = await CodmirAI.flushSession(session);
if (success) {
  console.log('Session persisted for replay');
}
Configuration
import { init } from '@codmir/sdk/ai';
init({
  // Core overseer config
  dsn: process.env.CODMIR_DSN,
  environment: 'production',
  // Token and cost tracking (all default to true)
  trackTokenUsage: true,
  trackCosts: true,
  trackLatency: true,
  // Privacy-sensitive capture (both default to false)
  capturePrompts: false,
  captureResponses: false,
  // Redact patterns from captured content
  redactPatterns: [
    /sk-[a-zA-Z0-9]+/g, // API keys
    /\b\d{4}-\d{4}-\d{4}-\d{4}\b/g, // Card numbers (four groups of four digits)
  ],
  // Custom cost table (per 1K tokens)
  costPerModel: {
    'my-fine-tuned-model': { input: 0.006, output: 0.012 },
  },
});
Config Reference
| Option | Type | Default | Description |
|---|---|---|---|
| trackTokenUsage | boolean | true | Count input/output tokens per call |
| trackCosts | boolean | true | Estimate cost based on model pricing |
| trackLatency | boolean | true | Measure response time per call |
| capturePrompts | boolean | false | Log prompts sent to providers |
| captureResponses | boolean | false | Log responses from providers |
| redactPatterns | RegExp[] | [] | Patterns to redact from captured content |
| costPerModel | Record | Built-in table | Custom cost per 1K tokens, keyed by model name; each entry has input and output rates |
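To make the redactPatterns option more concrete, here is a minimal sketch of how a list of patterns could be applied to captured text. The `redact` helper and the `[REDACTED]` placeholder are assumptions made for this example; the SDK's actual replacement text and internals are not documented here.

```typescript
// Hypothetical helper, not an SDK export: replaces every match of every pattern.
function redact(text: string, patterns: RegExp[], placeholder = '[REDACTED]'): string {
  return patterns.reduce((out, pattern) => {
    // Force the global flag so all occurrences are replaced, not only the first.
    const flags = pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g';
    return out.replace(new RegExp(pattern.source, flags), placeholder);
  }, text);
}

// Sample patterns: an API-key shape and a dash-separated 16-digit card number.
const samplePatterns = [/sk-[a-zA-Z0-9]+/g, /\b\d{4}-\d{4}-\d{4}-\d{4}\b/g];

const cleaned = redact('Use key sk-abc123 to charge card 1234-5678-9012-3456.', samplePatterns);
console.log(cleaned); // Use key [REDACTED] to charge card [REDACTED].
```

Redaction only matters when capturePrompts or captureResponses is enabled, since otherwise no content is captured in the first place.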
Built-in Cost Table
| Model | Input (per 1K) | Output (per 1K) |
|---|---|---|
| claude-sonnet-4-20250514 | $0.003 | $0.015 |
| claude-haiku-4-5-20251001 | $0.001 | $0.005 |
| gpt-4o | $0.005 | $0.015 |
| gpt-4o-mini | $0.00015 | $0.0006 |
| gpt-4.1 | $0.002 | $0.008 |
Override or extend this table via the costPerModel config option.
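Because the rates are quoted per 1K tokens, an estimate for one call is the token counts divided by 1,000 and multiplied by the matching rate. The sketch below works through that arithmetic with the rates from the table above. `estimateCost` is a hypothetical helper for illustration, and returning `undefined` for an unknown model is an assumption, not documented SDK behavior.

```typescript
// Per-1K-token rates in USD, copied from the built-in cost table.
type ModelRates = { input: number; output: number };

const builtInRates: Record<string, ModelRates> = {
  'claude-sonnet-4-20250514': { input: 0.003, output: 0.015 },
  'claude-haiku-4-5-20251001': { input: 0.001, output: 0.005 },
  'gpt-4o': { input: 0.005, output: 0.015 },
  'gpt-4o-mini': { input: 0.00015, output: 0.0006 },
  'gpt-4.1': { input: 0.002, output: 0.008 },
};

// Hypothetical helper, not an SDK export.
function estimateCost(
  model: string,
  inputTokens: number,
  outputTokens: number,
  overrides: Record<string, ModelRates> = {},
): number | undefined {
  // Overrides win over built-in entries, mirroring "override or extend".
  const rates: ModelRates | undefined = { ...builtInRates, ...overrides }[model];
  if (!rates) return undefined; // Assumption: unknown models get no estimate.
  return (inputTokens / 1000) * rates.input + (outputTokens / 1000) * rates.output;
}

// 1,200 input tokens and 300 output tokens on gpt-4o:
// 1.2 * 0.005 + 0.3 * 0.015 = 0.006 + 0.0045
const exampleCost = estimateCost('gpt-4o', 1200, 300);
console.log(exampleCost?.toFixed(4)); // 0.0105
```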
Error Handling
Errors from wrapped clients are automatically captured with full AI context — provider, model, token count, and latency — then re-thrown so your application's error handling still works.
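The capture-then-rethrow behavior can be pictured as a thin wrapper around the provider call. The following is a sketch under stated assumptions rather than the SDK's code: `withErrorCapture`, `AIErrorContext`, and the `report` callback are hypothetical, and token counts are left out for brevity.

```typescript
// Hypothetical context shape and reporter; the SDK's real types may differ.
type AIErrorContext = { provider: string; model: string; latencyMs: number };
type Reporter = (error: unknown, context: AIErrorContext) => void;

async function withErrorCapture<T>(
  provider: string,
  model: string,
  report: Reporter,
  call: () => Promise<T>,
): Promise<T> {
  const startedAt = Date.now();
  try {
    return await call();
  } catch (error) {
    // Record the failure together with its AI context first...
    report(error, { provider, model, latencyMs: Date.now() - startedAt });
    // ...then re-throw the same error so the caller's try/catch still runs.
    throw error;
  }
}
```

From the caller's side this looks like an ordinary try/catch: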
try {
  await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: 'Hello' }],
  });
} catch (error) {
  // Error already captured by Codmir with AI context
  // Your own error handling still runs
  console.error('OpenAI call failed:', error);
}
Re-exported Overseer Functions
The AI module re-exports core overseer functions for convenience. You don't need to import from both modules.
import {
  captureException,
  captureMessage,
  setUser,
  setTag,
  setTags,
  addBreadcrumb,
  flush,
  close,
} from '@codmir/sdk/ai';
TypeScript Support
import type {
  AIMonitorConfig,
  AICallParams,
  AICallResult,
  AIUsageSummary,
  ProviderStats,
  ModelStats,
  SessionConfig,
  SessionEvent,
  SessionSpan,
  AgentSession,
} from '@codmir/sdk/ai';