Quickstart

noburn sits between your code and the LLM API. Before every call it checks your spend against your budgets and blocks any request that would push you over the cap. Instead of firing the expensive API call, it returns a structured response your code can handle.

Using Claude Code or another AI coding agent? Install the noburn skill and let your agent wire it in for you:
npx skills add noburn-dev/skills
Then ask it to "add budget guardrails to my LLM app." Framework-specific variants are available via --skill noburn-vercel-ai (Vercel AI SDK) and --skill noburn-langchain (LangChain, Python + JS).

Install the SDK

Python:
pip install noburn

JavaScript / TypeScript:
npm install @noburn/sdk

Get your SDK key

Sign in at noburn.dev/dashboard
Create a project (or open an existing one)
Copy your SDK key — it looks like sk-nb-xxxxxxxxxxxxxxxx

Keep this key secret. It is scoped to a single project and is shown only once, so store it somewhere safe as soon as you copy it.
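One common way to keep the key out of source control is to read it from an environment variable instead of pasting it into code. The sketch below assumes a variable named NOBURN_API_KEY and a helper called load_noburn_key; both names are hypothetical and not part of the SDK. The prefix check relies only on the sk-nb- format shown above.

```python
import os

# Hypothetical variable name; use whatever your deployment defines.
ENV_VAR = "NOBURN_API_KEY"


def load_noburn_key(env=os.environ):
    """Return the SDK key from the environment, or raise if it is missing or malformed."""
    key = env.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"{ENV_VAR} is not set")
    # The quickstart shows keys starting with "sk-nb-"; treat anything else as a mistake.
    if not key.startswith("sk-nb-"):
        raise RuntimeError(f"{ENV_VAR} does not look like a noburn SDK key")
    return key
```

The result can then be passed as api_key when constructing NoburnGuard, in place of the literal string.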

Wrap your first LLM call

from noburn import NoburnGuard
import openai

guard = NoburnGuard(
    api_key="sk-nb-xxxxxxxxxxxxxxxx",
    project_id="your-project-id",
    budget_cap_usd=10.00,   # block if monthly spend exceeds $10
)

# Before every LLM call:
check = guard.check(
    model="gpt-4o",
    estimated_tokens_in=1500,
    estimated_tokens_out=500,
)

if check.blocked:
    # noburn blocked the call — handle gracefully
    print(f"Blocked: {check.block_reason}")
else:
    # Safe to proceed
    response = openai.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}],
    )

    # Record the actual usage after the call
    guard.record(
        model="gpt-4o",
        tokens_in=response.usage.prompt_tokens,
        tokens_out=response.usage.completion_tokens,
        cost_usd=response.usage.prompt_tokens * 0.0000025
                + response.usage.completion_tokens * 0.00001,
        was_blocked=False,
    )

import { NoburnGuard } from '@noburn/sdk';
import OpenAI from 'openai';

const guard = new NoburnGuard({
  apiKey: 'sk-nb-xxxxxxxxxxxxxxxx',
  projectId: 'your-project-id',
  budgetCapUsd: 10.00,  // block if monthly spend exceeds $10
});

const openai = new OpenAI();

// Before every LLM call:
const check = await guard.check({
  model: 'gpt-4o',
  estimatedTokensIn: 1500,
  estimatedTokensOut: 500,
});

if (check.blocked) {
  // noburn blocked the call — handle gracefully
  console.log(`Blocked: ${check.blockReason}`);
} else {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: 'Hello!' }],
  });

  // Record actual usage after the call
  await guard.record({
    model: 'gpt-4o',
    tokensIn: response.usage!.prompt_tokens,
    tokensOut: response.usage!.completion_tokens,
    costUsd: response.usage!.prompt_tokens * 0.0000025
           + response.usage!.completion_tokens * 0.00001,
    wasBlocked: false,
  });
}
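Both examples above compute the cost inline and pass hand-picked token estimates to the check. A minimal sketch of pulling that arithmetic into helpers might look like the following. The names PRICES_PER_MILLION, estimate_tokens, and cost_usd are illustrative and not part of the SDK. The gpt-4o rates are simply the per-token constants from the examples, rescaled to per-million. The four-characters-per-token rule is only a rough heuristic for English text.

```python
# USD per one million tokens. The gpt-4o figures are the rates implied by the
# examples above (0.0000025 and 0.00001 per token); verify them against your
# provider's current price list before relying on them.
PRICES_PER_MILLION = {
    "gpt-4o": {"in": 2.50, "out": 10.00},
}


def estimate_tokens(text: str) -> int:
    """Very rough token estimate: about four characters per token for English text."""
    return max(1, len(text) // 4)


def cost_usd(model: str, tokens_in: int, tokens_out: int) -> float:
    """Compute the cost of one call from actual token counts."""
    price = PRICES_PER_MILLION[model]  # raises KeyError for models missing from the table
    return (tokens_in * price["in"] + tokens_out * price["out"]) / 1_000_000
```

With these in place, the value passed as cost_usd to guard.record could be cost_usd("gpt-4o", response.usage.prompt_tokens, response.usage.completion_tokens), and estimate_tokens(prompt) could supply the input estimate for guard.check when no tokenizer is at hand.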

See it in the dashboard

Every call — blocked or allowed — shows up in your dashboard within seconds. The blocked request log updates in real time.

What happens when a call is blocked

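When noburn blocks a call, your code receives a check object with the following fields (shown in their JavaScript spelling; the Python examples use snake_case, e.g. block_reason):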

blocked: true
blockReason — why it was blocked ("budget_exceeded", "user_budget_exceeded", "run_budget_exceeded", "policy_rule", "plan_limit_exceeded")
spendUsd — current monthly spend at time of block
budgetCapUsd — the cap that was hit
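As an illustration of how these fields might be used together, the sketch below builds a one-line log message from a blocked check. CheckResult is a stand-in dataclass written for this example, not the SDK's own class, and the snake_case names spend_usd and budget_cap_usd are assumed Python spellings of the fields listed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckResult:
    """Stand-in for the object returned by guard.check (not the SDK's class)."""
    blocked: bool
    block_reason: Optional[str] = None
    spend_usd: float = 0.0       # assumed Python spelling of spendUsd
    budget_cap_usd: float = 0.0  # assumed Python spelling of budgetCapUsd


def describe_block(check: CheckResult) -> str:
    """Build a one-line log message for a check result."""
    if not check.blocked:
        return "allowed"
    return (
        f"blocked ({check.block_reason}): "
        f"spent ${check.spend_usd:.2f} of ${check.budget_cap_usd:.2f} cap"
    )
```

Logging the spend next to the cap makes it easier to tell a cap that is set too low from a genuine usage spike.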

Your application decides what to show the user. Common patterns:

if check.blocked:
    if check.block_reason == "budget_exceeded":
        return {"error": "Service temporarily unavailable. Please try again later."}
    elif check.block_reason == "user_budget_exceeded":
        return {"error": "You've reached your AI usage limit. Upgrade to continue."}
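The branches above cover two of the five block reasons. One possible way to handle all of them, plus any value added later, is a lookup table with a fallback. In this sketch only the reason strings come from the list above. MESSAGES, DEFAULT_MESSAGE, and error_for are hypothetical names, and the wording for the last three reasons is a guess at what they mean.

```python
# Hypothetical user-facing messages; only the reason strings come from the docs.
MESSAGES = {
    "budget_exceeded": "Service temporarily unavailable. Please try again later.",
    "user_budget_exceeded": "You've reached your AI usage limit. Upgrade to continue.",
    "run_budget_exceeded": "This task used up its budget and was stopped.",
    "policy_rule": "This request is not allowed by the current policy.",
    "plan_limit_exceeded": "The plan limit has been reached.",
}
DEFAULT_MESSAGE = "This request can't be processed right now."


def error_for(block_reason):
    """Return an error payload for a block reason, falling back for unknown values."""
    return {"error": MESSAGES.get(block_reason, DEFAULT_MESSAGE)}
```

A handler would then return error_for(check.block_reason) whenever check.blocked is true, so an unrecognized reason still yields a sensible response.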

Next steps

SDK Reference — every parameter and option
API Reference — direct REST API for custom integrations
Webhooks — get notified when budgets are hit