API Reference

The noburn REST API accepts events from any language or runtime. Use it if your language doesn't have an SDK yet, or if you want full control over the request.

Authentication

All API requests use Bearer token authentication with your project's SDK key:

Authorization: Bearer sk-nb-xxxxxxxxxxxxxxxx

SDK keys are project-scoped. A key can only write events to its associated project.

POST /api/v1/check

The pre-call decision endpoint. Call this before every LLM call to get a server-authoritative allow/block decision. It resolves the project, per-user, and per-run budget caps against month-to-date spend, applies your policy rules, and enforces the plan request quota.

Runtime: Edge (Vercel Edge Functions)

Request

POST https://noburn.dev/api/v1/check
Authorization: Bearer sk-nb-xxxxxxxxxxxxxxxx
Content-Type: application/json

{
  "project_id": "550e8400-e29b-41d4-a716-446655440000",
  "model": "gpt-4o",
  "estimated_tokens_in": 1500,
  "estimated_tokens_out": 500,
  "end_user_id": "user_abc123",
  "run_id": null,
  "context": { "plan": "free" }
}

Request body

Field	Type	Required	Description
`project_id`	`string (uuid)`	✓	Must match the SDK key's project
`model`	`string`	✓	Model identifier. Max 255 characters
`estimated_tokens_in`	`number`	—	Estimated prompt tokens. Used to price the call server-side for recognized models
`estimated_tokens_out`	`number`	—	Estimated completion tokens
`estimated_cost_usd`	`number`	—	Fallback cost for models noburn doesn't recognize. Ignored when token counts price the call
`end_user_id`	`string`	—	Enforces this user's per-user budget cap
`run_id`	`string`	—	Enforces this agent run's per-run budget cap (see `/api/v1/runs`)
`context`	`object`	—	String key-value pairs matched against policy rules

The server prices recognized models from `estimated_tokens_in` and `estimated_tokens_out` using its own pricing table, so a client cannot send a cost of 0 (or a forged value) to slip under a cap.
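As a worked example of that arithmetic, the sketch below reproduces the `estimated_cost_usd` value from the sample response for this endpoint, assuming the gpt-4o rates that response reports (2.5 and 10.0 USD per 1M tokens). The function name is illustrative and not part of any noburn SDK; the server's own table remains authoritative.

```javascript
// Hypothetical helper (not a noburn SDK function): prices a call from token
// counts and USD-per-1M-token rates.
function estimateCostUsd(tokensIn, tokensOut, inputRatePerM, outputRatePerM) {
  return (tokensIn * inputRatePerM + tokensOut * outputRatePerM) / 1e6;
}

// 1,500 prompt tokens and 500 completion tokens at 2.5 / 10.0 USD per 1M tokens:
// (1500 * 2.5 + 500 * 10.0) / 1,000,000
const cost = estimateCostUsd(1500, 500, 2.5, 10.0);
console.log(cost.toFixed(5)); // 0.00875
```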

Response

{
  "allowed": true,
  "reason": null,
  "forced_model_tier": null,
  "estimated_cost_usd": 0.00875,
  "input_rate_per_m": 2.5,
  "output_rate_per_m": 10.0,
  "pricing_source": "server_table",
  "project_spend_usd": 4.20,
  "project_cap_usd": 10.0,
  "project_remaining_usd": 5.80,
  "user_spend_usd": 0.0,
  "user_cap_usd": null,
  "user_remaining_usd": null,
  "run_spend_usd": 0.0,
  "run_cap_usd": null,
  "run_remaining_usd": null
}

Field	Type	Description
`allowed`	`boolean`	`false` if the call should be blocked
`reason`	`string \| null`	One of `budget_exceeded`, `user_budget_exceeded`, `run_budget_exceeded`, `policy_rule`, `plan_limit_exceeded` — or `null` when allowed
`forced_model_tier`	`"small" \| "big" \| null`	Set by a `force_small`/`force_big` policy rule — switch to a cheaper/stronger model
`estimated_cost_usd`	`number`	Server-computed cost used for this budget check
`input_rate_per_m` / `output_rate_per_m`	`number \| null`	USD per 1M tokens used for pricing (null for pure client estimates)
`pricing_source`	`string`	`server_table`, `custom_override`, `conservative_unknown`, or `client_estimate`
`project_spend_usd` / `project_cap_usd` / `project_remaining_usd`	`number \| null`	Project budget state
`user_*`	`number \| null`	Per-user budget state (when `end_user_id` is supplied)
`run_*`	`number \| null`	Per-run budget state (when `run_id` is supplied)
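One way a caller might act on this response is sketched below. The helper name and the tier-to-model mapping are assumptions for illustration: noburn returns only the tier (`small` or `big`), so which concrete model each tier maps to is an application-side choice.

```javascript
// Hypothetical helper: turns a /api/v1/check response into a decision.
// `models` maps a tier name to the model your app uses for that tier;
// that mapping is an application-side assumption, not something noburn defines.
function interpretCheck(response, requestedModel, models) {
  if (!response.allowed) {
    return { proceed: false, model: null, reason: response.reason };
  }
  const tier = response.forced_model_tier; // "small", "big", or null
  const model = tier && models[tier] ? models[tier] : requestedModel;
  return { proceed: true, model, reason: null };
}

const decision = interpretCheck(
  { allowed: true, reason: null, forced_model_tier: "small" },
  "gpt-4o",
  { small: "gpt-4o-mini", big: "gpt-4o" }
);
console.log(decision); // { proceed: true, model: 'gpt-4o-mini', reason: null }
```

When `allowed` is `false`, the sketch skips the LLM call entirely and surfaces `reason`, which the caller could then report through `/api/v1/events` with `was_blocked: true`.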

POST /api/v1/events

Records an LLM event (blocked or allowed). This is the core ingestion endpoint — the SDK wraps this call.

Runtime: Edge (Vercel Edge Functions)

Target latency: p99 < 150ms

The server returns 202 immediately and processes the event asynchronously. Budget evaluation, alert rule checks, and webhook dispatches happen after the response.

Request

POST https://noburn.dev/api/v1/events
Authorization: Bearer sk-nb-xxxxxxxxxxxxxxxx
Content-Type: application/json

{
  "project_id": "550e8400-e29b-41d4-a716-446655440000",
  "model": "gpt-4o",
  "tokens_in": 1423,
  "tokens_out": 487,
  "cost_usd": 0.00848,
  "was_blocked": false,
  "end_user_id": "user_abc123",
  "block_reason": null,
  "latency_ms": 1240,
  "timestamp": "2025-01-15T14:32:00.000Z"
}

Request body

Field	Type	Required	Description
`project_id`	`string (uuid)`	✓	Must match the project associated with your SDK key
`model`	`string`	✓	Model identifier. Max 255 characters
`tokens_in`	`number`	✓	Prompt token count
`tokens_out`	`number`	✓	Completion token count
`cost_usd`	`number`	✓	Cost in USD. Use your provider's per-token pricing
`was_blocked`	`boolean`	✓	`true` if the call was blocked before reaching the LLM
`end_user_id`	`string`	—	Per-user identifier for spend tracking. Max 255 characters
`block_reason`	`string`	—	Why the call was blocked. Max 500 characters
`latency_ms`	`number`	—	End-to-end call latency in milliseconds
`timestamp`	`string (ISO 8601)`	—	Defaults to server time if omitted

Response

HTTP/1.1 202 Accepted

No response body. A 202 means the event was accepted and queued for processing.

Error responses

Status	Code	Reason
`400`	`BAD_REQUEST`	Missing required fields, invalid types, or negative/non-finite `tokens_*`/`cost_usd`
`401`	`INVALID_SDK_KEY`	Missing, malformed, or invalid `Authorization` header
`403`	`FORBIDDEN`	SDK key is valid but `project_id` doesn't match
`429`	`RATE_LIMIT`	Exceeded 1,000 requests/minute per IP
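Because the endpoint responds before processing, catching obviously malformed events client-side can save a round trip. The sketch below mirrors the documented `400` conditions and field limits. The function is hypothetical, and the server remains the authority on what it accepts.

```javascript
// Hypothetical pre-flight check mirroring the documented 400 conditions and
// field length limits. It only catches obvious mistakes before sending.
function validateEvent(event) {
  const errors = [];
  const isAmount = (v) => typeof v === "number" && Number.isFinite(v) && v >= 0;

  if (typeof event.project_id !== "string" || event.project_id === "") {
    errors.push("project_id is required");
  }
  if (typeof event.model !== "string" || event.model === "" || event.model.length > 255) {
    errors.push("model is required (max 255 characters)");
  }
  for (const field of ["tokens_in", "tokens_out", "cost_usd"]) {
    if (!isAmount(event[field])) {
      errors.push(`${field} must be a finite, non-negative number`);
    }
  }
  if (typeof event.was_blocked !== "boolean") {
    errors.push("was_blocked must be a boolean");
  }
  if (event.end_user_id != null && String(event.end_user_id).length > 255) {
    errors.push("end_user_id exceeds 255 characters");
  }
  if (event.block_reason != null && String(event.block_reason).length > 500) {
    errors.push("block_reason exceeds 500 characters");
  }
  return errors; // an empty array means no problems were found
}
```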

Rate limits

Tier	Limit
All plans	1,000 requests/min per IP

A `Retry-After` header is included on 429 responses.
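A client can use that header to schedule its retry. The helper below is a sketch, not SDK code: it accepts `Retry-After` as either delay-seconds or an HTTP-date (both forms are allowed by the HTTP specification), and the exponential backoff used when the header is absent or the status is 5xx is an assumed policy with arbitrary constants.

```javascript
// Hypothetical helper: how long to wait before retrying, in milliseconds.
// `retryAfter` is the raw header string (or null). Returns null when the
// response should not be retried.
function retryDelayMs(status, retryAfter, attempt, now = Date.now()) {
  if (status === 429 && retryAfter != null) {
    const seconds = Number(retryAfter);
    if (retryAfter.trim() !== "" && Number.isFinite(seconds)) {
      return Math.max(0, seconds * 1000);
    }
    const date = Date.parse(retryAfter); // Retry-After may also be an HTTP-date
    if (!Number.isNaN(date)) return Math.max(0, date - now);
  }
  if (status === 429 || status >= 500) {
    // Assumed fallback: exponential backoff from 500 ms, capped at 30 s.
    return Math.min(30000, 500 * 2 ** attempt);
  }
  return null; // other statuses (400, 401, 403, ...) will not succeed on retry
}

console.log(retryDelayMs(429, "2", 0));  // 2000
console.log(retryDelayMs(500, null, 3)); // 4000
console.log(retryDelayMs(403, null, 0)); // null
```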

Example — raw fetch (Node.js)

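// Requires Node.js 18+ (global fetch). Top-level await needs an ES module;
// otherwise wrap this in an async function.
const response = await fetch('https://noburn.dev/api/v1/events', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.NOBURN_SDK_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    project_id: process.env.NOBURN_PROJECT_ID,
    model: 'gpt-4o',
    tokens_in: 1423,
    tokens_out: 487,
    cost_usd: 0.00848,
    was_blocked: false,
    latency_ms: 1240,
  }),
});

// Success is 202 Accepted with no response body.
if (response.status !== 202) {
  throw new Error(`noburn rejected the event: HTTP ${response.status}`);
}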

Example — curl

curl -X POST https://noburn.dev/api/v1/events \
  -H "Authorization: Bearer sk-nb-xxxxxxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "project_id": "550e8400-e29b-41d4-a716-446655440000",
    "model": "claude-3-5-sonnet-20241022",
    "tokens_in": 800,
    "tokens_out": 200,
    "cost_usd": 0.00284,
    "was_blocked": true,
    "block_reason": "budget_exceeded"
  }'

POST /api/v1/runs

Starts an agent run — a bounded unit of work that can carry its own budget cap, independent of the project- and per-user caps. Thread the returned run_id through /api/v1/check and /api/v1/events so per-run spend is enforced and reported.

Request

{
  "project_id": "550e8400-e29b-41d4-a716-446655440000",
  "budget_cap_usd": 0.50,
  "end_user_id": "user_abc123",
  "metadata": { "task": "summarize" }
}

Field	Type	Required	Description
`project_id`	`string (uuid)`	✓	Must match the SDK key's project
`budget_cap_usd`	`number`	—	Hard cap for this run's spend
`end_user_id`	`string`	—	Attribute the run to an end user
`metadata`	`object`	—	Arbitrary JSON stored with the run

Response — `201 Created`

{ "run_id": "…", "budget_cap_usd": 0.50, "started_at": "2026-01-15T14:32:00.000Z" }

POST /api/v1/runs/{id}/end

Finalizes a run. Totals its non-blocked spend and marks it completed or budget_exceeded. Idempotent — ending an already-finished run returns its existing state.

Request

{ "project_id": "550e8400-e29b-41d4-a716-446655440000" }

Response

{
  "run_id": "…",
  "status": "completed",
  "spend_usd": 0.32,
  "budget_cap_usd": 0.50,
  "started_at": "2026-01-15T14:32:00.000Z",
  "ended_at": "2026-01-15T14:33:10.000Z"
}
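The run lifecycle described above (start, thread `run_id` through checks and events, end) can be sketched as a set of request builders. All function names are hypothetical. Note that `run_id` is not listed in the `/api/v1/events` request body table, so including it there follows the prose description of runs and should be treated as an assumption.

```javascript
// Hypothetical request builders for the run lifecycle. They only describe the
// requests; sending them (for example with fetch) is left to the caller.
const BASE_URL = "https://noburn.dev";

function startRunRequest(projectId, budgetCapUsd) {
  return {
    method: "POST",
    url: `${BASE_URL}/api/v1/runs`,
    body: { project_id: projectId, budget_cap_usd: budgetCapUsd },
  };
}

// Adds run_id to a /api/v1/check or /api/v1/events body without mutating it.
function withRun(body, runId) {
  return { ...body, run_id: runId };
}

function endRunRequest(projectId, runId) {
  return {
    method: "POST",
    url: `${BASE_URL}/api/v1/runs/${encodeURIComponent(runId)}/end`,
    body: { project_id: projectId },
  };
}

// Usage: the run_id would come from the 201 response of startRunRequest.
const checkBody = withRun(
  { project_id: "550e8400-e29b-41d4-a716-446655440000", model: "gpt-4o" },
  "run_123" // placeholder id for illustration
);
console.log(checkBody.run_id); // run_123
```

Since ending a run is idempotent, calling the end request from a `finally` block is a reasonable way to make sure a run is closed even when the agent fails partway through.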

Cost calculation reference

noburn is the pricing authority for budget enforcement. The SDK sends token counts; the server recomputes cost on every check and events call. Dashboard spend is an estimate based on noburn's table — it may differ from your provider invoice.

Pricing sources

Each stored event includes the rates used at ingest time:

`pricing_source`	Meaning
`server_table`	Matched the global noburn pricing table
`custom_override`	Matched a per-project rate you configured
`conservative_unknown`	Unknown model — floored at Opus-tier rates ($30/$75 per 1M in/out)
`client_estimate`	Used the `cost_usd` / `estimated_cost_usd` you supplied

GET /api/v1/pricing

Fetch the current global table, version metadata, and your project's custom overrides. Optional — existing SDK flows work unchanged.

GET https://noburn.dev/api/v1/pricing?project_id=<uuid>
Authorization: Bearer sk-nb-xxxxxxxxxxxxxxxx

{
  "version": "2026-06-19",
  "effective_at": "2026-06-19T00:00:00Z",
  "global_table": [{ "model_prefix": "gpt-4o", "input_per_m": 2.5, "output_per_m": 10.0 }],
  "project_overrides": [{ "model_prefix": "my-finetune", "input_per_m": 1.0, "output_per_m": 2.0 }],
  "unknown_model_floor": { "input_per_m": 30, "output_per_m": 75 }
}

Per-project custom rates

Configure gateway pass-through or fine-tuned model rates via the dashboard API:

POST /api/projects/{id}/model-pricing
{ "model_prefix": "my-finetune", "input_per_m": 1.0, "output_per_m": 2.0 }

Overrides take precedence over the global table. Prefix matching applies (longest match wins).
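The lookup order can be illustrated against the `GET /api/v1/pricing` response shape. This is a sketch of the stated rules, not noburn's implementation. It assumes overrides are consulted before the global table even when the global table has a longer matching prefix, and the `gpt-4o-mini` rates in the usage example are placeholder values.

```javascript
// Returns the entry whose model_prefix is the longest prefix of `model`, or null.
function longestPrefixMatch(model, entries) {
  let best = null;
  for (const entry of entries) {
    const matches = model.startsWith(entry.model_prefix);
    if (matches && (best === null || entry.model_prefix.length > best.model_prefix.length)) {
      best = entry;
    }
  }
  return best;
}

// Hypothetical resolver: project overrides first, then the global table,
// then the unknown-model floor.
function resolveRates(model, pricing) {
  const pick = (entry, source) => ({
    input_per_m: entry.input_per_m,
    output_per_m: entry.output_per_m,
    pricing_source: source,
  });
  const override = longestPrefixMatch(model, pricing.project_overrides);
  if (override) return pick(override, "custom_override");
  const listed = longestPrefixMatch(model, pricing.global_table);
  if (listed) return pick(listed, "server_table");
  return pick(pricing.unknown_model_floor, "conservative_unknown");
}

const pricing = {
  global_table: [
    { model_prefix: "gpt-4o", input_per_m: 2.5, output_per_m: 10.0 },
    { model_prefix: "gpt-4o-mini", input_per_m: 0.15, output_per_m: 0.6 }, // placeholder rates
  ],
  project_overrides: [{ model_prefix: "my-finetune", input_per_m: 1.0, output_per_m: 2.0 }],
  unknown_model_floor: { input_per_m: 30, output_per_m: 75 },
};
console.log(resolveRates("gpt-4o-mini-2024-07-18", pricing).input_per_m); // 0.15
console.log(resolveRates("my-finetune-v2", pricing).pricing_source);      // custom_override
console.log(resolveRates("mystery-model", pricing).pricing_source);       // conservative_unknown
```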

For recognized models, noburn recomputes cost server-side from tokens_in/out. For unrecognized models, pass accurate token counts (conservative floor applies) or an explicit estimated_cost_usd / cost_usd.

GET /api/v1/policy

Fetches the enabled policy rules for a project. The SDK calls this on initialization to load server-configured rules.

Runtime: Edge (Vercel Edge Functions)

Query parameters

Parameter	Required	Description
`project_id`	✓	Your project UUID

Request

GET /api/v1/policy?project_id=<your-project-id>
Authorization: Bearer sk-nb-xxxxxxxxxxxxxxxx

Response

{
  "rules": [
    { "when": { "utilization": ">= 0.8" }, "then": "force_small" },
    { "when": { "model": "gpt-4o" },       "then": "block" }
  ]
}

Rules are returned in evaluation order (position ascending). Only enabled rules are included. An empty rules array means no rules are configured — all calls proceed normally.
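To make the evaluation order concrete, here is a minimal local evaluator for the rule shape shown in the response. Its semantics are assumptions, since this page does not specify them: the first matching rule wins, every key in `when` must match, `utilization` is a comparison string applied to a spend/cap ratio supplied by the caller, and any other key is an exact match.

```javascript
// Hypothetical condition matcher; the semantics are assumed, not documented.
function matchesCondition(key, expected, facts) {
  if (key === "utilization") {
    const m = /^\s*(>=|<=|>|<|==)\s*([0-9.]+)\s*$/.exec(expected);
    if (!m || typeof facts.utilization !== "number") return false;
    const limit = Number(m[2]);
    switch (m[1]) {
      case ">=": return facts.utilization >= limit;
      case "<=": return facts.utilization <= limit;
      case ">": return facts.utilization > limit;
      case "<": return facts.utilization < limit;
      default: return facts.utilization === limit;
    }
  }
  return facts[key] === expected; // exact match for keys such as "model"
}

// Walks the rules in the order returned and reports the first match's action.
function evaluateRules(rules, facts) {
  for (const rule of rules) {
    const matched = Object.entries(rule.when).every(
      ([key, expected]) => matchesCondition(key, expected, facts)
    );
    if (matched) return rule.then;
  }
  return null; // no rule matched: the call proceeds normally
}

const rules = [
  { when: { utilization: ">= 0.8" }, then: "force_small" },
  { when: { model: "gpt-4o" }, then: "block" },
];
console.log(evaluateRules(rules, { utilization: 0.85, model: "gpt-4o" })); // force_small
console.log(evaluateRules(rules, { utilization: 0.1, model: "gpt-4o" }));  // block
console.log(evaluateRules([], { utilization: 0.99, model: "gpt-4o" }));    // null
```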

Error responses

Status	Meaning
`401`	Missing or invalid SDK key
`403`	SDK key does not belong to the requested `project_id`
`400`	`project_id` query param missing

Error codes

Code	HTTP Status	Description
`UNAUTHORIZED`	401	Not signed in
`FORBIDDEN`	403	Signed in but access denied
`INVALID_SDK_KEY`	401	SDK key missing or invalid
`NOT_FOUND`	404	Resource doesn't exist
`BAD_REQUEST`	400	Invalid request body
`RATE_LIMIT`	429	Too many requests
`PLAN_LIMIT_EXCEEDED`	402	Action requires plan upgrade
`INTERNAL_ERROR`	500	Server error — retry with backoff
