noburn.dev docs

Policy Rules

Configure per-project LLM call governance rules from the dashboard — no redeploy needed.

Policy rules let you control LLM call behavior at the gateway level. Rules are configured in the noburn dashboard and evaluated before each call — without touching your application code or triggering a deployment.

How rules work

Rules are evaluated top-to-bottom. The first rule that matches the current call wins. If no rule matches, the call proceeds normally.

Each rule has two parts:

  • When — a condition that must be true for the rule to apply
  • Then — what to do when the condition matches

Available actions

| Action | What happens |
| --- | --- |
| block | Call is rejected. No tokens are sent. Returns blocked: true. |
| force_small | Router is forced to use the cheapest available model. |
| force_big | Router is forced to use the most capable available model. |
| allow | Call is explicitly allowed. Place it above a broader block rule to exempt matching calls, since the first match wins. |

Condition keys

| Key | Type | Description | Example |
| --- | --- | --- | --- |
| utilization | 0.0 – 1.0 | Fraction of monthly budget already spent | >= 0.8 |
| complexity | 0.0 – 1.0 | Prompt complexity score from the router | < 0.4 |
| model | string | The model the router selected | gpt-4o |
| domain | string | Routing domain: code, math, analysis, creative, general | creative |
| custom | any | Any key you pass in context when calling the SDK | plan == free |
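
As an illustration of how a single condition might be checked, the sketch below parses an expression such as `>= 0.8` and compares it with one attribute of the call. The helper name `matches` and the convention that a bare value (for example `gpt-4o`) means equality are assumptions made for this example, not noburn's actual implementation.

```python
import operator

# Two-character operators come first so ">=" is not mistaken for ">".
_OPS = [
    (">=", operator.ge),
    ("<=", operator.le),
    ("==", operator.eq),
    ("!=", operator.ne),
    (">", operator.gt),
    ("<", operator.lt),
]


def matches(expr: str, actual) -> bool:
    """Return True if `actual` satisfies `expr` (hypothetical helper).

    `expr` is a string such as ">= 0.8", "< 0.4" or "gpt-4o".
    A bare value with no operator is treated as an equality check.
    """
    expr = expr.strip()
    for symbol, compare in _OPS:
        if expr.startswith(symbol):
            expected = expr[len(symbol):].strip()
            break
    else:
        compare, expected = operator.eq, expr

    if isinstance(actual, (int, float)) and not isinstance(actual, bool):
        # Numeric keys such as utilization and complexity.
        try:
            return compare(actual, float(expected))
        except ValueError:
            return False
    # String keys such as model, domain, or custom context values.
    return compare(str(actual), expected)
```

With this helper, `matches(">= 0.8", 0.85)` holds while `matches("gpt-4o", "gpt-4o-mini")` does not, because string values are compared exactly.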

Configuring rules in the dashboard

Go to your project → Policy rules → Add rule.

The form guides you through each condition key with appropriate inputs — no manual typing of model names or operator syntax.

Rules take effect immediately after saving. baar-core fetches the latest rules on BAARRouter initialization.

Example rules

Force the cheapest model once 80% of the budget is spent:

| When | Then |
| --- | --- |
| utilization >= 0.8 | Force cheap model |

Block all calls to GPT-4o:

| When | Then |
| --- | --- |
| model == gpt-4o | Block call |

Then add a second rule below it so that any other call from a free-plan user is forced to the cheapest model:

| When | Then |
| --- | --- |
| plan == free | Force cheap model |

Route creative tasks to a powerful model:

| When | Then |
| --- | --- |
| domain == creative | Force powerful model |

Passing custom context

To match on custom keys like plan, pass a context dict when calling the SDK:

Python:

result = guard.check(
    model="gpt-4o",
    estimated_tokens_in=1000,
    estimated_tokens_out=300,
    end_user_id="user_abc123",
    context={"plan": "free"},   # matched against policy rules
)

JavaScript / TypeScript:

const result = await guard.check({
  model: 'gpt-4o',
  estimatedTokensIn: 1000,
  estimatedTokensOut: 300,
  endUserId: 'user_abc123',
  context: { plan: 'free' },   // matched against policy rules
});

The context object can contain any key-value pairs. Keys must match exactly what you've configured in the dashboard.

SDK fetch endpoint

If you're using baar-core directly, it fetches your project's rules on BAARRouter initialization:

GET /api/v1/policy?project_id=<your-project-id>
Authorization: Bearer sk-nb-xxxxxxxxxxxxxxxx

Response:

{
  "rules": [
    { "when": { "utilization": ">= 0.8" }, "then": "force_small" },
    { "when": { "model": "gpt-4o" },       "then": "block" }
  ]
}

Only enabled rules are returned, ordered by position.
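
If you want to inspect the rules yourself, the request above could be issued with Python's standard library, roughly as sketched here. The API host is not stated on this page, so `BASE_URL` is a placeholder you would need to replace, and the function names `build_policy_request` and `fetch_rules` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Placeholder: this page does not state the API host.
BASE_URL = "https://api.example.invalid"


def build_policy_request(project_id: str, api_key: str) -> urllib.request.Request:
    """Build the GET /api/v1/policy request with a Bearer token."""
    query = urllib.parse.urlencode({"project_id": project_id})
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/policy?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def fetch_rules(project_id: str, api_key: str, timeout: float = 10.0) -> list:
    """Fetch the enabled rules, which the API returns ordered by position."""
    request = build_policy_request(project_id, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload["rules"]
```

Building the request separately from sending it keeps the URL and header logic easy to check without a network call.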

Rule evaluation order

Rules are evaluated in the order they appear in the dashboard (position 1, 2, 3…). You can reorder them by updating positions — first match wins, so order matters.

A common pattern: put the most specific rules first (e.g. model == gpt-4o AND plan == enterprise → allow) before broader rules (e.g. model == gpt-4o → block).
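
The effect of ordering can be sketched in a few lines. This is a simplified model, not the gateway's real evaluator: conditions are exact-equality checks, every key in `when` must match (logical AND), and both the multi-key `when` shape and the name `first_match` are assumptions for illustration.

```python
def first_match(rules, call):
    """Return the action of the first rule whose conditions all hold.

    Returns None when no rule matches, meaning the call proceeds normally.
    """
    for rule in rules:
        if all(call.get(key) == value for key, value in rule["when"].items()):
            return rule["then"]
    return None


# Specific rule first, broad rule second (the recommended order).
specific_first = [
    {"when": {"model": "gpt-4o", "plan": "enterprise"}, "then": "allow"},
    {"when": {"model": "gpt-4o"}, "then": "block"},
]
# Same rules with the broad block rule moved to position 1.
broad_first = list(reversed(specific_first))

enterprise_call = {"model": "gpt-4o", "plan": "enterprise"}
free_call = {"model": "gpt-4o", "plan": "free"}

print(first_match(specific_first, enterprise_call))  # allow
print(first_match(specific_first, free_call))        # block
print(first_match(broad_first, enterprise_call))     # block
```

In the last call the broad block rule sits at position 1, so the enterprise exemption below it is never reached.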
