Policy Rules
Configure per-project LLM call governance rules from the dashboard — no redeploy needed.
Policy rules let you control LLM call behavior at the gateway level. Rules are configured in the noburn dashboard and evaluated before each call — without touching your application code or triggering a deployment.
How rules work
Rules are evaluated top-to-bottom. The first rule that matches the current call wins. If no rule matches, the call proceeds normally.
Each rule has two parts:
- When — a condition that must be true for the rule to apply
- Then — what to do when the condition matches
Available actions
| Action | What happens |
|---|---|
| block | Call is rejected. No tokens sent. Returns blocked: true. |
| force_small | Router is forced to use the cheapest available model |
| force_big | Router is forced to use the most capable available model |
| allow | Call is explicitly allowed (use to override an earlier block rule) |
Condition keys
| Key | Type | Description | Example |
|---|---|---|---|
| utilization | 0.0 – 1.0 | Fraction of monthly budget already spent | >= 0.8 |
| complexity | 0.0 – 1.0 | Prompt complexity score from the router | < 0.4 |
| model | string | The model the router selected | gpt-4o |
| domain | string | Routing domain: code, math, analysis, creative, general | creative |
| custom | any | Any key you pass in context when calling the SDK | plan == free |
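As a rough illustration of how condition values like `>= 0.8`, `< 0.4`, or `gpt-4o` could be matched against a call, here is a minimal sketch. The helper name `condition_matches` and the operator grammar are assumptions made for this example; the actual matching logic inside noburn or baar-core is not documented here and may differ.

```python
import operator
import re

# Assumed operator grammar: the docs only show ">=", "<", "==" and bare values.
_NUMERIC_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}
_PATTERN = re.compile(r"^\s*(>=|<=|==|>|<)\s*(.+?)\s*$")


def condition_matches(expected: str, actual) -> bool:
    """Hypothetical helper: does `actual` satisfy the condition string `expected`?"""
    m = _PATTERN.match(expected)
    if m is None:
        # Bare value such as "gpt-4o" or "creative": exact string match.
        return str(actual) == expected.strip()
    op, raw = m.group(1), m.group(2)
    if op == "==":
        return str(actual) == raw
    try:
        return _NUMERIC_OPS[op](float(actual), float(raw))
    except (TypeError, ValueError):
        # Non-numeric value under a numeric operator never matches.
        return False


print(condition_matches(">= 0.8", 0.85))       # utilization above the threshold
print(condition_matches("gpt-4o", "gpt-4o"))   # exact model match
print(condition_matches("== free", "free"))    # custom context key
```

Under these assumptions, numeric keys (utilization, complexity) use comparison operators, while string keys (model, domain, custom keys) are compared exactly.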
Configuring rules in the dashboard
Go to your project → Policy rules → Add rule.
The form guides you through each condition key with appropriate inputs — no manual typing of model names or operator syntax.
Rules take effect immediately after saving. baar-core fetches the latest rules on BAARRouter initialization.
Example rules
Force the cheapest model once 80% of the budget is spent:
| When | | Then |
|---|---|---|
| utilization >= 0.8 | → | Force cheap model |
Block all calls to GPT-4o:
| When | | Then |
|---|---|---|
| model gpt-4o | → | Block call |
Then add a second rule that forces the cheapest available model for free-plan users:
| When | | Then |
|---|---|---|
| plan free | → | Force cheap model |
Route creative tasks to a powerful model:
| When | | Then |
|---|---|---|
| domain creative | → | Force powerful model |
Passing custom context
To match on custom keys like plan, pass a context dict when calling the SDK:
```python
result = guard.check(
    model="gpt-4o",
    estimated_tokens_in=1000,
    estimated_tokens_out=300,
    end_user_id="user_abc123",
    context={"plan": "free"},  # matched against policy rules
)
```
```typescript
const result = await guard.check({
  model: 'gpt-4o',
  estimatedTokensIn: 1000,
  estimatedTokensOut: 300,
  endUserId: 'user_abc123',
  context: { plan: 'free' }, // matched against policy rules
});
```
The context object can contain any key-value pairs. Keys must match exactly what you've configured in the dashboard.
SDK fetch endpoint
If you're using baar-core directly, it fetches your project's rules on BAARRouter initialization:
```http
GET /api/v1/policy?project_id=<your-project-id>
Authorization: Bearer sk-nb-xxxxxxxxxxxxxxxx
```
Response:
```json
{
  "rules": [
    { "when": { "utilization": ">= 0.8" }, "then": "force_small" },
    { "when": { "model": "gpt-4o" }, "then": "block" }
  ]
}
```
Only enabled rules are returned, ordered by position.
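If you want to inspect the rules yourself, the request above can be sketched with Python's standard library. The base URL below is a placeholder and the function names are hypothetical; only the path, query parameter, header, and response shape come from this page.

```python
import json
import urllib.parse
import urllib.request

# Placeholder host: replace with the base URL of your noburn deployment.
BASE_URL = "https://noburn.example.com"


def build_policy_request(project_id: str, api_key: str) -> urllib.request.Request:
    """Build the GET request for a project's policy rules (hypothetical helper)."""
    query = urllib.parse.urlencode({"project_id": project_id})
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/policy?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def parse_rules(body) -> list:
    """Extract the ordered rule list from the JSON response body."""
    return json.loads(body)["rules"]


def fetch_rules(project_id: str, api_key: str) -> list:
    """Send the request and return the rules; performs a real network call."""
    with urllib.request.urlopen(build_policy_request(project_id, api_key)) as resp:
        return parse_rules(resp.read())
```

Because the response is already ordered by position, the returned list can be evaluated front to back without re-sorting.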
Rule evaluation order
Rules are evaluated in the order they appear in the dashboard (position 1, 2, 3…). You can reorder them by updating positions — first match wins, so order matters.
A common pattern: put the most specific rules first (e.g. model == gpt-4o AND plan == enterprise → allow) before broader rules (e.g. model == gpt-4o → block).
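To see why order matters, here is a small first-match-wins sketch using the two rules from the pattern above. It assumes that a rule with several keys in `when` requires all of them to match (AND) and uses plain equality only; `first_match` is a hypothetical helper, not part of the SDK.

```python
def first_match(rules, call):
    """Return the action of the first rule whose conditions all equal the call's values."""
    for rule in rules:
        # Assumption: every key in "when" must match (logical AND).
        if all(call.get(key) == value for key, value in rule["when"].items()):
            return rule["then"]
    return None  # no rule matched: the call proceeds normally


specific = {"when": {"model": "gpt-4o", "plan": "enterprise"}, "then": "allow"}
broad = {"when": {"model": "gpt-4o"}, "then": "block"}

enterprise_call = {"model": "gpt-4o", "plan": "enterprise"}
free_call = {"model": "gpt-4o", "plan": "free"}

# Specific rule first: enterprise users are allowed, everyone else is blocked.
print(first_match([specific, broad], enterprise_call))
print(first_match([specific, broad], free_call))

# Broad rule first: it shadows the specific rule, so enterprise is blocked too.
print(first_match([broad, specific], enterprise_call))
```

With the broad rule in position 1, the specific rule can never fire, which is why the more specific rule belongs higher in the list.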