Budgets
Budgets are spending ceilings enforced by the proxy. When the estimated cost of a request would push spend over the limit, the proxy returns `429` — the reque
Budgets are spending ceilings enforced by the proxy. When the estimated cost of a request would push spend over the limit, the proxy returns 429 — the request never reaches the provider and you are not charged.
For a task-oriented setup guide, see Budget Configuration.
How Budgets Work
Request arrives
│
├─ 1. Estimate cost (input tokens + max output tokens × 1.1 safety margin)
│
├─ 2. Period reset ────────── due? ───────► Reset spend to 0, start new period
│
├─ 3. Session limit check ─── exceeds? ──► 429 session_limit_exceeded
│
├─ 4. Velocity check ──────── tripped? ───► 429 velocity_exceeded + Retry-After
│
├─ 5. Budget check ────────── exceeds? ──► 429 budget_exceeded
│
├─ 6. Reserve estimated cost (30s TTL)
│
├─ 7. Forward request to provider
│
├─ 8. Receive response, calculate actual cost
│
└─ 9. Reconcile: apply actual cost, release reservationBudget enforcement uses a Cloudflare Durable Object with embedded SQLite. All checks and mutations are serialized — no race conditions, even under concurrent load.
Budget Entity Types
| Entity Type | What It Scopes To |
|---|---|
user | All requests from a user account (across all their API keys) |
api_key | Requests from a specific API key |
tag | Requests carrying a specific tag key-value pair |
A single request can match multiple budgets (e.g., a user budget + an API key budget + a tag budget). All matching budgets must have sufficient remaining balance for the request to proceed.
Configuration
| Field | Type | Description |
|---|---|---|
maxBudgetMicrodollars | integer (required) | Spending ceiling in microdollars. $50 = 50,000,000 |
resetInterval | string or null | "daily", "weekly", "monthly", or null (no reset — manual only) |
thresholdPercentages | integer[] | Webhook alert thresholds. Default: [50, 80, 90, 95]. Max 10 values, must be ascending, each 1–100. |
velocityLimitMicrodollars | integer or null | Max spend per velocity window. See Velocity Limits. |
velocityWindowSeconds | integer | Sliding window size. Range: 10–3600. Default: 60. |
velocityCooldownSeconds | integer | Block duration after velocity trip. Range: 10–3600. Default: 60. |
sessionLimitMicrodollars | integer or null | Per-session spending cap. See Session Limits. |
Enforcement Lifecycle
The proxy checks budgets in this exact order. A denial at any step stops the pipeline — later steps are not evaluated.
1. Period Reset
If the budget has a resetInterval and the current period has elapsed, spend resets to 0 and a new period starts. A budget.reset webhook fires.
2. Session Limit Check
If the budget has a sessionLimitMicrodollars and the request includes an X-NullSpend-Session header, the proxy checks cumulative spend for that session. If currentSessionSpend + estimatedCost > sessionLimit, the request is denied.
3. Velocity Check (Circuit Breaker)
If the budget has a velocityLimitMicrodollars, the proxy checks spend within the sliding window. The velocity check uses a circuit breaker pattern:
- Closed (normal): requests pass through, velocity spend is tracked
- Open (tripped): all requests are denied until the cooldown expires
- Recovery: after cooldown, the breaker resets and a
velocity.recoveredwebhook fires
If estimatedSpend + estimate > velocityLimit, the breaker trips.
4. Budget Exhaustion Check
If currentSpend + reservations + estimatedCost > maxBudget, the request is denied. Only budgets with strict_block policy deny requests (this is the default).
5. Reservation
The estimated cost is reserved for 30 seconds. Reservations prevent concurrent requests from collectively exceeding the budget. If a reservation expires (upstream timeout, crash), it is automatically cleaned up.
6. Reconciliation
After the provider responds, the proxy calculates the actual cost and reconciles:
- Adds actual cost to cumulative spend
- Removes the reservation
- Adjusts session spend by
actualCost - estimatedCost
429 Response Bodies
Budget Exceeded
{
"error": {
"code": "budget_exceeded",
"message": "Request blocked: estimated cost exceeds remaining budget",
"details": null
}
}Velocity Exceeded
{
"error": {
"code": "velocity_exceeded",
"message": "Request blocked: spending rate exceeds velocity limit. Retry after cooldown.",
"details": {
"limitMicrodollars": 10000000,
"windowSeconds": 60,
"currentMicrodollars": 9500000
}
}
}The response includes a Retry-After header with the cooldown duration in seconds.
Session Limit Exceeded
{
"error": {
"code": "session_limit_exceeded",
"message": "Request blocked: session spend exceeds session limit. Start a new session.",
"details": {
"session_id": "conv_abc123",
"session_spend_microdollars": 4800000,
"session_limit_microdollars": 5000000
}
}
}No Retry-After header — the session is done. Start a new session (new X-NullSpend-Session value) to continue.
Tag Budget Exceeded
{
"error": {
"code": "tag_budget_exceeded",
"message": "Request blocked: tag budget exceeded",
"details": {
"tag_key": "team",
"tag_value": "billing",
"budget_limit_microdollars": 50000000,
"budget_spend_microdollars": 49500000
}
}
}Velocity Limits
Velocity limits catch runaway loops — an agent stuck in a retry cycle can burn through a budget in seconds.
How it works:
- The proxy tracks spend within a sliding window (e.g., $10 in 60 seconds)
- When spend exceeds the limit, a circuit breaker trips
- All requests are blocked for the cooldown period
- After cooldown, the breaker resets and requests resume
- A
velocity.recoveredwebhook fires on recovery
Configuration:
| Field | Range | Default | Description |
|---|---|---|---|
velocityLimitMicrodollars | > 0 | null (disabled) | Max spend per window |
velocityWindowSeconds | 10–3600 | 60 | Sliding window size |
velocityCooldownSeconds | 10–3600 | 60 | Block duration after trip |
Example: $10 velocity limit with 60s window and 60s cooldown means: if your agents spend more than $10 within any 60-second sliding window, all requests are blocked for 60 seconds.
For the full reference — sliding window algorithm, circuit breaker states, and webhook payloads — see Velocity Limits.
Session Limits
Session limits cap how much a single agent conversation can spend, regardless of the overall budget.
How it works:
- Your agent sets
X-NullSpend-Session: conv_abc123on each request - The proxy tracks cumulative spend per session ID
- When a session's spend exceeds the limit, the request is blocked
- The agent should start a new session (new ID) to continue
Key behaviors:
- No header = no enforcement. Session limits only apply when
X-NullSpend-Sessionis present. - Client-defined sessions. The proxy does not manage session lifecycle — your agent decides when to start a new session.
- Independent of budget resets. Session spend does NOT reset when the budget period resets.
- Always strict. Session limits are hard caps regardless of the budget policy.
- 24-hour cleanup. Stale session data is automatically cleaned up after 24 hours of inactivity.
For the full reference — session tracking internals, header usage, and webhook payloads — see Session Limits.
Threshold Alerts
When spend crosses a threshold percentage, a webhook fires:
- Thresholds ≥ 90% fire as
budget.threshold.critical - Thresholds < 90% fire as
budget.threshold.warning
Default thresholds are [50, 80, 90, 95]. Customize per budget with up to 10 values (ascending, each 1–100).
See Webhook Event Types for payload details.
Creating a Budget
Dashboard
- Go to Budgets → Set Budget
- Choose entity (your account or a specific API key)
- Set the spending ceiling
- Optionally configure reset interval, velocity limits, session limits, and alert thresholds
- Click Set Budget — takes effect immediately
API
Budget creation and management uses session authentication (dashboard). See the Budgets API for full endpoint documentation.
# Requires dashboard session cookie
curl -X POST "https://nullspend.dev/api/budgets" \
-H "Cookie: session=..." \
-H "Content-Type: application/json" \
-d '{
"entityType": "api_key",
"entityId": "ns_key_11223344-5566-7788-99aa-bbccddeeff00",
"maxBudgetMicrodollars": 50000000,
"resetInterval": "monthly",
"velocityLimitMicrodollars": 10000000,
"velocityWindowSeconds": 60,
"velocityCooldownSeconds": 60,
"sessionLimitMicrodollars": 5000000
}'To check budget status programmatically (with an API key), use GET /api/budgets/status.
Best Practices
- Start generous, tighten later. Set initial budgets higher than expected. Once you have cost data, tighten with confidence.
- One budget per concern. Separate API keys (and budgets) for different agents, environments, or teams.
- Use session limits for multi-step agents. Cap each task's cost so a single stuck agent can't consume the entire budget.
- Monitor before enforcing. Use the analytics dashboard to understand spending patterns before setting tight ceilings.
- Combine velocity + session limits. Velocity catches sudden spikes; session limits catch slow accumulation over a long conversation.
Related
- Budget Configuration Guide — step-by-step setup walkthrough
- Cost Tracking — how costs are calculated and recorded
- Tags — tag-based cost attribution and tag budgets
- Webhook Event Types — budget.exceeded, velocity.exceeded, threshold alerts
- Error Reference — all 429 error codes and response shapes
Cost Attribution
Break down AI spend by customer, team, feature, or any dimension. Answer 'how much does each customer cost me?' with per-key and per-tag grouping.
Velocity Limits
Velocity limits catch runaway loops — an agent stuck in a retry cycle can burn through a budget in seconds before a human can react. They add a spending-rate