NullSpend Docs

Budgets

Budgets are spending ceilings enforced by the proxy. When the estimated cost of a request would push spend over the limit, the proxy returns 429 — the request never reaches the provider and you are not charged.

For a task-oriented setup guide, see Budget Configuration.

How Budgets Work

Request arrives

    ├─ 1. Estimate cost (input tokens + max output tokens × 1.1 safety margin)

    ├─ 2. Period reset ────────── due? ───────► Reset spend to 0, start new period

    ├─ 3. Session limit check ─── exceeds? ──► 429 session_limit_exceeded

    ├─ 4. Velocity check ──────── tripped? ───► 429 velocity_exceeded + Retry-After

    ├─ 4.5 Finalization reserve ── finalize header + in zone? ► unlock reserve

    ├─ 5. Budget check ────────── exceeds? ──► 429 budget_exceeded

    ├─ 6. Reserve estimated cost (30s TTL)

    ├─ 7. Forward request to provider

    ├─ 8. Receive response, calculate actual cost

    └─ 9. Reconcile: apply actual cost, release reservation

Budget enforcement uses a Cloudflare Durable Object with embedded SQLite. All checks and mutations are serialized — no race conditions, even under concurrent load.
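Step 1 of the diagram can be sketched as below. This is a minimal illustration, not the proxy's implementation: the function name and the per-million-token price parameters are hypothetical, and applying the 1.1 margin to the combined input and output cost is an assumption, since the diagram does not say whether the margin covers the whole estimate or only the output portion.

```python
def estimate_cost_microdollars(
    input_tokens: int,
    max_output_tokens: int,
    input_price_per_mtok: int,   # hypothetical: microdollars per 1M input tokens
    output_price_per_mtok: int,  # hypothetical: microdollars per 1M output tokens
    margin_percent: int = 110,   # the 1.1 safety margin, kept as an integer
) -> int:
    """Worst-case cost estimate in microdollars, rounded up."""
    raw = input_tokens * input_price_per_mtok + max_output_tokens * output_price_per_mtok
    numerator = raw * margin_percent
    denominator = 1_000_000 * 100
    return -(-numerator // denominator)  # ceiling division without floats


# 1,000 input tokens at $3/M and up to 500 output tokens at $15/M (made-up prices)
estimate = estimate_cost_microdollars(1_000, 500, 3_000_000, 15_000_000)
print(estimate)  # 11550 microdollars, about $0.0116
```

Integer arithmetic is used here so the estimate never picks up floating-point rounding noise; rounding up keeps the estimate on the conservative side.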

Budget Entity Types

| Entity Type | What It Scopes To |
| --- | --- |
| user | All requests from a user account (across all their API keys) |
| api_key | Requests from a specific API key |
| tag | Requests carrying a specific tag key-value pair (e.g., env=production) |
| customer | Requests carrying a specific customer ID via the X-NullSpend-Customer header |

A single request can match multiple budgets (e.g., a user budget + an API key budget + a customer budget). All matching budgets must have sufficient remaining balance for the request to proceed.
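The "all matching budgets must pass" rule can be sketched as a loop that returns the first budget unable to cover the estimate. The `Budget` dataclass, the `first_blocking_budget` helper, and the entity IDs are hypothetical names for illustration, and the sketch assumes every budget uses the strict_block policy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Budget:
    entity_type: str            # "user", "api_key", "tag", or "customer"
    entity_id: str              # placeholder IDs, not real NullSpend identifiers
    max_budget_microdollars: int
    spend_microdollars: int
    reserved_microdollars: int = 0


def first_blocking_budget(budgets: list[Budget], estimated_cost: int) -> Optional[Budget]:
    """Return the first budget that cannot cover the estimate, or None if all can."""
    for budget in budgets:
        committed = budget.spend_microdollars + budget.reserved_microdollars
        if committed + estimated_cost > budget.max_budget_microdollars:
            return budget
    return None


matching = [
    Budget("user", "user-1", 100_000_000, 20_000_000),
    Budget("api_key", "key-1", 50_000_000, 49_000_000),
]
blocker = first_blocking_budget(matching, estimated_cost=2_000_000)
print(blocker.entity_type if blocker else "allowed")  # api_key
```

Here the user budget has plenty of room, but the API key budget has only $1 left, so a $2 estimate is blocked by the tighter budget.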

Configuration

| Field | Type | Description |
| --- | --- | --- |
| maxBudgetMicrodollars | integer (required) | Spending ceiling in microdollars. $50 = 50,000,000 |
| policy | string | Enforcement policy: "strict_block" (default, denies requests), "soft_block" (logs but allows), or "warn" (tracks only). |
| resetInterval | string or null | "daily", "weekly", "monthly", or null (no reset; manual only) |
| thresholdPercentages | integer[] | Webhook alert thresholds. Default: [50, 80, 90, 95]. Max 10 values, must be ascending, each 1–100. |
| velocityLimitMicrodollars | integer or null | Max spend per velocity window. See Velocity Limits. |
| velocityWindowSeconds | integer | Sliding window size. Range: 10–3600. Default: 60. |
| velocityCooldownSeconds | integer | Block duration after velocity trip. Range: 10–3600. Default: 60. |
| sessionLimitMicrodollars | integer or null | Per-session spending cap. See Session Limits. |
| finalizationReserveMicrodollars | integer or null | Portion of budget held back for graceful agent shutdown. When set, normal requests are denied when they'd eat into the reserve. Requests with X-NullSpend-Finalize: 1 can use the reserve once spend reaches the reserve zone. See Finalization Reserve. |
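A client-side pre-flight check for the constraints in this table might look like the sketch below. Both helper names are hypothetical, the server remains the source of truth for validation, and "ascending" is assumed here to mean strictly ascending.

```python
from decimal import Decimal


def usd_to_microdollars(usd: str) -> int:
    """Convert a dollar amount given as a string (e.g. "50" or "0.25") to microdollars."""
    return int(Decimal(usd) * 1_000_000)


def validate_budget_config(config: dict) -> list[str]:
    """Return a list of problems; an empty list means these checks passed."""
    errors = []
    if not isinstance(config.get("maxBudgetMicrodollars"), int):
        errors.append("maxBudgetMicrodollars is required and must be an integer")
    if config.get("policy", "strict_block") not in ("strict_block", "soft_block", "warn"):
        errors.append("policy must be strict_block, soft_block, or warn")
    if config.get("resetInterval") not in (None, "daily", "weekly", "monthly"):
        errors.append("resetInterval must be daily, weekly, monthly, or null")
    thresholds = config.get("thresholdPercentages", [50, 80, 90, 95])
    if len(thresholds) > 10:
        errors.append("thresholdPercentages allows at most 10 values")
    if any(not 1 <= t <= 100 for t in thresholds):
        errors.append("each threshold must be between 1 and 100")
    if any(a >= b for a, b in zip(thresholds, thresholds[1:])):
        errors.append("thresholdPercentages must be ascending")  # assumed strict
    for field in ("velocityWindowSeconds", "velocityCooldownSeconds"):
        if not 10 <= config.get(field, 60) <= 3600:
            errors.append(f"{field} must be between 10 and 3600")
    return errors


config = {
    "maxBudgetMicrodollars": usd_to_microdollars("50"),
    "resetInterval": "monthly",
    "velocityWindowSeconds": 60,
}
print(validate_budget_config(config))  # []
```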

Enforcement Lifecycle

The proxy checks budgets in this exact order. A denial at any step stops the pipeline — later steps are not evaluated.

1. Period Reset

If the budget has a resetInterval and the current period has elapsed, spend resets to 0 and a new period starts. A budget.reset webhook fires.

2. Session Limit Check

If the budget has a sessionLimitMicrodollars and the request includes an X-NullSpend-Session header, the proxy checks cumulative spend for that session. If currentSessionSpend + estimatedCost > sessionLimit, the request is denied.

3. Velocity Check (Circuit Breaker)

If the budget has a velocityLimitMicrodollars, the proxy checks spend within the sliding window. The velocity check uses a circuit breaker pattern:

  • Closed (normal): requests pass through, velocity spend is tracked
  • Open (tripped): all requests are denied until the cooldown expires
  • Recovery: after cooldown, the breaker resets and a velocity.recovered webhook fires

If the spend already recorded in the current window plus the estimated cost of the incoming request exceeds velocityLimitMicrodollars, the breaker trips.
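The breaker states can be sketched as a small class. This is an illustration under assumptions, not the proxy's algorithm: the class and method names are hypothetical, time is passed in explicitly so the behavior is deterministic, and window history is left to age out on its own after recovery rather than being cleared.

```python
import math
from collections import deque


class VelocityBreaker:
    def __init__(self, limit: int, window_seconds: int = 60, cooldown_seconds: int = 60):
        self.limit = limit
        self.window = window_seconds
        self.cooldown = cooldown_seconds
        self.events = deque()    # (timestamp, cost) pairs inside the window
        self.open_until = None   # None means the breaker is closed

    def check(self, estimate: int, now: float) -> tuple[bool, int]:
        """Return (allowed, retry_after_seconds)."""
        if self.open_until is not None:
            if now < self.open_until:
                return False, math.ceil(self.open_until - now)
            self.open_until = None  # recovery: velocity.recovered would fire here
        while self.events and self.events[0][0] <= now - self.window:
            self.events.popleft()   # evict spend that slid out of the window
        window_spend = sum(cost for _, cost in self.events)
        if window_spend + estimate > self.limit:
            self.open_until = now + self.cooldown  # trip the breaker
            return False, self.cooldown
        return True, 0

    def record(self, cost: int, now: float) -> None:
        self.events.append((now, cost))


breaker = VelocityBreaker(limit=10_000_000)
breaker.record(6_000_000, now=0)
print(breaker.check(5_000_000, now=10))  # (False, 60): 6M + 5M exceeds 10M, trips
print(breaker.check(5_000_000, now=30))  # (False, 40): still cooling down
print(breaker.check(5_000_000, now=70))  # (True, 0): recovered, window is empty
```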

4. Budget Exhaustion Check

If currentSpend + reservations + estimatedCost > maxBudget, the request is denied. Only budgets with strict_block policy deny requests (this is the default).

5. Reservation

The estimated cost is reserved for 30 seconds. Reservations prevent concurrent requests from collectively exceeding the budget. If a reservation expires (upstream timeout, crash), it is automatically cleaned up.

6. Reconciliation

After the provider responds, the proxy calculates the actual cost and reconciles:

  • Adds actual cost to cumulative spend
  • Removes the reservation
  • Adjusts session spend by actualCost - estimatedCost
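Steps 4 through 6 can be sketched together as an in-memory ledger. The `BudgetLedger` class is hypothetical; the real enforcement runs inside a Durable Object, and this sketch only mirrors the arithmetic described above (budget check, 30-second reservation, reconciliation). Session spend adjustment is left out for brevity.

```python
import itertools
from typing import Optional


class BudgetLedger:
    RESERVATION_TTL_SECONDS = 30

    def __init__(self, max_budget: int):
        self.max_budget = max_budget
        self.spend = 0
        self.reservations: dict[int, tuple[int, float]] = {}  # id -> (amount, expires_at)
        self._ids = itertools.count(1)

    def reserved(self, now: float) -> int:
        """Drop expired reservations, then total the live ones."""
        self.reservations = {rid: r for rid, r in self.reservations.items() if r[1] > now}
        return sum(amount for amount, _ in self.reservations.values())

    def try_reserve(self, estimate: int, now: float) -> Optional[int]:
        if self.spend + self.reserved(now) + estimate > self.max_budget:
            return None  # the proxy would answer 429 budget_exceeded
        rid = next(self._ids)
        self.reservations[rid] = (estimate, now + self.RESERVATION_TTL_SECONDS)
        return rid

    def reconcile(self, rid: int, actual_cost: int) -> None:
        self.reservations.pop(rid, None)  # release the reservation
        self.spend += actual_cost         # apply the actual cost


ledger = BudgetLedger(max_budget=100)
first = ledger.try_reserve(60, now=0)
second = ledger.try_reserve(60, now=1)   # denied: 0 + 60 + 60 > 100
ledger.reconcile(first, actual_cost=40)
third = ledger.try_reserve(60, now=2)    # allowed: 40 + 0 + 60 = 100
print(first, second, third, ledger.spend)  # 1 None 2 40
```

The denied second request shows why reservations matter: without them, two concurrent requests could each pass the check and together overshoot the ceiling.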

429 Response Bodies

Budget Exceeded

{
  "error": {
    "code": "budget_exceeded",
    "message": "Request blocked: estimated cost exceeds remaining budget.",
    "details": {
      "entity_type": "user",
      "entity_id": "ns_usr_...",
      "budget_limit_microdollars": 100000000,
      "budget_spend_microdollars": 95000000,
      "estimated_cost_microdollars": 8000000,
      "finalization_reserve_microdollars": 10000000,
      "finalization_remaining_microdollars": 0
    }
  }
}

The finalization_reserve_microdollars and finalization_remaining_microdollars fields are only present when the budget has a finalization reserve configured. finalization_remaining_microdollars is the budget remaining after subtracting both spend and reserve; in the example above the raw difference would be negative (100,000,000 - 95,000,000 - 10,000,000), and the field reports 0.

Velocity Exceeded

{
  "error": {
    "code": "velocity_exceeded",
    "message": "Request blocked: spending rate exceeds velocity limit. Retry after cooldown.",
    "details": {
      "limitMicrodollars": 10000000,
      "windowSeconds": 60,
      "currentMicrodollars": 9500000
    }
  }
}

The response includes a Retry-After header with the cooldown duration in seconds.

Session Limit Exceeded

{
  "error": {
    "code": "session_limit_exceeded",
    "message": "Request blocked: session spend exceeds session limit. Start a new session.",
    "details": {
      "session_id": "conv_abc123",
      "session_spend_microdollars": 4800000,
      "session_limit_microdollars": 5000000
    }
  }
}

No Retry-After header — the session is done. Start a new session (new X-NullSpend-Session value) to continue.

Tag Budget Exceeded

{
  "error": {
    "code": "tag_budget_exceeded",
    "message": "Request blocked: tag budget exceeded",
    "details": {
      "tag_key": "team",
      "tag_value": "billing",
      "budget_limit_microdollars": 50000000,
      "budget_spend_microdollars": 49500000
    }
  }
}
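A client can branch on error.code to decide what to do next. The sketch below maps the four response bodies shown above to an action; the function name and the action labels are hypothetical, and falling back to 60 seconds when Retry-After is missing is an assumption based on the default cooldown.

```python
def plan_after_429(headers: dict, body: dict) -> tuple[str, int]:
    """Map a 429 response to (action, wait_seconds). Action names are illustrative."""
    code = body.get("error", {}).get("code")
    normalized = {key.lower(): value for key, value in headers.items()}
    if code == "velocity_exceeded":
        # Retry-After carries the cooldown in seconds
        return "retry_later", int(normalized.get("retry-after", "60"))
    if code == "session_limit_exceeded":
        # No Retry-After: continue under a new X-NullSpend-Session value
        return "new_session", 0
    if code in ("budget_exceeded", "tag_budget_exceeded"):
        # Retrying will not help until the budget resets or is raised
        return "stop", 0
    return "unknown", 0


body = {"error": {"code": "velocity_exceeded", "details": {"windowSeconds": 60}}}
print(plan_after_429({"Retry-After": "45"}, body))  # ('retry_later', 45)
```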

Velocity Limits

Velocity limits catch runaway loops — an agent stuck in a retry cycle can burn through a budget in seconds.

How it works:

  1. The proxy tracks spend within a sliding window (e.g., $10 in 60 seconds)
  2. When spend exceeds the limit, a circuit breaker trips
  3. All requests are blocked for the cooldown period
  4. After cooldown, the breaker resets and requests resume
  5. A velocity.recovered webhook fires on recovery

Configuration:

| Field | Range | Default | Description |
| --- | --- | --- | --- |
| velocityLimitMicrodollars | > 0 | null (disabled) | Max spend per window |
| velocityWindowSeconds | 10–3600 | 60 | Sliding window size |
| velocityCooldownSeconds | 10–3600 | 60 | Block duration after trip |

Example: a $10 velocity limit with a 60s window and a 60s cooldown means that if your agents spend more than $10 within any 60-second sliding window, all requests are blocked for 60 seconds.

For the full reference — sliding window algorithm, circuit breaker states, and webhook payloads — see Velocity Limits.

Session Limits

Session limits cap how much a single agent conversation can spend, regardless of the overall budget.

How it works:

  1. Your agent sets X-NullSpend-Session: conv_abc123 on each request
  2. The proxy tracks cumulative spend per session ID
  3. When a session's spend exceeds the limit, the request is blocked
  4. The agent should start a new session (new ID) to continue

Key behaviors:

  • No header = no enforcement. Session limits only apply when X-NullSpend-Session is present.
  • Client-defined sessions. The proxy does not manage session lifecycle — your agent decides when to start a new session.
  • Independent of budget resets. Session spend does NOT reset when the budget period resets.
  • Always strict. Session limits are hard caps regardless of the budget policy.
  • 24-hour cleanup. Stale session data is automatically cleaned up after 24 hours of inactivity.
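The per-session check and the "no header = no enforcement" behavior can be sketched as follows. `SessionLimiter` is a hypothetical name, and the 24-hour cleanup is omitted to keep the example short.

```python
from typing import Optional


class SessionLimiter:
    def __init__(self, session_limit_microdollars: int):
        self.limit = session_limit_microdollars
        self.spend_by_session: dict[str, int] = {}

    def allows(self, session_id: Optional[str], estimated_cost: int) -> bool:
        if session_id is None:
            return True  # no X-NullSpend-Session header: limit is not enforced
        current = self.spend_by_session.get(session_id, 0)
        return current + estimated_cost <= self.limit

    def record(self, session_id: Optional[str], cost: int) -> None:
        if session_id is not None:
            self.spend_by_session[session_id] = self.spend_by_session.get(session_id, 0) + cost


limiter = SessionLimiter(session_limit_microdollars=5_000_000)
limiter.record("conv_abc123", 4_800_000)
print(limiter.allows("conv_abc123", 300_000))  # False: 4.8M + 0.3M exceeds 5M
print(limiter.allows("conv_def456", 300_000))  # True: a new session starts at 0
print(limiter.allows(None, 300_000))           # True: header absent
```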

For the full reference — session tracking internals, header usage, and webhook payloads — see Session Limits.

Finalization Reserve

Without a reserve, an agent that reaches its budget limit is hard-killed mid-task with a 429. A finalization reserve holds back a configurable portion of the budget so the agent can finish gracefully.

How it works:

  1. You set a finalizationReserveMicrodollars on the budget (e.g., $5 reserve on a $100 budget)
  2. Normal requests are denied when spend + reservations + estimate > limit - reserve (effective limit = $95)
  3. When the agent detects it's near the wall (via response headers), it sets X-NullSpend-Finalize: 1 on its final request
  4. The proxy checks if the entity is in the "reserve zone" (spend + reservations >= limit - reserve). If yes, the reserve is unlocked for that request
  5. If the entity is NOT in the reserve zone, the finalize header is ignored (prevents premature reserve spending)

Key behaviors:

  • Server-enforced zone gate. The X-NullSpend-Finalize header only works when the entity has actually reached the reserve zone. Setting it on every request does nothing until you're near the limit.
  • Only applies to strict_block budgets. soft_block and warn budgets don't enforce the reserve (they don't enforce limits at all).
  • Response headers show remaining. Every response includes X-NullSpend-Budget-Effective-Remaining and X-NullSpend-Budget-Finalization-Reserve headers when a reserve is configured.
  • Requests-Remaining estimate. X-NullSpend-Budget-Requests-Remaining shows approximately how many more requests fit in the effective remaining, based on a rolling average of recent request costs.
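The zone gate described above reduces to a few comparisons. The function below is a sketch with a hypothetical name, using the $100 budget and $5 reserve from the steps above; it assumes a strict_block budget.

```python
def budget_allows(
    limit: int,
    reserve: int,
    spend: int,
    reservations: int,
    estimate: int,
    finalize: bool = False,
) -> bool:
    """Apply the finalization-reserve gate for a strict_block budget."""
    committed = spend + reservations
    in_reserve_zone = committed >= limit - reserve
    # The finalize header only unlocks the reserve once the entity is in the zone
    effective_limit = limit if (finalize and in_reserve_zone) else limit - reserve
    return committed + estimate <= effective_limit


LIMIT, RESERVE = 100_000_000, 5_000_000

# Normal request that would eat into the reserve: denied (96M > 95M)
print(budget_allows(LIMIT, RESERVE, 90_000_000, 0, 6_000_000))                 # False
# Same request with the header, but spend is not yet in the zone: header ignored
print(budget_allows(LIMIT, RESERVE, 90_000_000, 0, 6_000_000, finalize=True))  # False
# In the zone with the header: reserve unlocked (99M <= 100M)
print(budget_allows(LIMIT, RESERVE, 95_000_000, 0, 4_000_000, finalize=True))  # True
```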

Dashboard:

The budget form includes a collapsible "Finalization reserve" section. When set, the budget list shows a two-zone progress bar: green for normal spend and amber for the reserve zone. A shield icon indicates budgets with active reserves.

SDK support:

Both the TypeScript and Python SDKs support finalization reserve:

// TypeScript: finalize a request via the proxy
const trackedFetch = ns.createTrackedFetch("openai", {
  finalize: true,  // Injects X-NullSpend-Finalize: 1
});

# Python: finalize a request via the proxy
tracked = ns.create_tracked_client("openai", finalize=True)

The SDK's cooperative budget check also subtracts the reserve from remaining (for strict_block budgets only), and skips the subtraction when finalize: true.
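An agent that does not use the SDK could make the same decision from the response headers. The sketch below is hedged on several points: the function name is hypothetical, the header values are assumed to be plain numeric strings (the page does not specify their format), and "finalize when nothing remains" is one possible policy, not a documented one.

```python
def should_finalize(response_headers: dict) -> bool:
    """Decide whether the next request should carry X-NullSpend-Finalize: 1."""
    headers = {key.lower(): value for key, value in response_headers.items()}
    if "x-nullspend-budget-finalization-reserve" not in headers:
        return False  # no reserve configured, so there is nothing to unlock
    # Assumed format: numeric strings; missing headers are treated as "room left"
    effective_remaining = float(headers.get("x-nullspend-budget-effective-remaining", "1"))
    requests_remaining = float(headers.get("x-nullspend-budget-requests-remaining", "1"))
    return effective_remaining <= 0 or requests_remaining <= 0


near_wall = {
    "X-NullSpend-Budget-Finalization-Reserve": "5000000",
    "X-NullSpend-Budget-Effective-Remaining": "0",
    "X-NullSpend-Budget-Requests-Remaining": "0",
}
print(should_finalize(near_wall))  # True
```

Because the proxy ignores the header outside the reserve zone, sending it early is harmless, but sending it only near the wall keeps the agent's wrap-up logic explicit.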

Threshold Alerts

When spend crosses a threshold percentage, a webhook fires:

  • Thresholds ≥ 90% fire as budget.threshold.critical
  • Thresholds < 90% fire as budget.threshold.warning

Default thresholds are [50, 80, 90, 95]. Customize per budget with up to 10 values (ascending, each 1–100).
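The event naming rule can be sketched as a function that reports which thresholds a spend update crosses. The function name is hypothetical, and "crossed" is assumed to mean that spend was below the threshold before the update and at or above it afterwards.

```python
def crossed_thresholds(
    prev_spend: int,
    new_spend: int,
    limit: int,
    thresholds: tuple[int, ...] = (50, 80, 90, 95),
) -> list[tuple[int, str]]:
    """Return (threshold, event_name) for each threshold crossed by this update."""
    events = []
    for threshold in thresholds:
        # Integer comparison of spend/limit against threshold/100
        if prev_spend * 100 < threshold * limit <= new_spend * 100:
            name = "budget.threshold.critical" if threshold >= 90 else "budget.threshold.warning"
            events.append((threshold, name))
    return events


# Spend jumps from $40 to $92 on a $100 budget: crosses 50, 80, and 90
for threshold, event in crossed_thresholds(40_000_000, 92_000_000, 100_000_000):
    print(threshold, event)
```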

See Webhook Event Types for payload details.

Creating a Budget

Dashboard

  1. Go to Budgets → Set Budget
  2. Choose entity (your account or a specific API key)
  3. Set the spending ceiling
  4. Optionally configure reset interval, velocity limits, session limits, finalization reserve, and alert thresholds
  5. Click Set Budget — takes effect immediately

API

Budget creation and management use session authentication (dashboard). See the Budgets API for full endpoint documentation.

# Requires dashboard session cookie
curl -X POST "https://nullspend.dev/api/budgets" \
  -H "Cookie: session=..." \
  -H "Content-Type: application/json" \
  -d '{
    "entityType": "api_key",
    "entityId": "ns_key_11223344-5566-7788-99aa-bbccddeeff00",
    "maxBudgetMicrodollars": 50000000,
    "resetInterval": "monthly",
    "velocityLimitMicrodollars": 10000000,
    "velocityWindowSeconds": 60,
    "velocityCooldownSeconds": 60,
    "sessionLimitMicrodollars": 5000000
  }'

To check budget status programmatically (with an API key), use GET /api/budgets/status.

Best Practices

  • Start generous, tighten later. Set initial budgets higher than expected. Once you have cost data, tighten with confidence.
  • One budget per concern. Separate API keys (and budgets) for different agents, environments, or teams.
  • Use session limits for multi-step agents. Cap each task's cost so a single stuck agent can't consume the entire budget.
  • Monitor before enforcing. Use the analytics dashboard to understand spending patterns before setting tight ceilings.
  • Combine velocity + session limits. Velocity catches sudden spikes; session limits catch slow accumulation over a long conversation.
  • Use finalization reserve for multi-step agents. Set a reserve large enough for one cleanup request so agents can save state, send notifications, or close connections before shutting down.
