Session Limits
Session limits cap how much a single agent conversation can spend, regardless of the overall budget. A runaway agent loop that stays under velocity limits can still accumulate significant cost across a long session — session limits provide a per-conversation ceiling.
See Budgets for overall budget configuration.
How It Works
Request arrives with X-NullSpend-Session header
│
▼
Lookup session spend in DO SQLite
│
▼
currentSpend + estimate > sessionLimit?
├─ NO → Continue to velocity + budget checks
│
▼
429 (no Retry-After) → agent should start a new session
If the X-NullSpend-Session header is absent, session limit enforcement is skipped entirely.
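
The decision in the diagram can be sketched in a few lines of Python. This is only an illustration of the comparison described above, not the proxy's actual code; the function and parameter names are hypothetical.

```python
from typing import Optional


def session_limit_exceeded(
    current_spend: int,
    estimate: int,
    session_limit: Optional[int],
    session_id: Optional[str],
) -> bool:
    """Return True when the request should be denied with a 429.

    All amounts are in microdollars. Names are hypothetical.
    """
    # No X-NullSpend-Session header, or no configured limit: skip enforcement.
    if session_id is None or session_limit is None:
        return False
    # Strict comparison, matching "currentSpend + estimate > sessionLimit".
    return current_spend + estimate > session_limit
```

A request that lands exactly on the limit passes under this reading, because the comparison is strictly greater-than.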
Configuration
Set this field when creating or updating a budget via the API:
| Field | Type | Range | Default |
|---|---|---|---|
| sessionLimitMicrodollars | integer or null | > 0 | null (disabled) |
Setting sessionLimitMicrodollars to null disables session limit enforcement for that budget entity.
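
Because the field is expressed in microdollars (1 dollar = 1,000,000 microdollars), a small helper can reduce unit mistakes. The sketch below only builds the field as a dictionary; the helper name is hypothetical, and the endpoint and the rest of the budget request body are deliberately left out because they are not described here.

```python
from decimal import Decimal
from typing import Optional

MICRODOLLARS_PER_DOLLAR = 1_000_000


def session_limit_payload(limit_dollars: Optional[str]) -> dict:
    """Build the sessionLimitMicrodollars field (hypothetical helper).

    Pass a decimal string such as "5" or "0.50"; pass None to disable the limit.
    """
    if limit_dollars is None:
        return {"sessionLimitMicrodollars": None}
    # Decimal avoids float rounding when converting dollars to microdollars.
    micro = int(Decimal(limit_dollars) * MICRODOLLARS_PER_DOLLAR)
    if micro <= 0:
        raise ValueError("sessionLimitMicrodollars must be > 0")
    return {"sessionLimitMicrodollars": micro}
```

For example, `session_limit_payload("5")` yields a 5,000,000 microdollar limit, and `session_limit_payload(None)` yields the null value that disables enforcement.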
Setting the Session Header
Send the X-NullSpend-Session header with each request to identify the conversation:
TypeScript:
const response = await fetch("https://proxy.nullspend.dev/v1/chat/completions", {
method: "POST",
headers: {
"Authorization": `Bearer ${apiKey}`,
"X-NullSpend-Session": "task-042",
"Content-Type": "application/json",
},
body: JSON.stringify({
model: "gpt-4o",
messages: [{ role: "user", content: "Hello" }],
}),
});
Python:
import requests
response = requests.post(
"https://proxy.nullspend.dev/v1/chat/completions",
headers={
"Authorization": f"Bearer {api_key}",
"X-NullSpend-Session": "task-042",
"Content-Type": "application/json",
},
json={
"model": "gpt-4o",
"messages": [{"role": "user", "content": "Hello"}],
},
)
Claude Agent SDK:
const options = withNullSpend({
apiKey: "ns_live_sk_...",
budgetSessionId: "task-042", // NOT the SDK's conversation sessionId
});
The proxy rejects the request with 400 bad_request if the session ID exceeds 256 characters. Choose short, meaningful IDs.
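
One way to keep IDs short and unique is to generate them client-side and check the length before sending. This is a minimal sketch: the `new_session_id` helper and the prefix-plus-random-suffix format are assumptions for illustration, and only the 256-character limit comes from the text above.

```python
import uuid

MAX_SESSION_ID_LENGTH = 256  # limit stated above; longer IDs get 400 bad_request


def new_session_id(prefix: str = "task") -> str:
    """Return a short, unique session ID such as "task-3f9c1a2b7d4e" (hypothetical format)."""
    session_id = f"{prefix}-{uuid.uuid4().hex[:12]}"
    if len(session_id) > MAX_SESSION_ID_LENGTH:
        raise ValueError("session ID exceeds 256 characters")
    return session_id
```

The returned value would then be sent as the X-NullSpend-Session header on every request in the same conversation.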
Session Tracking
Session spend is tracked in the Durable Object's SQLite database:
| Column | Type | Description |
|---|---|---|
| entity_key | text | Budget entity (user:{id} or api_key:{id}) |
| session_id | text | Your session identifier |
| spend | integer | Cumulative spend in microdollars |
| request_count | integer | Number of requests in this session |
| last_seen | integer | Timestamp of last request (ms) |
Lifecycle:
- Reservation: When a request is approved, the estimated cost is added to spend
- Reconciliation: When the actual cost is known, the delta (actual - estimate) is applied: spend = MAX(0, spend + delta)
- Expired reservations: If a reservation expires without reconciliation (crash/timeout), the DO alarm reverses the reservation from session spend
- Cleanup: Sessions with last_seen older than 24 hours are deleted by the DO alarm
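
The lifecycle above can be approximated with Python's built-in sqlite3 module. This is a sketch under assumptions, not the Durable Object's implementation: the table name `session_spend`, the function names, and the choice to increment request_count at reservation time are all hypothetical. Only the column names and the MAX(0, spend + delta) rule come from this page.

```python
import sqlite3
from typing import Optional

DAY_MS = 24 * 60 * 60 * 1000


def open_db() -> sqlite3.Connection:
    """Create an in-memory table with the columns listed above (table name assumed)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE session_spend (
               entity_key TEXT NOT NULL,
               session_id TEXT NOT NULL,
               spend INTEGER NOT NULL DEFAULT 0,
               request_count INTEGER NOT NULL DEFAULT 0,
               last_seen INTEGER NOT NULL,
               PRIMARY KEY (entity_key, session_id)
           )"""
    )
    return db


def reserve(db, entity_key: str, session_id: str, estimate: int, now_ms: int) -> None:
    """Reservation: add the estimated cost to the session's spend."""
    db.execute(
        "INSERT OR IGNORE INTO session_spend (entity_key, session_id, last_seen) "
        "VALUES (?, ?, ?)",
        (entity_key, session_id, now_ms),
    )
    db.execute(
        "UPDATE session_spend SET spend = spend + ?, "
        "request_count = request_count + 1, last_seen = ? "
        "WHERE entity_key = ? AND session_id = ?",
        (estimate, now_ms, entity_key, session_id),
    )


def reconcile(db, entity_key: str, session_id: str, estimate: int, actual: int) -> None:
    """Reconciliation: apply delta = actual - estimate, clamped at zero.

    Passing actual=0 reverses an expired reservation.
    """
    db.execute(
        "UPDATE session_spend SET spend = MAX(0, spend + ?) "
        "WHERE entity_key = ? AND session_id = ?",
        (actual - estimate, entity_key, session_id),
    )


def cleanup(db, now_ms: int) -> None:
    """Cleanup: delete sessions not seen in the last 24 hours."""
    db.execute("DELETE FROM session_spend WHERE last_seen < ?", (now_ms - DAY_MS,))


def get_spend(db, entity_key: str, session_id: str) -> Optional[int]:
    row = db.execute(
        "SELECT spend FROM session_spend WHERE entity_key = ? AND session_id = ?",
        (entity_key, session_id),
    ).fetchone()
    return row[0] if row else None
```

Clamping at zero means a reconciliation delta can never push a session's spend negative.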
429 Response
When a session limit is exceeded, the proxy returns:
{
"error": {
"code": "session_limit_exceeded",
"message": "Request blocked: session spend exceeds session limit. Start a new session.",
"details": {
"session_id": "conv_abc123",
"session_spend_microdollars": 4800000,
"session_limit_microdollars": 5000000
}
}
}
No Retry-After header. Unlike velocity limits, the session is done, so retrying won't help. The agent should start a new session (new X-NullSpend-Session value) to continue.
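
On the client side, the denial can be detected from the status code and the error code in the body. The sketch below assumes the response body has already been parsed into a dictionary; both function names are hypothetical.

```python
def is_session_limit_error(status_code: int, body: dict) -> bool:
    """True when the response is the session-limit 429 shown above."""
    if status_code != 429:
        return False
    error = body.get("error") or {}
    return error.get("code") == "session_limit_exceeded"


def session_for_next_request(
    status_code: int, body: dict, current_session_id: str, fresh_session_id: str
) -> str:
    """Return the session ID to send on the next request (hypothetical helper)."""
    # Retrying with the same ID cannot succeed, so switch to a fresh one.
    if is_session_limit_error(status_code, body):
        return fresh_session_id
    return current_session_id
```

Rotating automatically on every denial would bypass the ceiling the limit is meant to provide, so an agent would likely gate the switch on its own task logic.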
Webhooks
session.limit_exceeded
Fires when a request is denied because the session spend cap is reached. Key fields in data.object:
| Field | Description |
|---|---|
| budget_entity_type | user or api_key |
| budget_entity_id | The entity whose session limit was hit |
| session_id | The session that exceeded the limit |
| session_spend_microdollars | Current session spend at denial time |
| session_limit_microdollars | Configured session limit |
| model | Model of the denied request |
| provider | openai or anthropic |
| blocked_at | ISO 8601 timestamp |
See Event Types for the full JSON example.
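
A webhook consumer might turn the fields above into a one-line alert. This sketch assumes the event carries its name under a top-level `type` key, which is not confirmed here; only the `data.object` field names are taken from the table. The function name is hypothetical.

```python
def describe_session_limit_event(event: dict) -> str:
    """Summarize a session.limit_exceeded event for logging or alerting."""
    # Assumption: the event name is stored under "type".
    if event.get("type") != "session.limit_exceeded":
        raise ValueError("not a session.limit_exceeded event")
    obj = event["data"]["object"]
    spend = obj["session_spend_microdollars"] / 1_000_000
    limit = obj["session_limit_microdollars"] / 1_000_000
    return (
        f"{obj['budget_entity_type']}:{obj['budget_entity_id']} "
        f"session {obj['session_id']} blocked at ${spend:.2f} of ${limit:.2f} "
        f"({obj['provider']}/{obj['model']})"
    )
```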
Key Behaviors
- No header = no enforcement. Session limits only apply when X-NullSpend-Session is present on the request.
- Client-defined sessions. The proxy never creates, invalidates, or manages session IDs; your agent decides when to start a new session.
- Independent of budget resets. Session spend does NOT reset when the budget period resets. A session that spans a daily reset carries its full cumulative spend.
- Always strict. Session limits are hard caps regardless of the budget policy (warn does not apply).
- 24-hour cleanup. Stale session data is automatically cleaned up after 24 hours of inactivity via the DO alarm.
Enforcement Order
The enforcement pipeline runs in this order:
- Period reset — if the budget period has elapsed, reset spend before any checks run
- Session limit — deny before touching velocity counters
- Velocity limit — sliding window + circuit breaker
- Budget exhaustion — is there enough budget remaining?
- Reservation — reserve estimated cost
Session is checked before velocity so that denied requests don't inflate velocity counters or affect budget accounting.
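
The ordering can be illustrated with a toy pipeline. Everything in this sketch is a simplification: the `BudgetState` fields, the `enforce` function, the plain counter standing in for the sliding window and circuit breaker, and the result strings other than session_limit_exceeded are all hypothetical. It aims only to show why a session denial leaves velocity and budget state untouched.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BudgetState:
    budget_limit: int
    velocity_max: int
    session_limit: Optional[int] = None
    spend: int = 0            # budget spend in the current period
    session_spend: int = 0    # not reset by period resets
    velocity_count: int = 0   # stand-in for the sliding window
    period_elapsed: bool = False


def enforce(state: BudgetState, estimate: int, has_session_header: bool) -> str:
    # 1. Period reset: clear budget spend before any checks run.
    if state.period_elapsed:
        state.spend = 0
        state.period_elapsed = False
    # 2. Session limit: deny before touching velocity counters.
    if has_session_header and state.session_limit is not None:
        if state.session_spend + estimate > state.session_limit:
            return "session_limit_exceeded"
    # 3. Velocity limit (simplified to a request counter).
    state.velocity_count += 1
    if state.velocity_count > state.velocity_max:
        return "velocity_denied"
    # 4. Budget exhaustion.
    if state.spend + estimate > state.budget_limit:
        return "budget_denied"
    # 5. Reservation: record the estimate against budget and session.
    state.spend += estimate
    if has_session_header:
        state.session_spend += estimate
    return "approved"
```

Because step 2 returns early, a request denied by the session limit never increments `velocity_count` or `spend` in this model.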
Example
Scenario: $5 session limit, agent conversation "task-042".
- Agent starts conversation "task-042", sending X-NullSpend-Session: task-042
- First 10 requests cost $0.45 each; session spend reaches $4.50
- Request 11 has an estimated cost of $0.60
- $4.50 + $0.60 = $5.10 > $5.00 → denied
- session.limit_exceeded webhook fires
- Agent receives 429 with session_limit_exceeded error
- Agent starts a new conversation with X-NullSpend-Session: task-043
- New session starts at $0 spend; requests resume
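
The same scenario can be replayed in microdollars to check the arithmetic. This sketch assumes, for simplicity, that each request's actual cost equals its estimate; the function name is hypothetical.

```python
from typing import List, Optional, Tuple


def first_denied_request(
    estimates: List[int], session_limit: int
) -> Tuple[Optional[int], int]:
    """Return (1-based number of the first denied request, spend at that point)."""
    spend = 0
    for number, estimate in enumerate(estimates, start=1):
        if spend + estimate > session_limit:
            return number, spend
        spend += estimate
    return None, spend


# Ten requests at $0.45, then one estimated at $0.60, against a $5 limit.
scenario = [450_000] * 10 + [600_000]
denied_at, spend_at_denial = first_denied_request(scenario, 5_000_000)
```

Here `denied_at` is 11 and `spend_at_denial` is 4,500,000 microdollars ($4.50), matching the steps above.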
Related
- Budgets: overall budget configuration and enforcement
- Velocity Limits: spending-rate circuit breaker
- Event Types: full webhook payload examples
- Errors: all error codes and response shapes
- Claude Agent SDK: budgetSessionId option