NullSpend Docs

Supported Models

NullSpend has pricing data for 38 models across OpenAI, Anthropic, and Google. The proxy routes requests to OpenAI and Anthropic. Google models are supported for cost calculation (via API ingest or the cost-engine package) but are not routable through the proxy.

Cost Formula

cost_microdollars = Math.round(Σ(tokens × rate_per_million_tokens))

Rates are in dollars per million tokens. The result is in microdollars (1 microdollar = $0.000001).
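As a rough illustration of the formula above, the following TypeScript sketch sums tokens × rate over a list of components and rounds once at the end. The `CostComponent` type and the `costMicrodollars` function are hypothetical names chosen for this example, not part of the NullSpend API.

```typescript
// Hypothetical shape: one priced component (e.g. input, cached input, output).
interface CostComponent {
  tokens: number;      // token count for this component
  ratePerMTok: number; // rate in dollars per million tokens
}

// tokens × ($ / 1,000,000 tokens) is already a microdollar amount,
// so no extra scaling is needed before rounding.
function costMicrodollars(components: CostComponent[]): number {
  const total = components.reduce(
    (sum, c) => sum + c.tokens * c.ratePerMTok,
    0
  );
  return Math.round(total);
}

// 1,000 input tokens at $2.50/MTok plus 500 output tokens at $10.00/MTok.
const exampleCost = costMicrodollars([
  { tokens: 1000, ratePerMTok: 2.5 },
  { tokens: 500, ratePerMTok: 10.0 },
]);
const exampleDollars = exampleCost / 1000000; // convert microdollars to dollars
```

Here `exampleCost` is 2,500 + 5,000 = 7,500 microdollars, or $0.0075.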

For the full calculation logic including cached tokens, cache writes, and long context multipliers, see Cost Tracking.

OpenAI Models

14 models. Rates in $/MTok.

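| Model | Input | Cached Input | Output |
| --- | --- | --- | --- |
| gpt-4o | 2.50 | 1.25 | 10.00 |
| gpt-4o-mini | 0.15 | 0.075 | 0.60 |
| gpt-4.1 | 2.00 | 0.50 | 8.00 |
| gpt-4.1-mini | 0.40 | 0.10 | 1.60 |
| gpt-4.1-nano | 0.10 | 0.025 | 0.40 |
| o4-mini | 1.10 | 0.275 | 4.40 |
| o3 | 2.00 | 0.50 | 8.00 |
| o3-mini | 1.10 | 0.55 | 4.40 |
| o1 | 15.00 | 7.50 | 60.00 |
| gpt-5 | 1.25 | 0.125 | 10.00 |
| gpt-5-mini | 0.25 | 0.025 | 2.00 |
| gpt-5-nano | 0.05 | 0.005 | 0.40 |
| gpt-5.1 | 1.25 | 0.125 | 10.00 |
| gpt-5.2 | 1.75 | 0.175 | 14.00 |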

OpenAI cost formula: (prompt_tokens - cached_tokens) × input + cached_tokens × cached + completion_tokens × output. Reasoning tokens are a subset of completion tokens, so they are not double-counted.
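The OpenAI formula can be sketched in TypeScript as follows. The flat `OpenAIUsage` shape and the function name are assumptions made for this example; in a real OpenAI response the cached token count may be nested inside a details object, so the fields would need to be extracted first.

```typescript
// Rates in $/MTok, as listed in the OpenAI table.
interface OpenAIRates {
  input: number;
  cached: number;
  output: number;
}

// Hypothetical flattened usage shape for illustration.
interface OpenAIUsage {
  prompt_tokens: number;     // includes cached tokens
  cached_tokens: number;     // subset of prompt_tokens
  completion_tokens: number; // includes reasoning tokens
}

function openAICostMicrodollars(usage: OpenAIUsage, rates: OpenAIRates): number {
  // Cached tokens are billed at the cached rate, the rest at the input rate.
  const uncached = usage.prompt_tokens - usage.cached_tokens;
  return Math.round(
    uncached * rates.input +
      usage.cached_tokens * rates.cached +
      usage.completion_tokens * rates.output
  );
}

const gpt4oRates: OpenAIRates = { input: 2.5, cached: 1.25, output: 10.0 };
const gpt4oCost = openAICostMicrodollars(
  { prompt_tokens: 1000, cached_tokens: 200, completion_tokens: 500 },
  gpt4oRates
);
```

With the gpt-4o rates this works out to 800 × 2.50 + 200 × 1.25 + 500 × 10.00 = 7,250 microdollars.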

Anthropic Models

22 models (10 aliases + 12 dated variants). Rates in $/MTok.

Aliases

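| Model | Input | Cached Input | Cache Write (5m) | Cache Write (1h) | Output |
| --- | --- | --- | --- | --- | --- |
| claude-opus-4-6 | 5.00 | 0.50 | 6.25 | 10.00 | 25.00 |
| claude-opus-4-5 | 5.00 | 0.50 | 6.25 | 10.00 | 25.00 |
| claude-opus-4-1 | 15.00 | 1.50 | 18.75 | 30.00 | 75.00 |
| claude-opus-4 | 15.00 | 1.50 | 18.75 | 30.00 | 75.00 |
| claude-sonnet-4-6 | 3.00 | 0.30 | 3.75 | 6.00 | 15.00 |
| claude-sonnet-4-5 | 3.00 | 0.30 | 3.75 | 6.00 | 15.00 |
| claude-sonnet-4 | 3.00 | 0.30 | 3.75 | 6.00 | 15.00 |
| claude-haiku-4-5 | 1.00 | 0.10 | 1.25 | 2.00 | 5.00 |
| claude-haiku-3.5 | 0.80 | 0.08 | 1.00 | 1.60 | 4.00 |
| claude-haiku-3 | 0.25 | 0.03 | 0.30 | 0.50 | 1.25 |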

Dated Variants

Dated variants share the exact same rates as their alias:

| Model | Same Rates As |
| --- | --- |
| claude-opus-4-6-20260205 | claude-opus-4-6 |
| claude-sonnet-4-6-20260217 | claude-sonnet-4-6 |
| claude-sonnet-4-5-20250929 | claude-sonnet-4-5 |
| claude-opus-4-5-20251101 | claude-opus-4-5 |
| claude-haiku-4-5-20251001 | claude-haiku-4-5 |
| claude-opus-4-1-20250805 | claude-opus-4-1 |
| claude-opus-4-20250514 | claude-opus-4 |
| claude-sonnet-4-20250514 | claude-sonnet-4 |
| claude-3-5-haiku-20241022 | claude-haiku-3.5 |
| claude-3-haiku-20240307 | claude-haiku-3 |
| claude-opus-4-0 | claude-opus-4 |
| claude-sonnet-4-0 | claude-sonnet-4 |
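One way to use this mapping is to resolve a variant to its alias before looking up rates, so each rate set is stored only once. The sketch below is a hypothetical helper; `SAME_RATES_AS` and `resolveRateKey` are names invented for this example, and the real cost engine may store its catalog differently.

```typescript
// Variant -> alias, copied from the table above.
const SAME_RATES_AS: Record<string, string> = {
  "claude-opus-4-6-20260205": "claude-opus-4-6",
  "claude-sonnet-4-6-20260217": "claude-sonnet-4-6",
  "claude-sonnet-4-5-20250929": "claude-sonnet-4-5",
  "claude-opus-4-5-20251101": "claude-opus-4-5",
  "claude-haiku-4-5-20251001": "claude-haiku-4-5",
  "claude-opus-4-1-20250805": "claude-opus-4-1",
  "claude-opus-4-20250514": "claude-opus-4",
  "claude-sonnet-4-20250514": "claude-sonnet-4",
  "claude-3-5-haiku-20241022": "claude-haiku-3.5",
  "claude-3-haiku-20240307": "claude-haiku-3",
  "claude-opus-4-0": "claude-opus-4",
  "claude-sonnet-4-0": "claude-sonnet-4",
};

// Returns the alias whose rates apply, or the model name unchanged
// when it is not a listed variant (for example, when it is already an alias).
function resolveRateKey(model: string): string {
  const alias = Object.prototype.hasOwnProperty.call(SAME_RATES_AS, model)
    ? SAME_RATES_AS[model]
    : undefined;
  return alias !== undefined ? alias : model;
}

const haikuKey = resolveRateKey("claude-3-5-haiku-20241022");
const opusKey = resolveRateKey("claude-opus-4-6");
```

In this sketch `haikuKey` resolves to `claude-haiku-3.5`, while `opusKey` is returned unchanged because it is already an alias.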

Long Context Pricing

When total input tokens (input + cache creation + cache read) exceed 200,000 tokens, multipliers apply:

| Component | Multiplier |
| --- | --- |
| Input | |
| Cached Input (read) | |
| Cache Write (5m and 1h) | |
| Output | 1.5× |
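The threshold test described above can be sketched as below. This covers only the decision of whether long context pricing applies and does not apply any multipliers. The usage field names and the `isLongContext` function are assumptions for this example.

```typescript
const LONG_CONTEXT_THRESHOLD = 200000;

// Hypothetical usage shape; the three fields together make up total input tokens.
interface InputUsage {
  input_tokens: number;
  cache_creation_input_tokens: number;
  cache_read_input_tokens: number;
}

// Long context pricing applies only when the total strictly exceeds the threshold.
function isLongContext(usage: InputUsage): boolean {
  const totalInput =
    usage.input_tokens +
    usage.cache_creation_input_tokens +
    usage.cache_read_input_tokens;
  return totalInput > LONG_CONTEXT_THRESHOLD;
}

const atThreshold = isLongContext({
  input_tokens: 150000,
  cache_creation_input_tokens: 30000,
  cache_read_input_tokens: 20000,
});
const overThreshold = isLongContext({
  input_tokens: 150000,
  cache_creation_input_tokens: 30000,
  cache_read_input_tokens: 20001,
});
```

A request with exactly 200,000 total input tokens (`atThreshold`) stays at standard rates under this reading of "exceed", while one more token (`overThreshold`) crosses into long context pricing.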

Cache Write TTLs

Anthropic offers two cache write tiers:

| Tier | TTL | Rate Column |
| --- | --- | --- |
| Ephemeral (5-minute) | 5 minutes | Cache Write (5m) |
| Extended (1-hour) | 1 hour | Cache Write (1h) |

If the response includes ephemeral_5m_input_tokens and ephemeral_1h_input_tokens, each is priced at its respective rate. Otherwise, all cache creation tokens use the 5-minute rate.
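A minimal sketch of that fallback rule follows. The flattened `CacheCreationUsage` shape, the `cache_creation_input_tokens` total field, and the function name are assumptions for this example; an actual response may nest these fields differently. The result is left unrounded on the assumption that rounding happens once on the full cost.

```typescript
// Cache write rates in $/MTok, from the Anthropic table.
interface CacheWriteRates {
  write5m: number;
  write1h: number;
}

// Hypothetical flattened usage shape for illustration.
interface CacheCreationUsage {
  cache_creation_input_tokens: number; // total cache creation tokens
  ephemeral_5m_input_tokens?: number;  // present only when a breakdown is reported
  ephemeral_1h_input_tokens?: number;
}

// Returns the unrounded cache write cost in microdollars.
function cacheWriteCost(usage: CacheCreationUsage, rates: CacheWriteRates): number {
  const t5m = usage.ephemeral_5m_input_tokens;
  const t1h = usage.ephemeral_1h_input_tokens;
  if (t5m !== undefined && t1h !== undefined) {
    // Breakdown available: price each tier at its own rate.
    return t5m * rates.write5m + t1h * rates.write1h;
  }
  // No breakdown: all cache creation tokens use the 5-minute rate.
  return usage.cache_creation_input_tokens * rates.write5m;
}

const sonnetRates: CacheWriteRates = { write5m: 3.75, write1h: 6.0 };
const withBreakdown = cacheWriteCost(
  {
    cache_creation_input_tokens: 3000,
    ephemeral_5m_input_tokens: 1000,
    ephemeral_1h_input_tokens: 2000,
  },
  sonnetRates
);
const withoutBreakdown = cacheWriteCost(
  { cache_creation_input_tokens: 3000 },
  sonnetRates
);
```

With the claude-sonnet-4-5 rates, `withBreakdown` is 1,000 × 3.75 + 2,000 × 6.00 = 15,750 microdollars, and `withoutBreakdown` is 3,000 × 3.75 = 11,250 microdollars.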

Google Models (cost calculation only)

2 models. Rates in $/MTok. These models are supported for cost calculation via API ingest and the cost-engine package, but the proxy does not route requests to Google.

| Model | Input | Cached Input | Output |
| --- | --- | --- | --- |
| gemini-2.5-pro | 1.25 | 0.3125 | 10.00 |
| gemini-2.5-flash | 0.15 | 0.0375 | 0.60 |

Unknown Models

If a request uses a model not in the pricing catalog, the proxy returns 400 with error code invalid_model. See the error reference for details.
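To illustrate the behavior, the sketch below checks a model name against a small catalog and produces a 400 result with the `invalid_model` code when it is missing. Everything here is hypothetical: the `KNOWN_MODELS` list holds only a few entries from the tables above, and the `ModelCheckError` shape is an assumption, not the proxy's actual response body.

```typescript
// Hypothetical subset of the pricing catalog, for illustration only.
const KNOWN_MODELS: string[] = ["gpt-4o", "claude-sonnet-4-5", "gemini-2.5-pro"];

// Assumed error shape; see the error reference for the real response format.
interface ModelCheckError {
  status: number;
  code: string;
}

// Returns null when the model is priced, otherwise a 400 invalid_model error.
function checkModel(model: string): ModelCheckError | null {
  if (KNOWN_MODELS.indexOf(model) !== -1) {
    return null;
  }
  return { status: 400, code: "invalid_model" };
}

const knownResult = checkModel("gpt-4o");
const unknownResult = checkModel("my-custom-model");
```

A client could run a similar check before sending a request to avoid a round trip that is certain to fail.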

To request a new model, contact support.
