AI API pricing history

Every frontier AI model’s per-token price, current and historical

The cheapest current frontier-tier input price is $0.18 per million tokens — Meta Llama 4 Scout (Together), as of June 3, 2026. That’s a 167× compression against GPT-4’s March 2023 launch price of $30 per million. Across 32 current models from 8 providers, with 81 distinct price points charted from 2020 onward.

As of June 3, 2026. Input pricing only in the headline; output, cached, and batch tiers all live in the table below.

Cheapest current frontier
$0.18 / $0.59
Llama 4 Scout (Together) (in / out per M)
Cheapest current small-tier
$0.10
Gemini 2.0 Flash input / M
2023 baseline (GPT-4)
$30.00
input / M at launch
Compression vs GPT-4
~167×
cheaper today

Per-token input price over time

One marker per published price point, color-coded by provider. The dashed line traces the running minimum across the frontier tier — the cheapest input-token price for a flagship-class model at each point in time. Y-axis is log-scale to fit the 6,000× range between $60/M davinci tokens and the sub-penny cache-hit floor.

$0.10$1$10$602020202120222023202420252026OpenAI · GPT-3 (text-davinci-003) · $60.00 / M input · Jun 11, 2020OpenAI · GPT-3.5 Turbo · $2.00 / M input · Mar 1, 2023OpenAI · GPT-4 · $30.00 / M input · Mar 14, 2023Anthropic · Claude 1 · $11.02 / M input · May 15, 2023Anthropic · Claude Instant · $1.63 / M input · May 15, 2023OpenAI · GPT-3.5 Turbo · $1.50 / M input · Jun 13, 2023OpenAI · GPT-4 32K · $60.00 / M input · Jun 13, 2023Anthropic · Claude 2 · $11.02 / M input · Jul 11, 2023Meta · Llama 2 70B (Together) · $0.90 / M input · Oct 1, 2023OpenAI · GPT-3.5 Turbo · $1.00 / M input · Nov 6, 2023OpenAI · GPT-4 Turbo · $10.00 / M input · Nov 6, 2023Anthropic · Claude Instant · $0.80 / M input · Nov 21, 2023Anthropic · Claude 2 · $8.00 / M input · Nov 21, 2023Mistral · Mistral Medium (original) · $2.50 / M input · Dec 11, 2023OpenAI · GPT-3.5 Turbo · $0.50 / M input · Jan 25, 2024Google · Gemini 1.0 Pro · $0.50 / M input · Feb 15, 2024Mistral · Mistral Large · $8.00 / M input · Feb 26, 2024Anthropic · Claude 3 Opus · $15.00 / M input · Mar 4, 2024Anthropic · Claude 3 Sonnet · $3.00 / M input · Mar 4, 2024Anthropic · Claude 3 Haiku · $0.25 / M input · Mar 13, 2024Meta · Llama 3 70B (Together) · $0.90 / M input · Apr 18, 2024DeepSeek · DeepSeek-V2 · $0.14 / M input · May 6, 2024OpenAI · GPT-4o · $5.00 / M input · May 13, 2024Google · Gemini 1.5 Pro · $7.00 / M input · May 14, 2024Google · Gemini 1.5 Flash · $0.075 / M input · May 14, 2024Alibaba · Qwen Plus · $0.26 / M input · Jun 6, 2024Alibaba · Qwen Max · $0.78 / M input · Jun 6, 2024Anthropic · Claude 3.5 Sonnet · $3.00 / M input · Jun 20, 2024OpenAI · GPT-4o mini · $0.15 / M input · Jul 18, 2024Meta · Llama 3.1 405B (Together) · $3.50 / M input · Jul 23, 2024Mistral · Mistral Large 2 · $3.00 / M input · Jul 24, 2024OpenAI · o1-mini · $3.00 / M input · Sep 12, 2024Google · Gemini 1.5 Pro · $1.25 / M input · Sep 24, 2024OpenAI · GPT-4o · $2.50 / M input · Oct 1, 2024Anthropic · Claude 3.5 Haiku · $1.00 / M input · Nov 4, 2024OpenAI · o1 · $15.00 / M input · Dec 5, 2024Meta · Llama 3.3 70B (Together) · $0.88 / M input · Dec 6, 2024Google · Gemini 2.0 Flash · $0.10 / M input · Dec 11, 2024xAI · Grok 2 · $2.00 / M input · Dec 12, 2024DeepSeek · DeepSeek-V3 · $0.14 / M input · Dec 26, 2024Mistral · Mistral Small 3 · $0.20 / M input · Jan 30, 2025OpenAI · o3-mini · $1.10 / M input · Jan 31, 2025DeepSeek · DeepSeek-V3 · $0.27 / M input · Feb 8, 2025DeepSeek · DeepSeek R1 · $0.55 / M input · Feb 8, 2025xAI · Grok 3 · $3.00 / M input · Feb 17, 2025xAI · Grok 3 Mini · $0.30 / M input · Feb 17, 2025Anthropic · Claude 3.7 Sonnet · $3.00 / M input · Feb 24, 2025Google · Gemini 2.5 Pro · $1.25 / M input · Mar 25, 2025Meta · Llama 4 Scout (Together) · $0.18 / M input · Apr 5, 2025Meta · Llama 4 Maverick (Together) · $0.27 / M input · Apr 5, 2025OpenAI · GPT-4.1 · $2.00 / M input · Apr 14, 2025OpenAI · o3 · $10.00 / M input · Apr 16, 2025OpenAI · o4-mini · $1.10 / M input · Apr 16, 2025Anthropic · Claude Opus 4 · $15.00 / M input · May 22, 2025Anthropic · Claude Sonnet 4 · $3.00 / M input · May 22, 2025OpenAI · o3 · $2.00 / M input · Jun 10, 2025OpenAI · o3-pro · $20.00 / M input · Jun 10, 2025Google · Gemini 2.5 Flash · $0.30 / M input · Jun 17, 2025xAI · Grok 4 · $3.00 / M input · Jul 9, 2025Google · Gemini 2.5 Flash-Lite · $0.10 / M input · Jul 22, 2025Anthropic · Claude Opus 4.1 · $15.00 / M input · Aug 5, 2025OpenAI · GPT-5 · $1.25 / M input · Aug 7, 2025Anthropic · Claude Sonnet 4.5 · $3.00 / M input · Sep 29, 2025Anthropic · Claude Haiku 4.5 · $1.00 / M input · Oct 15, 2025Google · Gemini 3 Pro · $1.25 / M input · Nov 18, 2025Mistral · Mistral Large 3 · $2.00 / M input · Dec 2, 2025Mistral · Mistral Medium 3 · $1.00 / M input · Dec 2, 2025Alibaba · Qwen3.6 Plus · $0.33 / M input · Feb 12, 2026Anthropic · Claude Sonnet 4.6 · $3.00 / M input · Feb 17, 2026Google · Gemini 3.1 Pro · $1.25 / M input · Feb 19, 2026OpenAI · GPT-5.4 · $2.50 / M input · Mar 5, 2026xAI · Grok 4.20 · $2.00 / M input · Mar 10, 2026Mistral · Mistral Small 3.1 · $0.20 / M input · Mar 16, 2026Anthropic · Claude Opus 4.7 · $5.00 / M input · Apr 16, 2026OpenAI · GPT-5.5 · $5.00 / M input · Apr 23, 2026DeepSeek · DeepSeek-V4-Flash · $0.14 / M input · Apr 24, 2026DeepSeek · DeepSeek-V4-Pro · $1.74 / M input · Apr 24, 2026xAI · Grok 4.3 · $1.25 / M input · Apr 30, 2026Alibaba · Qwen3.7 Max · $0.78 / M input · May 15, 2026Google · Gemini 3.5 Flash · $0.30 / M input · May 19, 2026Anthropic · Claude Opus 4.8 · $5.00 / M input · May 28, 2026$ / M input (log)
AnthropicOpenAIGooglexAIMetaDeepSeekMistralAlibaba

Hover or tap any dot for the model, the full price (input, output, cached), the effective date, lifecycle status, and the source link.

The frontier price-floor in fourteen moves

Every time the cheapest-frontier-input number on the chart above stepped down. Each entry links into the model’s row on the relevant /ai/<family>/versions/ page for the full release context.

Every model’s current price

Sort by any column. Filter by provider, by tier, or to current-only. The Cached column shows the prompt-caching tier where the provider documents one (Anthropic, OpenAI, Google have one; xAI, DeepSeek, Mistral, and Alibaba sit at varying coverage).

Gemini 1.5 Flash
Cheap tier launch.
Google(Small / cheap)
$0.075
$0.30
May 14, 2024
Legacy
Gemini 2.0 Flash
Held the $0.10 / $0.40 cheap-tier price through 2.5 Flash-Lite.
Google(Small / cheap)
$0.10
$0.40
Dec 11, 2024
Available
Gemini 2.5 Flash-Lite
Held the 2.0 Flash cheap-tier price.
Google(Small / cheap)
$0.10
$0.40
Jul 22, 2025
Available
DeepSeek-V2
The price point that triggered the China-side AI price war.
DeepSeek(Frontier)
$0.14
$0.28
May 6, 2024
Deprecated
DeepSeek-V4-Flash
Cheap-tier flagship; 98% cache discount.
DeepSeek(Small / cheap)
$0.14
$0.28
$0.0028
Apr 24, 2026
Current
GPT-4o mini
60% cheaper than GPT-3.5 Turbo; replaced the cheap tier.
OpenAI(Small / cheap)
$0.15
$0.60
$0.075
Jul 18, 2024
Available
Llama 4 Scout (Together)
10M-context open-weights flagship; Groq publishes the same tier at $0.11 / $0.34.
Meta(Frontier)
$0.18
$0.59
Apr 5, 2025
Current
Mistral Small 3
Re-Apache-2.0 era kickoff.
Mistral(Small / cheap)
$0.20
$0.60
Jan 30, 2025
Legacy
Mistral Small 3.1
Small tier held at the January 2025 reset price.
Mistral(Small / cheap)
$0.20
$0.60
Mar 16, 2026
Current
Claude 3 Haiku
Cheapest mainstream model at launch; an order of magnitude below GPT-3.5 Turbo.
Anthropic(Small / cheap)
$0.25
$1.25
Mar 13, 2024
Deprecated
Qwen Plus
Established the Plus tier on DashScope.
Alibaba(Frontier)
$0.26
$0.78
Jun 6, 2024
Available
DeepSeek-V3
Promotion ended; new permanent pricing. Off-peak (16:30–00:30 UTC): 50% off.
DeepSeek(Frontier)
$0.27
$1.10
$0.07
Feb 8, 2025
Deprecated
Llama 4 Maverick (Together)
MoE flagship; Groq lists $0.20 / $0.60.
Meta(Frontier)
$0.27
$0.85
Apr 5, 2025
Available
Grok 3 Mini
Cheapest xAI tier.
xAI(Small / cheap)
$0.30
$0.50
Feb 17, 2025
Deprecated
Gemini 2.5 Flash
Output stepped up vs 2.0 Flash to reflect added thinking capacity.
Google(Small / cheap)
$0.30
$2.50
Jun 17, 2025
Available
Gemini 3.5 Flash
Google's most intelligent model; Flash pricing held.
Google(Frontier)
$0.30
$2.50
May 19, 2026
Current
Qwen3.6 Plus
Plus tier stepped up vs Qwen2.5 Plus.
Alibaba(Frontier)
$0.33
$1.95
Feb 12, 2026
Available
GPT-3.5 Turbo
Further 50% cut on input, 25% cut on output.
OpenAI(Small / cheap)
$0.50
$1.50
Jan 25, 2024
Deprecated
Gemini 1.0 Pro
Paid tier opened after a free preview window.
Google(Frontier)
$0.50
$1.50
Feb 15, 2024
Deprecated
DeepSeek R1
Reasoning-track companion to V3. Off-peak: 75% off.
DeepSeek(Reasoning)
$0.55
$2.19
$0.14
Feb 8, 2025
Deprecated
Qwen Max
Established the Max tier on DashScope.
Alibaba(Frontier)
$0.78
$3.90
Jun 6, 2024
Available
Qwen3.7 Max
Max tier price held since 2024.
Alibaba(Frontier)
$0.78
$3.90
May 15, 2026
Current
Claude Instant
Cut alongside Claude 2.1 launch.
Anthropic(Small / cheap)
$0.80
$2.40
Nov 21, 2023
Deprecated
Llama 3.3 70B (Together)
Reference Together AI price.
Meta(Frontier)
$0.88
$0.88
Dec 6, 2024
Available
Llama 2 70B (Together)
Reference hosted price via Together AI; Llama is open-weights.
Meta(Frontier)
$0.90
$0.90
Oct 1, 2023
Deprecated
Llama 3 70B (Together)
Reference Together AI launch price.
Meta(Frontier)
$0.90
$0.90
Apr 18, 2024
Deprecated
Claude 3.5 Haiku
Raised Haiku from $0.25 / $1.25; the only mid-cycle Haiku tier increase.
Anthropic(Small / cheap)
$1.00
$5.00
$0.10
Nov 4, 2024
Deprecated
Claude Haiku 4.5
Haiku price held at $1 / $5 from 3.5 Haiku.
Anthropic(Small / cheap)
$1.00
$5.00
$0.10
Oct 15, 2025
Available
Mistral Medium 3
Sized between Small and Large.
Mistral(Frontier)
$1.00
$3.00
Dec 2, 2025
Current
o3-mini
Replaced o1-mini at one-third the cost.
OpenAI(Reasoning)
$1.10
$4.40
$0.55
Jan 31, 2025
Deprecated
o4-mini
Same shelf price as o3-mini.
OpenAI(Reasoning)
$1.10
$4.40
$0.28
Apr 16, 2025
Deprecated
Gemini 1.5 Pro
82% input cut, 76% output cut. >128K: $2.50 / $10.
Google(Frontier)
$1.25
$5.00
Sep 24, 2024
Deprecated
Gemini 2.5 Pro
≤200K context; >200K tier at $2.50 / $15.
Google(Frontier)
$1.25
$10.00
Mar 25, 2025
Available
GPT-5
Unified router across reasoning and chat; first OpenAI frontier under $2 input.
OpenAI(Frontier)
$1.25
$10.00
$0.12
Aug 7, 2025
Legacy
Gemini 3 Pro
Held the 2.5 Pro price.
Google(Frontier)
$1.25
$10.00
Nov 18, 2025
Deprecated
Gemini 3.1 Pro
Pro price held; >200K tier at $2.50 / $15.
Google(Frontier)
$1.25
$10.00
Feb 19, 2026
Available
Grok 4.3
Current flagship; sharpest xAI price drop yet — output tier moves below the field.
xAI(Frontier)
$1.25
$2.50
$0.12
Apr 30, 2026
Current
DeepSeek-V4-Pro
Dense reasoning flagship; 99% cache discount.
DeepSeek(Frontier)
$1.74
$5.66
$0.015
Apr 24, 2026
Current
Grok 2
First paid xAI API model; vision variant same price.
xAI(Frontier)
$2.00
$10.00
Dec 12, 2024
Deprecated
GPT-4.1
First OpenAI 1M-context production model; 75% prompt-cache discount.
OpenAI(Frontier)
$2.00
$8.00
$0.50
Apr 14, 2025
Legacy
o3
80% price cut three months post-launch.
OpenAI(Reasoning)
$2.00
$8.00
$0.50
Jun 10, 2025
Available
Mistral Large 3
Open-weights flagship under Apache 2.0; current Mistral frontier.
Mistral(Frontier)
$2.00
$6.00
Dec 2, 2025
Current
Grok 4.20
First xAI price reset since Grok 2.
xAI(Frontier)
$2.00
$6.00
$0.20
Mar 10, 2026
Legacy
Mistral Medium (original)
Mistral's original mid-tier on la Plateforme.
Mistral(Frontier)
$2.50
$7.50
Dec 11, 2023
Deprecated
GPT-4o
50% input cut, 33% output cut; prompt caching introduced.
OpenAI(Frontier)
$2.50
$10.00
$1.25
Oct 1, 2024
Available
GPT-5.4
1M-context production model.
OpenAI(Frontier)
$2.50
$10.00
$0.25
Mar 5, 2026
Available
Claude 3 Sonnet
Established the Sonnet $3/$15 price that has held through 4.x.
Anthropic(Frontier)
$3.00
$15.00
Mar 4, 2024
Deprecated
Claude 3.5 Sonnet
Prompt caching introduced; 90% cache discount.
Anthropic(Frontier)
$3.00
$15.00
$0.30
Jun 20, 2024
Deprecated
Mistral Large 2
62% input cut, 62% output cut vs Large.
Mistral(Frontier)
$3.00
$9.00
Jul 24, 2024
Deprecated
o1-mini
Cheap reasoning track companion to o1-preview.
OpenAI(Reasoning)
$3.00
$12.00
$1.50
Sep 12, 2024
Deprecated
Grok 3
First xAI 1M-context model.
xAI(Frontier)
$3.00
$15.00
$0.75
Feb 17, 2025
Deprecated
Claude 3.7 Sonnet
Extended-thinking mode; Sonnet price unchanged.
Anthropic(Frontier)
$3.00
$15.00
$0.30
Feb 24, 2025
Deprecated
Claude Sonnet 4
Sonnet price unchanged into the 4.x line.
Anthropic(Frontier)
$3.00
$15.00
$0.30
May 22, 2025
Deprecated
Grok 4
Held Grok 3 pricing.
xAI(Frontier)
$3.00
$15.00
$0.75
Jul 9, 2025
Deprecated
Claude Sonnet 4.5
1M-context beta on Vertex / Bedrock; standard pricing for ≤200K.
Anthropic(Frontier)
$3.00
$15.00
$0.30
Sep 29, 2025
Available
Claude Sonnet 4.6
1M-context default; Sonnet price unchanged.
Anthropic(Frontier)
$3.00
$15.00
$0.30
Feb 17, 2026
Current
Llama 3.1 405B (Together)
405B-parameter open-weights flagship hosted price via Together AI.
Meta(Frontier)
$3.50
$3.50
Jul 23, 2024
Available
Claude Opus 4.7
Opus tier reset from $15 / $75 to $5 / $25 — first Opus price cut since the tier was established in March 2024.
Anthropic(Frontier)
$5.00
$25.00
$0.50
Apr 16, 2026
Available
GPT-5.5
Current flagship; 90% cache discount.
OpenAI(Frontier)
$5.00
$30.00
$0.50
Apr 23, 2026
Current
Claude Opus 4.8
Same $5 / $25 tier as 4.7. Optional fast mode at $10 / $50.
Anthropic(Frontier)
$5.00
$25.00
$0.50
May 28, 2026
Current
Claude 2
Reduced alongside Claude 2.1 launch.
Anthropic(Frontier)
$8.00
$24.00
Nov 21, 2023
Deprecated
Mistral Large
Top-tier proprietary launch.
Mistral(Frontier)
$8.00
$24.00
Feb 26, 2024
Deprecated
GPT-4 Turbo
DevDay 2023; first 128K-context OpenAI model; 3x cheaper input vs GPT-4.
OpenAI(Frontier)
$10.00
$30.00
Nov 6, 2023
Deprecated
Claude 1
Original Claude API pricing.
Anthropic(Frontier)
$11.02
$32.68
May 15, 2023
Deprecated
Claude 3 Opus
Top-tier launch pricing established the Opus tier.
Anthropic(Frontier)
$15.00
$75.00
Mar 4, 2024
Deprecated
o1
First production reasoning model; reasoning tokens billed as output.
OpenAI(Reasoning)
$15.00
$60.00
$7.50
Dec 5, 2024
Deprecated
Claude Opus 4
Opus tier carried through from Claude 3.
Anthropic(Frontier)
$15.00
$75.00
$1.50
May 22, 2025
Deprecated
Claude Opus 4.1
Opus pricing held at $15/$75.
Anthropic(Frontier)
$15.00
$75.00
$1.50
Aug 5, 2025
Available
o3-pro
Higher-compute o3 variant; Responses API only.
OpenAI(Reasoning)
$20.00
$80.00
Jun 10, 2025
Available
GPT-4
Launch pricing for the 8K context model.
OpenAI(Frontier)
$30.00
$60.00
Mar 14, 2023
Deprecated
GPT-3 (text-davinci-003)
Davinci tier at launch; $0.06 per 1K tokens, no input/output split.
OpenAI(Frontier)
$60.00
$60.00
Jun 11, 2020
Deprecated
GPT-4 32K
Extended-context GPT-4.
OpenAI(Frontier)
$60.00
$120.00
Jun 13, 2023
Deprecated

Showing all 72 models.

By provider

Each provider’s current flagship, cheapest current input tier, and most-recent price-change log. Δ column shows the per-million input-price change from the same model’s prior price point; “tier hold” means the price held against the predecessor in the same family.

Anthropic
Claude family
Versions →
Current flagship
$5.00 / $25.00
Claude Opus 4.8
Cheapest current input
$1.00 / M
Claude Haiku 4.5
Recent price-change log
EffectiveModelIn / OutΔ
May 28, 2026Claude Opus 4.8$5.00 / $25.00launch
Apr 16, 2026Claude Opus 4.7$5.00 / $25.00launch
Feb 17, 2026Claude Sonnet 4.6$3.00 / $15.00launch
Oct 15, 2025Claude Haiku 4.5$1.00 / $5.00launch
Sep 29, 2025Claude Sonnet 4.5$3.00 / $15.00launch
Aug 5, 2025Claude Opus 4.1$15.00 / $75.00launch
May 22, 2025Claude Opus 4$15.00 / $75.00launch
OpenAI
ChatGPT family
Versions →
Current flagship
$5.00 / $30.00
GPT-5.5
Cheapest current input
$0.15 / M
GPT-4o mini
Recent price-change log
EffectiveModelIn / OutΔ
Apr 23, 2026GPT-5.5$5.00 / $30.00launch
Mar 5, 2026GPT-5.4$2.50 / $10.00launch
Aug 7, 2025GPT-5$1.25 / $10.00launch
Jun 10, 2025o3$2.00 / $8.00↓ 80%
Jun 10, 2025o3-pro$20.00 / $80.00launch
Apr 16, 2025o3$10.00 / $40.00launch
Apr 16, 2025o4-mini$1.10 / $4.40launch
Google
Gemini family
Versions →
Current flagship
$0.30 / $2.50
Gemini 3.5 Flash
Cheapest current input
$0.10 / M
Gemini 2.5 Flash-Lite
Recent price-change log
EffectiveModelIn / OutΔ
May 19, 2026Gemini 3.5 Flash$0.30 / $2.50launch
Feb 19, 2026Gemini 3.1 Pro$1.25 / $10.00launch
Nov 18, 2025Gemini 3 Pro$1.25 / $10.00launch
Jul 22, 2025Gemini 2.5 Flash-Lite$0.10 / $0.40launch
Jun 17, 2025Gemini 2.5 Flash$0.30 / $2.50launch
Mar 25, 2025Gemini 2.5 Pro$1.25 / $10.00launch
Dec 11, 2024Gemini 2.0 Flash$0.10 / $0.40launch
xAI
Grok family
Versions →
Current flagship
$1.25 / $2.50
Grok 4.3
Cheapest current input
$1.25 / M
Grok 4.3
Recent price-change log
EffectiveModelIn / OutΔ
Apr 30, 2026Grok 4.3$1.25 / $2.50launch
Mar 10, 2026Grok 4.20$2.00 / $6.00launch
Jul 9, 2025Grok 4$3.00 / $15.00launch
Feb 17, 2025Grok 3$3.00 / $15.00launch
Feb 17, 2025Grok 3 Mini$0.30 / $0.50launch
Dec 12, 2024Grok 2$2.00 / $10.00launch
Meta
Llama family
Versions →
Current flagship
$0.18 / $0.59
Llama 4 Scout (Together)
Cheapest current input
$0.18 / M
Llama 4 Scout (Together)
Recent price-change log
EffectiveModelIn / OutΔ
Apr 5, 2025Llama 4 Scout (Together)$0.18 / $0.59launch
Apr 5, 2025Llama 4 Maverick (Together)$0.27 / $0.85launch
Dec 6, 2024Llama 3.3 70B (Together)$0.88 / $0.88launch
Jul 23, 2024Llama 3.1 405B (Together)$3.50 / $3.50launch
Apr 18, 2024Llama 3 70B (Together)$0.90 / $0.90launch
Oct 1, 2023Llama 2 70B (Together)$0.90 / $0.90launch
DeepSeek
DeepSeek family
Versions →
Current flagship
$1.74 / $5.66
DeepSeek-V4-Pro
Cheapest current input
$0.14 / M
DeepSeek-V4-Flash
Recent price-change log
EffectiveModelIn / OutΔ
Apr 24, 2026DeepSeek-V4-Flash$0.14 / $0.28launch
Apr 24, 2026DeepSeek-V4-Pro$1.74 / $5.66launch
Feb 8, 2025DeepSeek-V3$0.27 / $1.10↑ 93%
Feb 8, 2025DeepSeek R1$0.55 / $2.19launch
Dec 26, 2024DeepSeek-V3$0.14 / $0.28launch
May 6, 2024DeepSeek-V2$0.14 / $0.28launch
Mistral
Mistral family
Versions →
Current flagship
$2.00 / $6.00
Mistral Large 3
Cheapest current input
$0.20 / M
Mistral Small 3.1
Recent price-change log
EffectiveModelIn / OutΔ
Mar 16, 2026Mistral Small 3.1$0.20 / $0.60launch
Dec 2, 2025Mistral Large 3$2.00 / $6.00launch
Dec 2, 2025Mistral Medium 3$1.00 / $3.00launch
Jan 30, 2025Mistral Small 3$0.20 / $0.60launch
Jul 24, 2024Mistral Large 2$3.00 / $9.00launch
Feb 26, 2024Mistral Large$8.00 / $24.00launch
Dec 11, 2023Mistral Medium (original)$2.50 / $7.50launch
Alibaba
Qwen family
Versions →
Current flagship
$0.78 / $3.90
Qwen3.7 Max
Cheapest current input
$0.26 / M
Qwen Plus
Recent price-change log
EffectiveModelIn / OutΔ
May 15, 2026Qwen3.7 Max$0.78 / $3.90launch
Feb 12, 2026Qwen3.6 Plus$0.33 / $1.95launch
Jun 6, 2024Qwen Plus$0.26 / $0.78launch
Jun 6, 2024Qwen Max$0.78 / $3.90launch

Notes and caveats

Headline tier is API list price. Every figure on this page is the public per-million-token rate the provider documents on its own pricing page. Negotiated enterprise pricing, committed-spend discounts, and special programs (OpenAI startup credits, Anthropic enterprise volume, Google Vertex AI region rates) are out of scope — this is a list-price reference.

Cached input pricing is a third axis that has grown more important since OpenAI introduced automatic prompt caching in October 2024 and Anthropic shipped explicit prompt caching in mid-2024. For workloads with a long stable prefix (agents replaying the same system prompt; RAG against a fixed corpus), the cached input rate dominates the effective per-call cost. The page surfaces it as a column rather than the headline number because non-cached workloads still see the headline rate.

Batch endpoints typically halve both input and output rates (OpenAI Batch, Anthropic Message Batches, Google Batch API) in exchange for asynchronous turnaround up to 24 hours. The row note flags batch availability per model; the headline number is the synchronous-API rate.

Open-weights pricing is a hosting-provider proxy. Meta’s Llama, DeepSeek, Mistral’s Apache-2.0-licensed releases, and Alibaba’s open-weights Qwen sizes have no “official” per-token price — the model is free to download. The row records the Together AI or Groq published rate as a reference. Cheaper rates exist on self-hosted inference (vLLM, sglang) or on cost-optimized providers (Fireworks, DeepInfra); more expensive ones exist on the hyperscalers (AWS Bedrock, Azure AI Foundry).

Reasoning tokens count as output. The o-series, Anthropic’s extended-thinking models, and Gemini’s 2.5 Flash thinking budget all bill reasoning tokens at the output rate. A “cheap-looking” reasoning model can be expensive in practice because each call generates more reasoning than visible output. The page records the per-token rate; the effective per-call cost is workload-dependent.

Tokenizer differences cross-provider. $1 / M tokens does not buy the same amount of text from OpenAI’s o200k tokenizer as it does from Anthropic’s claude-3 tokenizer or Google’s SentencePiece. Anthropic’s 4.7-era tokenizer in particular maps 1.0–1.35× the prior token count for the same input, which compresses the effective price improvement. Compare numbers in tokens, not in characters or pages, and treat them as a same-provider apples-to-apples rather than a strict cross-provider one.

Historical pricing is sourced from Wayback Machine and launch posts. Every effective-date entry links to a primary source — provider blog post, archived pricing page, or release announcement. Where the Wayback snapshot is the only viable cite (the provider has rewritten the page since), the URL points there directly.

About this page

Cross-family comparison page in the /ai/ section. Each row’s price is sourced from the provider’s own pricing page — OpenAI’s openai.com/api/pricing, Anthropic’s anthropic.com/pricing, Google’s ai.google.dev/gemini-api/docs/pricing, xAI’s docs.x.ai, DeepSeek’s api-docs.deepseek.com, Mistral’s mistral.ai/pricing, and Alibaba’s help.aliyun.com. Meta’s Llama models are open-weights; this page cites Together AI’s published rate as the reference hosting-provider price.

The model roster mirrors the per-family pages already on this site — Claude, ChatGPT, Gemini, Grok, Llama, DeepSeek, Mistral, Qwen — so each row links back to the matching version-page entry for the full per-release context. The cross-family context-windows, release-cadence, and benchmarks pages cover the other comparison axes.

Refreshed daily as part of the cross-family /ai/ sweep. Each refresh re-verifies every active row against the provider’s current pricing page; values that changed since the previous run get a new entry appended to the model’s price history and the row’s effective date is bumped. Stale rows for deprecated models are kept as historical record — the price history is the chart, not just the current rate.

Last updated: June 3, 2026. 72 models · 8 providers · 81 price points.

Last refreshed 2026-06-03 by Triton — cadence prose aligned to daily.