AI context windows
Every model’s context window, in one place
Largest current context window: 10,000,000 tokens — Meta Llama 4 Scout. Across 103 models from 8 providers.
As of May 9, 2026.
Maximum context window over time
One marker per public model release, color-coded by provider. The dashed line traces the running maximum across all providers — the “largest context window in production” at each point in time. The y-axis is log-scaled so that the roughly 5,000× range remains legible.
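The running-maximum trace described above is a simple fold over releases sorted by date. A minimal sketch, using a few illustrative rows rather than the page’s actual dataset:

```python
from datetime import date

# Illustrative (model, release date, context window) rows --
# not the page's real dataset, just enough to show the fold.
releases = [
    ("gpt-4", date(2023, 3, 14), 8_192),
    ("claude-2", date(2023, 7, 11), 100_000),
    ("gemini-1.5-pro", date(2024, 2, 15), 1_000_000),
    ("llama-4-scout", date(2025, 4, 5), 10_000_000),
]

# Sort by release date, then carry the largest window seen so far;
# each (date, best) pair is one point on the dashed line.
running_max = []
best = 0
for name, day, window in sorted(releases, key=lambda r: r[1]):
    best = max(best, window)
    running_max.append((day, best))

print(running_max[-1])  # the current record point
```

Plotted on a log-scaled y-axis, these points produce the step-shaped dashed line the chart shows.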
Every model
Sort by any column. Filter by provider or by minimum context window.
Showing all 103 models.
By provider
Each provider’s context-window history at a glance. The current maximum and the lifetime maximum may differ when a provider has rolled an early extended-context experiment back into a smaller production window.
Notes and caveats
Effective vs. nominal context. Several long-context models advertise large windows but degrade past a certain length on real-world tasks — this page records the documented nominal capacity, not benchmark-measured effective length. The latter is benchmark-dependent and out of scope per the section’s no-benchmarks rule.
Input vs. output limits. Most providers document a single “context window” that includes both prompt and response tokens. A few (notably the OpenAI o-series and the Anthropic Claude 4 generation) document a separate output-token cap. Where separately documented, the Output column shows it; otherwise the row treats the single documented window as the shared budget for prompt and response.
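The budgeting rule above can be sketched as a small helper. This is an illustrative function with made-up numbers, not any provider’s API:

```python
def fits(prompt_tokens, max_output_tokens, context_window, output_cap=None):
    """Check a request against a shared context window, with an optional
    separately documented output-token cap (illustrative helper)."""
    # Some providers cap output tokens separately from the window.
    if output_cap is not None and max_output_tokens > output_cap:
        return False
    # Most providers count prompt + response against one shared window.
    return prompt_tokens + max_output_tokens <= context_window

# Hypothetical 200k shared window with a 64k output cap:
print(fits(150_000, 50_000, 200_000, output_cap=64_000))  # True
print(fits(150_000, 70_000, 200_000, output_cap=64_000))  # False: over the output cap
```

When no separate cap is documented, only the shared-window check applies, which is exactly how the table’s rows treat a single documented number.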
Beta and tier-gated context. Some providers ship a default context size for the standard API and a larger one behind a beta flag, batch endpoint, or paid tier. The headline number on this page is the standard-API value documented as generally available; the per-row notes call out when a beta or tier-gated extended window exists.
Open-weights inference. For open-weights models (Llama, DeepSeek, Mistral, Qwen) the “context window” is the value the model card claims; serving infrastructure (vLLM, Together, Fireworks, Hugging Face Inference) often caps the deployed window lower for memory reasons. Always check the specific endpoint’s docs before relying on the full nominal window.
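As one concrete example of a serving-side cap, vLLM exposes a `--max-model-len` flag that bounds the deployed window regardless of what the model card claims. The command shape is vLLM’s; the model id and cap value here are illustrative:

```shell
# Serve an open-weights model with the context window capped well below
# the model card's nominal value, to fit available GPU memory.
vllm serve meta-llama/Llama-4-Scout-17B-16E-Instruct \
  --max-model-len 131072
```

Requests longer than the configured length are rejected by the endpoint even though the weights nominally support more, which is why the endpoint’s docs, not the model card, are authoritative.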
Tokenizer differences. One token is not a fixed unit across providers. OpenAI’s o200k tokenizer, Anthropic’s tokenizer, Google’s SentencePiece, and Meta’s tiktoken-derived tokenizers all produce different token counts for identical text. Compare context windows in tokens, not in characters or pages, and treat the numbers as strictly comparable only within a provider rather than across providers.
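The effect is easy to demonstrate with two toy tokenizers — crude word-level and character-level stand-ins, not any provider’s real scheme: the same text yields very different token counts, so the same nominal window buys different amounts of text.

```python
text = "Context windows are measured in tokens, not characters."

# Two toy tokenizers standing in for real, incompatible schemes.
word_tokens = text.split()   # crude word-level tokenization
char_tokens = list(text)     # crude character-level tokenization

print(len(word_tokens))  # 8
print(len(char_tokens))  # 55
```

Real provider tokenizers fall between these extremes, but the disagreement is the point: a “128k window” is a provider-specific unit, not a universal amount of text.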
About this page
Cross-family comparison page in the /ai/ section. Each row’s context-window value is sourced from the provider’s own model documentation — OpenAI’s platform.openai.com/docs/models, Anthropic’s docs.claude.com, Google’s ai.google.dev, xAI’s docs.x.ai, Meta’s llama.com and huggingface.co/meta-llama, DeepSeek’s api-docs.deepseek.com, Mistral’s docs.mistral.ai, and Alibaba’s help.aliyun.com/zh/dashscope.
The model roster mirrors the per-family pages already on this site — Claude, ChatGPT, Gemini, Grok, Llama, DeepSeek, Mistral, Qwen — so each row links back to the matching version-page entry for the full per-release context.
Refreshed monthly. Each refresh re-verifies every row against the provider’s current documentation; values that changed since the previous run are updated and the row’s “as of” date is bumped. See release cadence for the cross-family ship-cadence picture this page complements.
Last updated: May 9, 2026. 103 models · 8 providers.