AI model lineage
How the major AI models are actually related
Per-provider family trees for 86 models across 8 providers, with 80 typed edges — successions, post-trainings, fine-tunes, distillations, retrainings, and the GPT line / o-series merge into GPT-5.
As of May 9, 2026.
Current production flagships
One node per provider, color-coded. Click any flagship to jump into the per-provider family tree below.
How to read the trees
Each box is a model. Box fill indicates lifecycle status (Current emerald, Available sky, Legacy amber, Deprecated rose). Each box has a left-edge stripe in the provider color. Each arrow is a typed edge: succession, post-training, fine-tune, distilled from, new base (a retraining), or lines merged.
Hover any node or edge for its full label, ship date, and (for edges) the provider documentation the relationship is sourced to. The "Lineage as text" block under each tree contains the same information for screen readers and crawlers.
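For readers who want the shape rather than the picture, here is a minimal sketch of the data model the trees imply, in TypeScript. Every type and field name is an illustrative assumption, not the page's actual source.

```typescript
// Illustrative sketch of the lineage data model implied by the trees.
// All names here are hypothetical; the page's real source may differ.

type Status = "current" | "available" | "legacy" | "deprecated"; // box fill

type EdgeType =
  | "succession"     // conservative default: B followed A, nothing more claimed
  | "post-training"  // same base, updated post-training
  | "fine-tune"      // tuned on top of the source model
  | "distilled-from" // smaller, faster sibling distilled from the source
  | "new-base"       // a retraining: new pretraining run succeeding the source
  | "lines-merged";  // two tracks unified under one model id (e.g. GPT-5)

interface ModelNode {
  id: string;       // e.g. "claude-3-opus"
  provider: string; // drives the left-edge stripe color
  shipped: string;  // ISO ship date, e.g. "2024-03-04"
  status: Status;
}

interface LineageEdge {
  from: string;     // source model id
  to: string;       // destination model id
  type: EdgeType;
  source?: string;  // provider doc establishing the relationship, if disclosed
}
```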
Anthropic · Claude
Versions page → Lineage as text (14 edges) ↓
Every edge in the Claude tree above, in chronological order. Each line shows: destination model, edge type, source model, and the provider documentation that establishes the relationship (where one was disclosed).
- Claude 2 (Jul 11, 2023) — succession of Claude 1
- Claude Instant 1.2 (Aug 9, 2023) — distilled from Claude 1 — Anthropic positioned Claude Instant as a smaller, faster sibling of the Claude flagship; sourced to the Claude 2 announcement.
- Claude 2.1 (Nov 21, 2023) — post-training of Claude 2 — Anthropic documented Claude 2.1 as an iterative update on Claude 2 with the 200K context extension, not a new base.
- Claude 3 Opus (Mar 4, 2024) — new base of Claude 2.1 — Anthropic introduced the Claude 3 family as a new generation of base models, not a Claude 2 fine-tune.
- Claude 3 Sonnet (Mar 4, 2024) — succession of Claude 3 Opus — Anthropic shipped Opus / Sonnet / Haiku as the three tiers of the Claude 3 generation simultaneously.
- Claude 3 Haiku (Mar 13, 2024) — succession of Claude 3 Opus
- Claude 3.5 Sonnet (Jun 20, 2024) — succession of Claude 3 Sonnet
- 3.5 Sonnet (new) (Oct 22, 2024) — post-training of Claude 3.5 Sonnet — Anthropic documented the October 2024 'new 3.5 Sonnet' as an upgraded version under the same model id.
- Claude 3.5 Haiku (Nov 4, 2024) — distilled from 3.5 Sonnet (new) — Anthropic documented Claude 3.5 Haiku as a smaller sibling sharing the 3.5 generation training.
- Claude 3.7 Sonnet (Feb 24, 2025) — succession of 3.5 Sonnet (new)
- Claude Sonnet 4 (May 22, 2025) — new base of Claude 3.7 Sonnet — Anthropic introduced Claude 4 as a new generation, with Opus 4 and Sonnet 4 as new bases rather than fine-tunes of 3.7.
- Claude Opus 4 (May 22, 2025) — new base of Claude 3.7 Sonnet
- Claude Opus 4.7 (Apr 16, 2026) — succession of Claude Opus 4
- Claude Opus 4.7 (Apr 16, 2026) — succession of Claude Sonnet 4
OpenAI · ChatGPT
Versions page → Lineage as text (14 edges) ↓
Every edge in the ChatGPT tree above, in chronological order. Each line shows: destination model, edge type, source model, and the provider documentation that establishes the relationship (where one was disclosed).
- GPT-3.5 (Nov 30, 2022) — post-training of GPT-3 — OpenAI's GPT-3.5 (text-davinci series and ChatGPT launch) was an RLHF-fine-tuned descendant of the GPT-3 base, per OpenAI's InstructGPT paper and the ChatGPT launch post.
- GPT-4 (Mar 14, 2023) — new base of GPT-3.5 — OpenAI introduced GPT-4 as a new base model, not a GPT-3.5 fine-tune.
- GPT-4 Turbo (Nov 6, 2023) — post-training of GPT-4 — OpenAI documented GPT-4 Turbo at DevDay 2023 as the same generation with extended context and updated training data.
- GPT-4o (May 13, 2024) — new base of GPT-4 Turbo — OpenAI introduced GPT-4o as a new natively-multimodal base model.
- GPT-4o mini (Jul 18, 2024) — distilled from GPT-4o — OpenAI positioned GPT-4o mini as the smaller, distilled sibling of GPT-4o.
- o1-preview (Sep 12, 2024) — new base of GPT-4o — OpenAI introduced the o-series as a separate reasoning track; o1 was trained to use long chain-of-thought at inference time.
- o1 (Dec 5, 2024) — succession of o1-preview
- o3-mini (Jan 31, 2025) — succession of o1
- GPT-4.1 (Apr 14, 2025) — succession of GPT-4o
- o3 (Apr 16, 2025) — succession of o3-mini
- o4-mini (Apr 16, 2025) — distilled from o3 — OpenAI positioned o4-mini as the smaller sibling shipped alongside o3.
- GPT-5 (Aug 7, 2025) — lines merged from GPT-4.1 — OpenAI introduced GPT-5 as a unified router that combines the GPT chat line and the o-series reasoning track into one model id; see the encoding sketch after this list.
- GPT-5 (Aug 7, 2025) — lines merged from o3
- GPT-5.5 (Apr 23, 2026) — succession of GPT-5
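GPT-5 is one of two convergence points in these trees (Claude Opus 4.7 above is the other), and in the hypothetical data model sketched earlier it needs no special case: a merge is simply two edges sharing a destination. Model ids and the source string below are assumptions for illustration.

```typescript
// The GPT-5 "lines merged" node, encoded as two edges with one destination.
// Ids and the source string are illustrative, not the page's actual data.
const gpt5Merge: LineageEdge[] = [
  { from: "gpt-4.1", to: "gpt-5", type: "lines-merged",
    source: "OpenAI GPT-5 announcement (Aug 7, 2025)" },
  { from: "o3", to: "gpt-5", type: "lines-merged",
    source: "OpenAI GPT-5 announcement (Aug 7, 2025)" },
];
```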
Google · Gemini
Versions page → Lineage as text (14 edges) ↓
Every edge in the Gemini tree above, in chronological order. Each line shows: destination model, edge type, source model, and the provider documentation that establishes the relationship (where one was disclosed).
- PaLM 2 (May 10, 2023) — new base of Bard (LaMDA) — Google introduced PaLM 2 as a new base model powering an upgraded Bard.
- Gemini 1.0 Pro (Dec 6, 2023) — new base of PaLM 2 — Google's Gemini 1.0 was a new natively-multimodal architecture, not a PaLM 2 fine-tune.
- Gemini 1.0 Ultra (Feb 8, 2024) — succession of Gemini 1.0 Pro
- Gemini 1.5 Pro (Feb 15, 2024) — new base of Gemini 1.0 Ultra — Google introduced Gemini 1.5 with a new mixture-of-experts architecture and a 1M-token context window, not a 1.0 fine-tune.
- Gemini 1.5 Flash (May 14, 2024) — distilled from Gemini 1.5 Pro — Google described Gemini 1.5 Flash as a distilled smaller sibling of 1.5 Pro.
- Gemini 2.0 Flash (Dec 11, 2024) — succession of Gemini 1.5 Flash
- Gemini 2.0 Pro Exp (Feb 5, 2025) — succession of Gemini 1.5 Pro
- Gemini 2.5 Pro (Mar 25, 2025) — succession of Gemini 2.0 Pro Exp
- Gemini 2.5 Flash (Jun 17, 2025) — succession of Gemini 2.0 Flash
- 2.5 Flash-Lite (Jul 22, 2025) — distilled from Gemini 2.5 Flash — Google positioned Flash-Lite as the smallest sibling of the 2.5 Flash family.
- Gemini 3 Pro (Nov 18, 2025) — new base of Gemini 2.5 Pro — Google introduced Gemini 3 as a new generation flagship.
- Gemini 3 Flash (Dec 17, 2025) — succession of Gemini 2.5 Flash
- Gemini 3.1 Pro (Feb 19, 2026) — succession of Gemini 3 Pro
- 3.1 Flash-Lite (Mar 3, 2026) — distilled from Gemini 3 Flash
xAI · Grok
Versions page → Lineage as text (7 edges) ↓
Every edge in the Grok tree above, in chronological order. Each line shows: destination model, edge type, source model, and the provider documentation that establishes the relationship (where one was disclosed).
- Grok 1.5 (Mar 28, 2024) — post-training of Grok 1 — xAI documented Grok 1.5 as a successor that extended Grok 1's context and reasoning, broadly within the same generation.
- Grok 2 (Aug 13, 2024) — new base of Grok 1.5 — xAI introduced Grok 2 as a new base.
- Grok 3 (Feb 17, 2025) — new base of Grok 2 — xAI introduced Grok 3 trained on the Memphis Colossus cluster as a new base.
- Grok 4 (Jul 9, 2025) — new base of Grok 3
- Grok 4.1 (Nov 17, 2025) — post-training of Grok 4
- Grok 4.1 Fast (Nov 19, 2025) — distilled from Grok 4.1 — xAI positioned Grok 4.1 Fast as a smaller, faster sibling shipped two days after 4.1.
- Grok 4.20 (Mar 10, 2026) — succession of Grok 4.1
Meta · Llama
Versions page → Lineage as text (9 edges) ↓
Every edge in the Llama tree above, in chronological order. Each line shows: destination model, edge type, source model, and the provider documentation that establishes the relationship (where one was disclosed).
- Llama 2 (Jul 18, 2023) — new base of LLaMA 1 — Meta's Llama 2 paper documents it as a new pretraining run with a substantially larger dataset and the shift to an open license.
- Code Llama (Aug 24, 2023) — fine-tune of Llama 2 — Meta's Code Llama paper documents it as a fine-tune of Llama 2 on code data.
- Llama 3 (Apr 18, 2024) — new base of Llama 2 — Meta's Llama 3 release notes document it as a new pretraining run with an updated tokenizer.
- Llama 3.1 (Jul 23, 2024) — post-training of Llama 3 — Meta's Llama 3.1 release added the 405B size and the long-context post-training; documented as same generation.
- Llama 3.2 (Sep 25, 2024) — post-training of Llama 3.1 — Meta's Llama 3.2 release added vision and edge-sized variants.
- Llama 3.3 70B (Dec 6, 2024) — post-training of Llama 3.2
- Llama 4 Scout (Apr 5, 2025) — new base of Llama 3.3 70B — Meta's Llama 4 release introduced a new mixture-of-experts architecture as a new base.
- Llama 4 Maverick (Apr 5, 2025) — new base of Llama 3.3 70B
- Muse Spark (Apr 8, 2026) — new base of Llama 4 Maverick — Meta described Muse Spark as the closed-weights successor to the Llama line, trained at Meta Superintelligence Labs.
DeepSeek · DeepSeek
Versions page → Lineage as text (8 edges) ↓
Every edge in the DeepSeek tree above, in chronological order. Each line shows: destination model, edge type, source model, and the provider documentation that establishes the relationship (where one was disclosed).
- DeepSeek-Coder (Nov 2, 2023) — fine-tune of DeepSeek-LLM — DeepSeek-Coder shipped alongside the base LLM as a code-specialized fine-tune.
- DeepSeek-V2 (May 6, 2024) — new base of DeepSeek-LLM — the DeepSeek-V2 paper introduced the DeepSeekMoE architecture and Multi-head Latent Attention as a new base.
- DeepSeek-V3 (Dec 26, 2024) — new base of DeepSeek-V2 — DeepSeek-V3 paper introduced the 671B-parameter MoE as a new base.
- DeepSeek-R1 (Jan 20, 2025) — post-training of DeepSeek-V3 — the DeepSeek-R1 paper documents R1 as RL post-training over the V3 base, with chain-of-thought emerging from RL alone.
- DeepSeek-V3.1 (Aug 21, 2025) — post-training of DeepSeek-V3
- DeepSeek-V3.2 (Dec 1, 2025) — post-training of DeepSeek-V3.1
- DeepSeek-V4-Pro (Apr 24, 2026) — new base of DeepSeek-V3.2 — DeepSeek-V4 introduced a new MoE base with Thinking and Non-Thinking modes.
- DeepSeek-V4-Flash (Apr 24, 2026) — distilled from DeepSeek-V4-Pro — DeepSeek positioned V4-Flash as the smaller sibling shipped alongside V4-Pro.
Mistral · Mistral
Versions page → Lineage as text (6 edges) ↓
Every edge in the Mistral tree above, in chronological order. Each line shows: destination model, edge type, source model, and the provider documentation that establishes the relationship (where one was disclosed).
- Mixtral 8x7B (Dec 11, 2023) — new base of Mistral 7B — Mixtral 8x7B introduced the sparse mixture-of-experts architecture as a new base, not a Mistral 7B fine-tune.
- Mistral Large (Feb 26, 2024) — new base of Mixtral 8x7B — Mistral Large was released as a proprietary flagship under the Mistral Research / commercial license, not a Mixtral fine-tune.
- Mistral Large 2 (Jul 24, 2024) — new base of Mistral Large
- Mistral Small 3 (Jan 30, 2025) — new base of Mistral Large 2 — Mistral Small 3 shipped as a new Apache 2.0 base, opening the generation that the December 2025 'Mistral 3' family relaunch completed.
- Mistral Large 3 (Dec 2, 2025) — succession of Mistral Small 3
- Mistral Small 4 (Mar 16, 2026) — succession of Mistral Large 3
Alibaba · Qwen
Versions page → Lineage as text (8 edges) ↓
Every edge in the Qwen tree above, in chronological order. Each line shows: destination model, edge type, source model, and the provider documentation that establishes the relationship (where one was disclosed).
- Qwen-14B (Sep 25, 2023) — succession of Qwen-7B
- Qwen2 family (Jun 7, 2024) — new base of Qwen-14B — Qwen2 marked the June 2024 architecture refresh and the shift to Apache 2.0 licensing for most variants.
- Qwen2.5 family (Sep 19, 2024) — post-training of Qwen2 family
- Qwen3 family (Apr 28, 2025) — new base of Qwen2.5 family — Qwen3 introduced the hybrid Thinking / Non-Thinking architecture, absorbing the standalone QwQ reasoning track.
- Qwen3-Max (Sep 15, 2025) — succession of Qwen3 family — Qwen3-Max is the largest Qwen3-generation flagship.
- Qwen3.5 + Plus (Feb 16, 2026) — succession of Qwen3-Max
- Qwen3.6-Max (Apr 2, 2026) — succession of Qwen3.5 + Plus — Qwen3.6-Max-Preview is the proprietary preview shipped alongside the open-weights 3.6 family.
- Qwen3.6-27B (Apr 22, 2026) — new base of Qwen3.5 + Plus — Qwen3.6-27B introduced the Gated DeltaNet + self-attention hybrid as a new base.
About this page
Cross-family comparison page in the /ai/ section. Each per-provider tree was laid out by hand from the per-family Claude, ChatGPT, Gemini, Grok, Llama, DeepSeek, Mistral, and Qwen Versions pages on this site, with each row's lineage edge sourced to the provider's own model card, technical paper, or release announcement.
Conservative claims. Where a provider has not formally documented base-model continuity between two releases, the edge is labeled "succession" and no further claim is made. The page does not infer "X is a fine-tune of Y" from external speculation, leaderboard chatter, or press coverage. The "succession" edge is honest about what is and isn't disclosed; the per-edge tooltip and the "Lineage as text" block name the source for every more-specific edge type.
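Expressed as code, the conservative rule is a guard, not an inference: a more specific edge type survives only when a disclosed source travels with it. A minimal sketch under the hypothetical types from the top of the page:

```typescript
// Conservative-claims rule as a guard: a specific edge type without a
// disclosed provider source degrades to plain "succession".
function conservativeEdge(edge: LineageEdge): LineageEdge {
  if (edge.type !== "succession" && !edge.source) {
    return { ...edge, type: "succession" };
  }
  return edge;
}
```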
Per-provider focus, not encyclopedic. Each tree highlights the lineage-meaningful releases — the new bases, the major post-trainings, the visible fine-tunes, the line merges. Per-release minutiae (small variants, intermediate checkpoints, every refresh on every tier) live on the per-family Versions pages where they belong. The /ai/release-cadence/ and /ai/context-windows/ pages are the right place for per-release counts and metrics.
What is intentionally excluded. Open-source community fine-tunes (the Llama-derivatives ecosystem — Vicuna, WizardLM, Nous Hermes, etc.) are not on the page; they are downstream community work, not frontier-lab releases. Capability comparisons ("which line is best") are out of scope. Speculation about undisclosed base-model continuity is out of scope. Unreleased models (announced but never publicly shipped, like Llama 4 Behemoth) are not included.
Refreshed quarterly, aligned with the per-family Versions pages. Each refresh re-verifies every disclosed lineage source (model cards, papers, and announcements often keep their URLs while the content behind them changes), adds nodes for new flagship releases, and prunes any node that the provider has formally deprecated. See release cadence and context windows for the cross-family ship-cadence and context-budget pictures this page complements.
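The pruning half of that refresh pass, sketched under the same hypothetical data model: formally deprecated nodes are dropped along with any edge touching them, and everything that survives goes back through source re-verification.

```typescript
// Quarterly-refresh pruning, sketched: drop deprecated nodes and every
// edge that touches them; surviving edges are then re-verified by hand.
function pruneDeprecated(nodes: ModelNode[], edges: LineageEdge[]) {
  const kept = nodes.filter((n) => n.status !== "deprecated");
  const keptIds = new Set(kept.map((n) => n.id));
  return {
    nodes: kept,
    edges: edges.filter((e) => keptIds.has(e.from) && keptIds.has(e.to)),
  };
}
```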
Last updated: May 9, 2026. 86 models · 8 providers · 80 typed edges.