2023 – 2026

Qwen Versions

The latest Qwen flagship is Qwen3.7-Plus (May 31, 2026), the multimodal agent flagship with vision + language understanding, deep reasoning, tool invocation, and autonomous iteration. Its text-only sibling, Qwen3.7-Max (announced May 20, 2026), is a closed-weights reasoning-agent model with a 1,000,000-token context window, native extended thinking, and vendor-stated benchmarks ahead of DeepSeek V4-Pro and Claude Opus 4.6 on agentic coding. The latest open-weights Qwen flagship is Qwen3.6-27B (April 22, 2026; Apache 2.0, hybrid Gated DeltaNet + self-attention, 262K context extensible to 1M). I track every Qwen / Tongyi Qianwen release here, from Qwen-7B in August 2023 onward, with HuggingFace ids, ship dates, family (Flagship / Reasoning / Specialized), and license terms (Apache 2.0 / Tongyi-Qianwen / Proprietary). Below the table: the April 2023 Tongyi launch as Alibaba's ChatGPT response, the licensing turn at Qwen2, the QwQ reasoning track, the Qwen3 hybrid-reasoning era, the U.S. chip export-control context, and the HuggingFace-leaderboard dominance through 2025–2026.

Family & status

Family

Flagship — the main Qwen-N chat lineage including the proprietary Qwen-Max tier
Reasoning — the QwQ line; converged into the Flagship line with Qwen3's hybrid Thinking / Non-Thinking architecture
Specialized — Qwen-Coder, Qwen-VL, Qwen-Audio, Qwen-Math, Qwen-Omni, Qwen-Image

Status

Current — actively recommended; the latest in its product slot
Available — weights still served via HuggingFace, or proprietary API still served on DashScope, but superseded
Legacy — deprecated, retired from DashScope, or no longer recommended
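
The family and status labels above, together with the per-row fields used in the table, can be captured in a small record type. This is a minimal sketch only: the class and field names (`Family`, `Status`, `QwenRelease`, `is_open_weights`) are hypothetical and not part of any Qwen tooling, and the open-weights rule is an assumption stated in the code.

```python
from dataclasses import dataclass
from enum import Enum


class Family(Enum):
    FLAGSHIP = "Flagship"
    REASONING = "Reasoning"
    SPECIALIZED = "Specialized"


class Status(Enum):
    CURRENT = "Current"      # latest in its product slot
    AVAILABLE = "Available"  # still served, but superseded
    LEGACY = "Legacy"        # deprecated, retired, or no longer recommended


@dataclass(frozen=True)
class QwenRelease:
    """One row of the version table (hypothetical schema)."""
    name: str
    model_id: str
    family: Family
    status: Status
    released: str  # free text, because some rows give only a month
    license: str

    def is_open_weights(self) -> bool:
        # Assumption: anything not marked Proprietary has downloadable weights.
        return self.license != "Proprietary"


row = QwenRelease(
    name="Qwen3.6-27B",
    model_id="Qwen/Qwen3.6-27B",
    family=Family.FLAGSHIP,
    status=Status.CURRENT,
    released="Apr 22, 2026",
    license="Apache 2.0",
)
print(row.status.value, row.is_open_weights())
```

Keeping `released` as text rather than a date avoids inventing a day for rows that only state a month.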

Qwen version table

Model
Qwen3.7-Plus
qwen3.7-plus — closed-weights, Bailian / Model Studio (DashScope)
Flagship
Current
May 31, 2026
Multimodal agent flagship: vision + language (image and video understanding), deep reasoning, tool invocation, and autonomous iteration. The multimodal sibling to the text-only Qwen3.7-Max.
  • Announced on qwen.ai/research on May 31, 2026; coverage in MarkTechPost (June 2, 2026). Available via Alibaba Cloud Bailian / Model Studio (DashScope model string: qwen3.7-plus). Listed as the recommended multimodal model on the Alibaba Cloud Bailian platform.
  • Multimodal vision + language — understands images and video alongside text; visual understanding, not generation. The multimodal complement to the text-only Qwen3.7-Max; together they form the Qwen3.7 generation announced at the May 20, 2026 Alibaba Cloud Summit.
  • Five agentic capabilities: deep reasoning (step-by-step problem solving), self-programming (writes and revises its own code), tool invocation (calls external functions or APIs), verification and testing (runs outputs and checks results), and autonomous iteration (loops until the task is complete). Alibaba positions the model as a step in multimodal hybrid agent technology.
  • GUI agent integration on the Bailian platform: can operate graphical interfaces via screenshot understanding, handle browser-based tasks, and execute shell commands — the orchestration logic is baked into the model rather than the agent framework.
  • Vision Arena (LM Arena): the Qwen3.7-Plus-Preview ranked #16 overall, placing Alibaba as the #5 lab in vision among all competing AI labs at release time.
  • Closed-weights, proprietary. Available exclusively through Alibaba Cloud Bailian / Model Studio (DashScope) — no HuggingFace open-weights release announced as of June 2026. Alibaba's Bailian platform pairs the model with an Agentic RL mechanism that uses real-world execution feedback to refine accuracy, alongside built-in safety guardrails for autonomous tool use.
Model
Qwen3.7-Max
qwen3.7-max — closed-weights, DashScope + OpenRouter / Together / Qubrid
Flagship
Current
May 20, 2026
Reasoning-agent flagship. 1M-token context, native extended-thinking. Vendor-stated lead over Claude Opus 4.6 on Terminal-Bench 2.0 / SWE-Bench Pro / MCP-Atlas. $2.50 / $7.50 per million input / output tokens.
  • Announced at the Alibaba Cloud Summit in Hangzhou on May 20, 2026; commercial API live on Alibaba Cloud Model Studio one day earlier (May 19). Cross-listings on OpenRouter, Together AI, and Qubrid AI by day zero. Confirmed as the top of the chat-model tier on the Alibaba Cloud Model Studio Recommended models page (updated May 21, 2026).
  • 1,000,000-token context window and native extended-thinking mode — the first Qwen Max-tier model with both as defaults.
  • Vendor-stated benchmarks at launch: SWE-Bench Pro 60.6, Terminal-Bench 2.0 69.7, GPQA Diamond 92.4, Artificial Analysis Intelligence Index 56.6 — positioned ahead of DeepSeek V4-Pro and Claude Opus 4.6 on agentic-coding evaluations.
  • Closed-weights, proprietary — like the rest of the Max line, no HuggingFace release. The page's licensing card now lists three tiers: Apache 2.0 (the open Qwen3.6 family), Tongyi-Qianwen License (the older specialized weights), Proprietary (the Max line).
  • Pricing $2.50 / $7.50 per million input / output tokens — roughly half of Claude Opus 4.7's rate card. Vendor demos emphasized 35-hour autonomous-agent runs without performance degradation.
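
To make the rate card concrete, the cost of a request follows directly from the two per-million-token prices quoted above. A minimal sketch: the function name and the example token counts are illustrative, and the calculation ignores any tiered, cached-input, or long-context pricing the vendor may apply.

```python
INPUT_USD_PER_MTOK = 2.50   # Qwen3.7-Max input price quoted above
OUTPUT_USD_PER_MTOK = 7.50  # Qwen3.7-Max output price quoted above


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Flat-rate cost estimate in US dollars for one request."""
    return (
        input_tokens * INPUT_USD_PER_MTOK
        + output_tokens * OUTPUT_USD_PER_MTOK
    ) / 1_000_000


# Hypothetical request: 200K tokens of context in, 4K tokens out.
print(f"${estimate_cost_usd(200_000, 4_000):.2f}")  # $0.53
```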
Model
Qwen3.6-27B
Qwen/Qwen3.6-27B
Flagship
Current
Apr 22, 2026
27B dense, Apache 2.0. Hybrid Gated DeltaNet + self-attention. Thinking Preservation. 262K context (extensible to 1M). Per Alibaba, beats the 397B Qwen3.5 MoE on agentic coding.
  • Released April 22, 2026; the announcement is at qwen.ai/blog; HuggingFace card: Qwen/Qwen3.6-27B. Coverage in MarkTechPost and Simon Willison.
  • 27B fully dense — all parameters active on every inference pass, simplifying deployment vs. the MoE pattern most peer flagships use. The first fully dense flagship in the Qwen3.6 family.
  • Hybrid Gated DeltaNet + self-attention architecture; introduces Thinking Preservation, a mechanism that retains reasoning traces across conversation history to reduce redundant token generation in multi-turn agent workflows.
  • Default 262,144-token context, extensible to 1,010,000 tokens. Supports both multimodal thinking and non-thinking modes.
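  • License: Apache 2.0. Per Alibaba's release post, scores 77.2 on SWE-bench Verified and 53.5 on SWE-bench Pro, beating the much larger Qwen3.5-397B-A17B on agentic-coding benchmarks with roughly 14× fewer total parameters (27B dense vs. 397B MoE; the MoE activates only 17B per token, so the dense model actually runs more active parameters).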
  • Designed to run on a single consumer GPU; community testing reportedly hit ~80 tokens/second on a single RTX 5090 with a 218K-token context window.
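
The single-GPU claim is easier to judge with a rough weight-memory estimate. The sketch below is back-of-the-envelope arithmetic only: it assumes a nominal 27 billion parameters, counts weights alone (no KV cache, activations, or runtime overhead), and uses generic precision sizes rather than anything stated in Alibaba's release.

```python
GIB = 1024 ** 3


def weight_memory_gib(n_params: float, bits_per_param: float) -> float:
    """Memory needed for the weights alone, in GiB."""
    return n_params * bits_per_param / 8 / GIB


N_PARAMS = 27e9  # Qwen3.6-27B, dense (nominal count)

for label, bits in [("bf16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{label:>5}: {weight_memory_gib(N_PARAMS, bits):5.1f} GiB")
```

Under these assumptions the bf16 weights come to roughly 50 GiB, so a 32 GB card such as the RTX 5090 would need a quantized build (about 12.6 GiB at 4 bits), with the remaining memory left for the KV cache.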
Model
Qwen3.6-35B-A3B
Qwen/Qwen3.6-35B-A3B
Flagship
Current
Apr 16, 2026
First open-weights Qwen3.6 release: 35B total / 3B active MoE. Apache 2.0. The MoE companion to the dense Qwen3.6-27B shipped six days later.
  • Released April 16, 2026 on HuggingFace and ModelScope; HuggingFace card at Qwen/Qwen3.6-35B-A3B.
  • 35B-total / 3B-active MoE — the small-active-parameter sibling that pairs with the fully dense Qwen3.6-27B in the open-weights Qwen3.6 wave.
  • License: Apache 2.0. The first publicly-released Qwen3.6-family open-weights model, preceding the dense 27B by six days and the broader open-weights wave.
  • Designed for cost-efficient agentic-coding workloads where the active-parameter budget matters more than total scale; GGUF community-quantized builds followed shortly on Ollama and llama.cpp.
Model
Qwen 3.6-Max-Preview + Qwen 3.6-Plus + Qwen 3.6-Flash
qwen3.6-max-preview, qwen3.6-plus, qwen3.6-flash — closed-weights, DashScope
Flagship
Available
Apr 2, 2026
Proprietary closed-weights tier. OpenAI- and Anthropic-compatible API. Three SKUs from most capable to most cost-effective: Max-Preview, Plus, Flash. All three were still listed on the Model Studio recommended-models page at the May 21, 2026 refresh, though the Qwen3.7 generation has since taken over the top of the tier.
  • Released April 2, 2026; coverage in Caixin Global and Alibaba Cloud Community.
  • Proprietary, closed-weights, DashScope-only. Per Alibaba's framing at launch, the most powerful Qwen model shipped up to that point; tops six major coding benchmarks and posts gains on world-knowledge and instruction-following over Qwen 3.5-Plus.
  • Dual-API compatibility — the API is compatible with both OpenAI and Anthropic specifications, so existing pipelines can be re-pointed with minimal changes.
  • Available via Alibaba Cloud Model Studio. Pairs with the open-weights Qwen3.6-27B (above) as parallel tracks of the same flagship product slot — the open-weights track for self-host, the proprietary track for hosted commercial use.
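
Because the API follows the OpenAI chat-completions shape, re-pointing an existing pipeline is mostly a matter of changing the base URL and the model string. The sketch below builds (but does not send) such a request using only the standard library. The base URL and the `DASHSCOPE_API_KEY` environment variable name are assumptions to check against the Model Studio documentation; only the model string `qwen3.6-plus` comes from the table above.

```python
import json
import os
import urllib.request

# Assumption: OpenAI-compatible endpoint; confirm in the Model Studio docs.
BASE_URL = "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"


def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat-completions request without sending it."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth, key read from the environment.
            "Authorization": f"Bearer {os.environ.get('DASHSCOPE_API_KEY', '')}",
        },
        method="POST",
    )


req = build_chat_request("qwen3.6-plus", "Summarize the Qwen3.6 lineup.")
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req), with a valid key set.
```

Building the request separately from sending it keeps the example runnable offline and makes the payload easy to inspect before pointing it at a live endpoint.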
Model
Qwen3.5 family + Plus
Qwen/Qwen3.5-{27B, 35B-A3B, 122B-A10B, 397B-A17B}, qwen3.5-plus (DashScope)
Flagship
Available
Feb 16, 2026
Multimodal MoE family. Open weights up to 397B. Native text + image + video. 201 languages. Hybrid Gated DeltaNet + sparse MoE.
  • Qwen3.5 (397B) and Qwen3.5-Plus released February 16, 2026; the smaller open-weights sizes (27B, 35B-A3B, 122B-A10B) followed February 24, 2026. Coverage in CNBC and SiliconANGLE.
  • First Qwen with native multimodality across text + image + video in a single architecture, with early-fusion training on trillions of multimodal tokens; outperformed Qwen3-VL on broad reasoning / coding / agents benchmarks per Alibaba's release post.
  • Hybrid Gated DeltaNet + sparse Mixture-of-Experts architecture — the open-weights flagship at 397B total parameters; the smaller MoE variants (35B-A3B, 122B-A10B) and the dense 27B target progressively smaller deployment footprints.
  • 201 languages and dialects — expanded from Qwen3's 119 languages, the broadest language coverage of any frontier-AI line.
  • Open weights under Apache 2.0; Qwen3.5-Plus is the proprietary hosted variant on DashScope / Model Studio.
Model
Qwen3-Coder-Next
Qwen/Qwen3-Coder-Next
Specialized
Available
Feb 4, 2026
Open-weights coding agent. 80B total / 3B active (Qwen3-Next architecture). Apache 2.0. 70%+ SWE-Bench Verified. 256K context, 10× throughput vs. dense 32B.
  • Released February 4, 2026; the announcement is at qwen.ai/blog; HuggingFace card: Qwen/Qwen3-Coder-Next; coverage in MarkTechPost and VentureBeat.
  • Built on Qwen3-Next-80B-A3B-Base — 80B total parameters with only 3B activated per inference step (ultra-sparse MoE + hybrid Gated DeltaNet + Gated Attention architecture). Specifically tuned for coding-agent workloads: long-horizon reasoning, complex tool use, and recovery from execution failures.
  • 256K native context, extensible to 1M tokens. Non-thinking-only mode (no think blocks in output) — optimized for deterministic agent pipelines rather than exploratory chain-of-thought.
  • Per Alibaba's release post: 70%+ on SWE-Bench Verified (SWE-Agent scaffold) and 44.3% on SWE-Bench Pro — competitive with the much larger Qwen3-Coder-480B on agentic-coding benchmarks at a dramatically lower active-parameter footprint.
  • License: Apache 2.0. More than 10× higher throughput than Qwen3-32B on context lengths exceeding 32K tokens, enabling cost-efficient deployment for repository-scale tasks.
Model
Qwen-Image-2512
Qwen/Qwen-Image-2512
Specialized
Legacy
Dec 2025
Open-weights text-to-image. Apache 2.0. Positioned as an open alternative to Google's Imagen for native image generation.
  • Released December 2025; coverage in Open Source For U.
  • Open-weights text-to-image model, Apache 2.0, distributed via the HuggingFace Qwen org.
  • Positioned by Alibaba as an open alternative to Google's Imagen 4 for native image generation; complements the multimodal capabilities of Qwen3-VL / Qwen3.5 / Qwen2.5-Omni in the same product family.
Model
Qwen3-Max
qwen3-max — closed-weights, DashScope only
Flagship
Legacy
Sep 2025
Trillion-parameter MoE flagship. Proprietary. 36T training tokens. Reportedly top-3 on LMArena at launch, ahead of GPT-5-Chat.
  • Released September 2025 as the company's first trillion-parameter Qwen flagship; coverage in eWeek.
  • ~1 trillion total parameters in a Mixture-of-Experts architecture; trained on 36 trillion tokens, twice Qwen2.5's pretraining corpus. Per Alibaba, the training run completed without loss spikes — an unusually stable trillion-parameter run.
  • Proprietary, closed-weights, DashScope-only. Per Alibaba's release post, Qwen3-Max-Instruct ranked consistently in the global top three on the LMArena text leaderboard at launch, surpassing GPT-5-Chat.
  • Superseded by the Qwen 3.6-Max-Preview proprietary preview in April 2026; status is Legacy on the “no longer the recommended Max-line model” reading, though the API remained served on DashScope.
Model
Qwen3-VL family
Qwen/Qwen3-VL-{2B, 4B, 8B, 32B, 30B-A3B, 235B-A22B}-Instruct
Specialized
Available
Sep 23, 2025
Vision-language family built on Qwen3. Dense (2B–32B) and MoE (30B-A3B / 235B-A22B) sizes. Instruct + Thinking editions. Apache 2.0. 256K context, image + video.
  • Qwen3-VL-235B-A22B released September 23, 2025; smaller sizes (30B-A3B: October 4; 4B/8B: October 15; 2B/32B: October 21) followed through October 2025. Technical report at arXiv 2511.21631; GitHub: QwenLM/Qwen3-VL; coverage by Simon Willison.
  • Six sizes — four dense (2B, 4B, 8B, 32B) and two MoE (30B-A3B and the flagship 235B-A22B) — each available in Instruct and Thinking editions, mirroring the Qwen3 flagship's hybrid-mode recipe at the vision-language layer.
  • 256K native interleaved context supporting text, images, and video in a single pass; dynamic-resolution visual encoder handles variable-aspect-ratio images and longer video clips.
  • Supersedes the Qwen2.5-VL family (January 2025) as the recommended open-weights vision-language line; the Qwen3.5 flagship's native multimodality later subsumed this specialized track for most use cases.
  • License: Apache 2.0 across all sizes.
Model
Qwen3-Omni
Qwen/Qwen3-Omni-30B-A3B-{Instruct, Thinking}
Specialized
Available
Sep 22, 2025
End-to-end omni-modal: text + image + audio + video in, text + real-time speech out. 30B-A3B MoE, Thinker–Talker architecture. Apache 2.0. SOTA on 22 of 36 audio/video benchmarks.
  • Released September 22, 2025; GitHub: QwenLM/Qwen3-Omni; HuggingFace collection: Qwen3-Omni; technical report: arXiv 2509.17765; API docs: DashScope (Qwen-Omni). Reached #1 on HuggingFace Trending on September 26, 2025.
  • Natively end-to-end omni-modal — processes text, images, audio, and video in a single architecture and delivers real-time streaming responses in both text and natural speech. The successor to Qwen2.5-Omni-7B (March 2025), with a fully redesigned architecture and significantly expanded scale.
  • Novel Thinker–Talker architecture: the Thinker component handles reasoning (chain-of-thought in both Instruct and Thinking modes); the Talker handles real-time speech synthesis with a multi-codebook design that drives latency to a minimum. The MoE base (30B total / 3B active) provides inference efficiency. A separate Thinking-only edition (Qwen/Qwen3-Omni-30B-A3B-Thinking) ships without the Talker for text-only reasoning pipelines.
  • Per the technical report: SOTA on 22 of 36 audio/video benchmarks; open-source SOTA on 32 of 36. Audio speech recognition and voice-conversation performance is comparable to Gemini 2.5 Pro per vendor-stated benchmarks. Supports 119 text languages, 19 speech-input languages, and 10 speech-output languages.
  • License: Apache 2.0 (confirmed on HuggingFace model card: license_name: apache-2.0). Available on HuggingFace, ModelScope, Qwen Chat (chat.qwen.ai), and via the DashScope API. A downstream fine-tune, Qwen3-Omni-30B-A3B-Captioner, ships as a detailed audio-captioning model built on the Instruct base.
  • Superseded as the recommended open-weights end-to-end omni model by the native multimodality introduced in Qwen3.5 (February 2026).
Model
Qwen3-Next-80B-A3B
Qwen/Qwen3-Next-80B-A3B-Instruct, Qwen/Qwen3-Next-80B-A3B-Thinking
Flagship
Available
Sep 11, 2025
Ultra-sparse MoE. 80B total / 3B active (3.7%). Hybrid Gated DeltaNet + Gated Attention. Apache 2.0. Matches Qwen3-235B-A22B at >10× throughput on long context. 256K native.
  • Released September 11, 2025; announcement at Alibaba Cloud Community; HuggingFace cards: Qwen/Qwen3-Next-80B-A3B-Instruct and Thinking.
  • Novel ultra-sparse MoE architecture: expert routing activates only 3B of 80B total parameters per inference step (a 3.7% activation ratio), and the layers use hybrid Gated DeltaNet + Gated Attention, a new attention design combining state-space-model-style recurrence with standard self-attention for ultra-long-context efficiency. The same architecture later underpins Qwen3-Coder-Next (February 2026), which is built on Qwen3-Next-80B-A3B-Base.
  • 256K native context, extensible to 1M tokens. Per Alibaba, surpasses Qwen3-235B-A22B on long-context benchmarks at more than 10× higher throughput for context lengths exceeding 32K tokens.
  • Surpasses the dense Qwen3-32B model on standard benchmarks while using less than 10% of its training compute (GPU-hours), demonstrating that ultra-sparse MoE can match large-scale dense models at a fraction of the cost.
  • License: Apache 2.0. Available in Instruct and Thinking editions on HuggingFace, ModelScope, Kaggle, and AWS Bedrock.
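
The `80B-A3B` style suffix used throughout the table encodes nominal total and active parameter counts, so the activation ratio can be read straight off a model name. A small sketch of that arithmetic: the helper name and regular expression are illustrative, and they assume the suffix always takes the form `<total>B-A<active>B` (names such as Qwen1.5-MoE-A2.7B, which omit the total, are rejected).

```python
import re

# Matches suffixes such as "80B-A3B" or "235B-A22B".
_MOE_SUFFIX = re.compile(r"(\d+(?:\.\d+)?)B-A(\d+(?:\.\d+)?)B")


def activation_ratio(model_name: str) -> float:
    """Fraction of parameters active per token, parsed from the name."""
    match = _MOE_SUFFIX.search(model_name)
    if match is None:
        raise ValueError(f"no MoE suffix in {model_name!r}")
    total, active = (float(g) for g in match.groups())
    return active / total


for name in ["Qwen3-Next-80B-A3B", "Qwen3-235B-A22B", "Qwen3.6-35B-A3B"]:
    print(f"{name}: {activation_ratio(name):.2%}")
```

For Qwen3-Next-80B-A3B the nominal figures give 3/80 = 3.75%, which the summary above quotes as 3.7%.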
Model
Qwen3-Coder
Qwen/Qwen3-Coder-{480B-A35B, 30B-A3B}-Instruct
Specialized
Available
Jul 22, 2025
Open-weights agentic-coding flagship. 480B-A35B MoE. Apache 2.0. Reportedly competitive with Claude Sonnet 4 / GPT-4 on coding tasks.
  • Released July 22, 2025; the announcement is at qwenlm.github.io/blog/qwen3-coder; HuggingFace collection: Qwen3-Coder; coverage in MarkTechPost.
  • Qwen3-Coder-480B-A35B-Instruct — 480B total / 35B active per token MoE, the company's most powerful open-weights coding model. 256K context natively, extrapolated to 1M.
  • Qwen3-Coder-30B-A3B-Instruct — the smaller MoE variant for laptop-scale agentic-coding deployment.
  • License: Apache 2.0. Alibaba characterized it as its “most agentic code model to date,” with performance reportedly competitive with Claude Sonnet 4 and GPT-4 on coding tasks.
Model
Qwen3 family
Qwen/Qwen3-{0.6B, 1.7B, 4B, 8B, 14B, 32B, 30B-A3B, 235B-A22B}
Flagship
Legacy
Apr 28, 2025
Hybrid Thinking / Non-Thinking modes in a single architecture. 6 dense + 2 MoE sizes. 36T tokens, 119 languages. All Apache 2.0.
  • Released April 28, 2025; the announcement is at qwenlm.github.io/blog/qwen3; coverage in TechCrunch; GitHub: QwenLM/Qwen3.
  • First Qwen with hybrid Thinking / Non-Thinking modes in a single architecture: the model dynamically switches between fast direct responses and chain-of-thought reasoning based on the prompt. Absorbed the QwQ standalone-reasoning track into the Flagship line.
  • Six dense sizes (0.6B, 1.7B, 4B, 8B, 14B, 32B) and two MoE sizes (30B-A3B and 235B-A22B) released together. The flagship Qwen3-235B-A22B activates 22B per token from 235B total.
  • Trained on 36 trillion tokens — double Qwen2.5's pretraining corpus — with leading performance on translation and multilingual instruction-following across 119 languages and dialects.
  • License: Apache 2.0 across every size, the broadest fully-permissive Qwen launch to date.
  • Superseded as the recommended open-weights flagship by Qwen3.5 (February 2026) and Qwen3.6-27B (April 2026); status is Legacy.
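
The id column uses a brace shorthand such as `Qwen/Qwen3-{0.6B, 1.7B, ...}` to list several repositories at once. The sketch below expands that shorthand into individual ids. The function name is hypothetical, it handles a single brace group only, and it says nothing about whether each expanded repository actually exists on HuggingFace.

```python
import re


def expand_ids(pattern: str) -> list[str]:
    """Expand one '{a, b, c}' group in an id pattern into separate ids."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    prefix, suffix = pattern[: match.start()], pattern[match.end():]
    options = [opt.strip() for opt in match.group(1).split(",")]
    return [f"{prefix}{opt}{suffix}" for opt in options]


qwen3 = expand_ids("Qwen/Qwen3-{0.6B, 1.7B, 4B, 8B, 14B, 32B, 30B-A3B, 235B-A22B}")
print(len(qwen3), qwen3[0], qwen3[-1])  # 8 Qwen/Qwen3-0.6B Qwen/Qwen3-235B-A22B
```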

The Qwen3 hybrid-reasoning era — April 28, 2025. Above this line: every Qwen flagship from Qwen3 onward ships with hybrid Thinking / Non-Thinking modes in a single architecture, absorbing the QwQ standalone-reasoning track. Below: the pre-hybrid lineage — Qwen / Qwen 1.5 / Qwen 2 / Qwen 2.5 chat models alongside the QwQ-32B-Preview (November 2024) and QwQ-32B (March 2025) reasoning experiments that proved the recipe later folded into Qwen3.

Model
QwQ-32B
Qwen/QwQ-32B
Reasoning
Available
Mar 5, 2025
Production reasoning model. 32B dense, Apache 2.0. Reported parity with DeepSeek-R1-671B at 20× fewer parameters on math + coding.
  • Released March 5, 2025; the announcement is at qwenlm.github.io/blog/qwq-32b; HuggingFace card: Qwen/QwQ-32B.
  • 32B dense reasoning model, Apache 2.0. Per Alibaba's testing, reportedly competitive with DeepSeek-R1 (671B total / 37B active) on AIME / MATH benchmarks at a 20× reduction in parameters.
  • The production successor to QwQ-32B-Preview; trained with reinforcement learning to incentivize chain-of-thought reasoning, parallel to the DeepSeek-R1 / Magistral / o1 recipe.
  • The last standalone Qwen reasoning model before the Qwen3 hybrid-mode convergence in April 2025; on this page's convergence reading, no further Reasoning-family rows are expected unless Alibaba revives the standalone QwQ track.
Model
Qwen2.5-Omni-7B
Qwen/Qwen2.5-Omni-7B
Specialized
Available
Mar 2025
End-to-end multimodal: text + image + audio + video in one architecture. 7B Apache 2.0. The first natively-omni Qwen.
  • Released March 2025; the announcement is at Alibaba Cloud Community.
  • First end-to-end multimodal Qwen — processes text, images, audio, and video in a single 7B architecture, with streaming responses across modalities.
  • Apache 2.0 license; HuggingFace at Qwen/Qwen2.5-Omni-7B.
  • Positioned as Alibaba's open answer to GPT-4o's omni-modality release; the recipe was generalized into Qwen3.5's native multimodality at the architecture level.
Model
Qwen2.5-VL family
Qwen/Qwen2.5-VL-{3B, 7B, 32B, 72B}-Instruct
Specialized
Available
Jan 2025
Vision-language models built on Qwen2.5. Four sizes (3B / 7B / 32B / 72B). Document parsing, video understanding. Apache 2.0 except the 72B (Qwen License).
  • Released January 2025; HuggingFace cards on the Qwen org.
  • Four sizes (3B, 7B, 32B, 72B) built on the Qwen2.5 base; designed for image / document / chart / video understanding.
  • Strong document-parsing and table-recognition results; 32B and 72B variants positioned as production-grade vision-language flagships.
  • License: Apache 2.0 for the smaller sizes, with the 72B variant following the same Qwen License pattern as Qwen2.5-72B.
Model
Qwen2.5-Max
qwen-max-2025-01-25 — closed-weights, DashScope only
Flagship
Legacy
Jan 29, 2025
First proprietary Qwen-Max. Large-scale MoE on 20T tokens. Pitched as a DeepSeek-V3 competitor at launch.
  • Released January 29, 2025 — two days after the DeepSeek-R1 Nvidia-stock-crash episode; the announcement is at qwenlm.github.io/blog/qwen2.5-max; coverage in SiliconANGLE and VentureBeat.
  • Large-scale Mixture-of-Experts pretrained on 20 trillion tokens; further post-trained with curated SFT and RLHF.
  • Proprietary, closed-weights, DashScope-only. Pitched at launch as competitive with DeepSeek-V3, Llama-3.1-405B, and the Qwen2.5-72B open-weights flagship.
  • The first Qwen-Max release; established the proprietary-flagship product slot that Qwen3-Max and Qwen 3.6-Max-Preview would later occupy.
Model
QwQ-32B-Preview
Qwen/QwQ-32B-Preview
Reasoning
Legacy
Nov 28, 2024
First Qwen reasoning model. 32B Apache 2.0. Beat o1-preview on AIME and MATH per Alibaba's testing.
  • Released November 28, 2024; coverage in InfoQ and TechCrunch; HuggingFace card: Qwen/QwQ-32B-Preview.
  • 32B-parameter reasoning model under Apache 2.0; the first Qwen release explicitly positioned against OpenAI's o1 series as an open-weights reasoning alternative.
  • Per Alibaba, beat o1-preview on AIME and MATH benchmarks at launch, demonstrating that 32B parameters could match much larger reasoning models on math / coding tasks.
  • Superseded by the production QwQ-32B (March 2025); status is Legacy.
Model
Qwen2.5-Coder family
Qwen/Qwen2.5-Coder-{0.5B, 1.5B, 3B, 7B, 14B, 32B}-Instruct
Specialized
Available
Nov 12, 2024
Six coding-specialist sizes (0.5B–32B). 32B-Instruct reportedly competitive with GPT-4o on HumanEval. Apache 2.0 except the 3B.
  • Released November 12, 2024 with six sizes (0.5B, 1.5B, 3B, 7B, 14B, 32B). HuggingFace cards on the Qwen org.
  • The flagship Qwen2.5-Coder-32B-Instruct was reportedly competitive with GPT-4o on HumanEval / MBPP at launch, making it the strongest open-weights coding model of late 2024 alongside DeepSeek-Coder-V2.
  • License: Apache 2.0 for every size except the 3B, which follows the same Qwen License pattern as Qwen2.5-3B (the 32B is Apache 2.0, like Qwen2.5-32B).
  • Superseded as the recommended open-weights coding model by Qwen3-Coder (July 2025).
Model
Qwen2.5 family (the “Party”)
Qwen/Qwen2.5-{0.5B, 1.5B, 3B, 7B, 14B, 32B, 72B}, base + instruct
Flagship
Legacy
Sep 19, 2024
“A Party of Foundation Models.” Seven dense sizes. 18T tokens. Most variants Apache 2.0; 3B and 72B under Qwen License.
  • Released September 19, 2024 as “A Party of Foundation Models”; the announcement is at qwenlm.github.io/blog/qwen2.5; technical report at arXiv 2412.15115.
  • Seven dense sizes (0.5B, 1.5B, 3B, 7B, 14B, 32B, 72B) released together — the broadest single-day Qwen release wave.
  • 18 trillion pretraining tokens, ~2.5× Qwen2's corpus, with a focus on knowledge, coding, and mathematics.
  • License: Apache 2.0 for the 0.5B / 1.5B / 7B / 14B / 32B variants; Tongyi-Qianwen License for the 3B and 72B variants.
  • Established Qwen2.5 as the dominant open-weights mid-tier line through late 2024 and most of 2025; superseded as the flagship by Qwen3 in April 2025.
Model
Qwen2-VL
Qwen/Qwen2-VL-{2B, 7B, 72B}-Instruct
Specialized
Legacy
Aug 29, 2024
First Qwen2-era vision-language. Three sizes (2B / 7B / 72B). Variable-resolution visual encoder.
  • Released August 29, 2024; HuggingFace cards on the Qwen org.
  • Three sizes (2B, 7B, 72B) with a dynamic-resolution visual encoder for variable-aspect-ratio image inputs and longer-video understanding.
  • Built on the Qwen2 base; superseded by Qwen2.5-VL (January 2025) and Qwen2.5-Omni (March 2025).
Model
Qwen2-Audio + Qwen2-Math
Qwen/Qwen2-Audio-7B-Instruct, Qwen/Qwen2-Math-{1.5B, 7B, 72B}-Instruct
Specialized
Legacy
Aug 8, 2024
First Qwen2-era audio (speech understanding) and math specialists. Apache 2.0.
  • Released August 8, 2024 as a paired specialized launch.
  • Qwen2-Audio-7B-Instruct — first Qwen audio-understanding model, supporting speech transcription, voice chat, and audio-content QA.
  • Qwen2-Math — three sizes (1.5B, 7B, 72B) targeting STEM-reasoning workloads. The recipe later folded into general-purpose Qwen3 reasoning capability.
  • Both lines under Apache 2.0; superseded by general-purpose Qwen3 / Qwen3.5 instruction tuning by 2025.
Model
Qwen2 family
Qwen/Qwen2-{0.5B, 1.5B, 7B, 57B-A14B, 72B}, base + instruct
Flagship
Legacy
Jun 6, 2024
Five sizes including the 57B-A14B MoE. 27 additional languages. Up to 128K context. Apache 2.0 except 72B.
  • Released June 6, 2024; the announcement is at qwenlm.github.io/blog/qwen2; technical report at arXiv 2407.10671.
  • Five sizes: four dense (0.5B, 1.5B, 7B, 72B) and the first Qwen2-generation MoE (Qwen2-57B-A14B, 57B total / 14B active), following the smaller Qwen1.5-MoE-A2.7B.
  • Pretrained on data covering 27 additional languages beyond English and Chinese; Qwen2-7B-Instruct and Qwen2-72B-Instruct extended context to 128K tokens.
  • License turn: Qwen2-0.5B / 1.5B / 7B / 57B-A14B shipped under Apache 2.0, the first Qwen flagship release with broad permissive coverage. Only Qwen2-72B retained the original Tongyi-Qianwen License.
  • The architectural foundation for everything that followed in the Qwen2.x line.

The Apache 2.0 turn — June 2024. Above this line: most of the Qwen line ships under Apache 2.0, with only the largest size in each generation typically retaining the bespoke Tongyi-Qianwen License. Below: the founding lineage — Qwen 1 (August–November 2023) and Qwen 1.5 (February–March 2024) — entirely under the Tongyi-Qianwen License, the bespoke Alibaba license that allowed academic and commercial use with restrictions. The Qwen2 launch was the structural commitment to permissive open-source that made Qwen the dominant HuggingFace-leaderboard line through 2025.

Model
Qwen 1.5 family + Qwen1.5-MoE
Qwen/Qwen1.5-{0.5B, 1.8B, 4B, 7B, 14B, 32B, 72B, 110B, MoE-A2.7B}-Chat
Flagship
Legacy
Feb 2024
Qwen 1.5 and Qwen 1.5-MoE-A2.7B (March 28, 2024). First Qwen MoE. Tongyi-Qianwen License.
  • Qwen 1.5 family released February 2024 across multiple sizes (0.5B, 1.8B, 4B, 7B, 14B, 32B, 72B, 110B); extended context support and improved instruction-following over Qwen 1.
  • Qwen1.5-MoE-A2.7B followed March 28, 2024 as the first Qwen Mixture-of-Experts release — ~14B total parameters with 2.7B active per token.
  • License: bespoke Tongyi-Qianwen License — the same license as Qwen 1; commercial use permitted with restrictions, not OSI-compliant.
  • Superseded as the recommended open-weights line by Qwen2 (June 2024) and the broader Apache 2.0 turn that followed.
Model
Qwen 1 family (1.8B / 7B / 14B / 72B)
Qwen/Qwen-{1.8B, 7B, 14B, 72B}, -Chat variants
Flagship
Legacy
Aug 3, 2023
The lab's debut. Qwen-7B (Aug 3), Qwen-14B (Sep 25), Qwen-1.8B + Qwen-72B (Nov 30). Tongyi-Qianwen License.
  • Qwen-7B released August 3, 2023 on ModelScope and HuggingFace; the debut of the line. Qwen-14B followed September 25, 2023; Qwen-1.8B and Qwen-72B on November 30, 2023.
  • Four sizes total: 1.8B, 7B, 14B, 72B, each with a base model (no suffix in the repository id) and a chat (`-Chat`) variant. GitHub: QwenLM/Qwen.
  • Followed Alibaba Cloud's April 2023 Tongyi Qianwen launch — the lab's response to ChatGPT, formally announced four months before the first open-weights release.
  • License: bespoke Tongyi-Qianwen License. Commercial use permitted with restrictions, not OSI-compliant. Established the licensing pattern that later partially shifted to Apache 2.0 at Qwen2.
  • The architectural foundation for everything that followed in the Qwen line through Qwen3.6.

Each row has a stable id for sharing, e.g. /ai/qwen/versions/#qwen-3-6-27b, #qwen-3-next, #qwen-3-vl, #qwen-3-coder-next, #qwen-3, #qwen-2-5, #qwq-32b. Qwen blog: qwenlm.github.io/blog; HuggingFace org: huggingface.co/Qwen; GitHub org: github.com/QwenLM; DashScope / Model Studio: alibabacloud.com/help/en/model-studio.

The April 2023 Tongyi Qianwen launch

Alibaba Cloud formally launched Tongyi Qianwen (通义千问, “truth from a thousand questions”) in April 2023 as the company's response to ChatGPT, about a month after Baidu's Ernie Bot launched in March 2023 and roughly five months after OpenAI's November 2022 ChatGPT release. The model was first demonstrated by then–Alibaba CEO Daniel Zhang at the Alibaba Cloud Summit on April 11, 2023 and rolled out to enterprise customers through the Tongyi product family on Alibaba Cloud.

The first open-weights release, Qwen-7B, followed on August 3, 2023. The four-month gap between the Tongyi product launch and the first open-weights release was characteristic of Alibaba's strategy: ship the proprietary chatbot first to enterprise customers via Alibaba Cloud, then open-source the underlying model line to developer communities for ecosystem effects. The same hybrid pattern persists through Qwen3.6: open-weights flagships on HuggingFace alongside proprietary Qwen-Max-line models on DashScope.

The Apache 2.0 turn — from Tongyi-Qianwen License to permissive open-source

Qwen's licensing has evolved across three distinct conventions. The Qwen 1 lineage (Qwen-7B / 14B / 1.8B / 72B, August–November 2023) shipped under the bespoke Tongyi-Qianwen License — an Alibaba-authored license with permissive terms for academic use and commercial use with restrictions, but not OSI-compliant. The Qwen 1.5 family in February 2024 continued the same license pattern.

The licensing turn arrived with Qwen2 on June 6, 2024. Most of the Qwen2 sub-family (0.5B, 1.5B, 7B, 57B-A14B) shipped under Apache 2.0 — the first Qwen flagship release with broad permissive coverage. Only Qwen2-72B retained the Tongyi-Qianwen License. The pattern continued through Qwen2.5 (September 19, 2024), where the 0.5B / 1.5B / 7B / 14B / 32B variants shipped Apache 2.0 with only the 3B and 72B retaining the Qwen License.

Qwen3 (April 28, 2025) was the structural commitment: the entire Qwen3 family (six dense sizes from 0.6B to 32B, plus the 30B-A3B and 235B-A22B MoE variants) shipped Apache 2.0. Every open-weights Qwen release listed above since then (including Qwen3-Coder, Qwen-Image-2512, the Qwen3.5 open-weights sizes, Qwen3.6-35B-A3B, and Qwen3.6-27B) has shipped Apache 2.0. The proprietary hosted line (Qwen2.5-Max, Qwen3-Max, Qwen 3.5-Plus, Qwen 3.6-Max-Preview, Qwen 3.6-Plus, Qwen3.7-Max, Qwen3.7-Plus) runs as a parallel commercial track on DashScope. On the open-weights side, the story has been Apache 2.0 for most sizes since Qwen2 and for every release since Qwen3.

The Qwen3 hybrid-reasoning era — April 28, 2025

Qwen3 launched on April 28, 2025 as Alibaba's first frontier-AI line with hybrid reasoning architecture. The release shipped eight models simultaneously — six dense (0.6B, 1.7B, 4B, 8B, 14B, 32B) and two MoE (30B-A3B and the flagship 235B-A22B) — all under Apache 2.0, all sharing a single architecture that supports both thinking-mode chain-of-thought reasoning and non-thinking-mode fast responses. The announcement is at qwenlm.github.io/blog/qwen3; coverage in TechCrunch and Alibaba Cloud Community.
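
For readers consuming raw Qwen3 output, the two modes can be told apart after the fact. The following is a minimal sketch, assuming the raw completion wraps its chain of thought in `<think>…</think>` tags ahead of the final answer, as the Qwen3 model cards describe; `split_thinking` is a hypothetical helper name, and some serving stacks may strip or relocate the block before you see it.

```python
import re

# Assumption: thinking-mode output looks like "<think>...</think>answer".
# Servers that return reasoning in a separate field will not match this.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(raw: str) -> tuple[str, str]:
    """Return (reasoning, answer) extracted from one raw completion string."""
    match = THINK_BLOCK.search(raw)
    if match is None:
        # Non-thinking mode, or the block was already removed upstream.
        return "", raw.strip()
    reasoning = match.group(1).strip()
    # Everything outside the first <think> block is treated as the answer.
    answer = (raw[:match.start()] + raw[match.end():]).strip()
    return reasoning, answer
```

A caller would typically log the reasoning part and show only the answer part to end users.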

The Qwen3 architecture absorbed the standalone QwQ reasoning track into the Flagship V-series. The QwQ line had run for five months — QwQ-32B-Preview in November 2024, the production QwQ-32B in March 2025 — as Alibaba's open-weights answer to OpenAI's o1 series and DeepSeek's R1. Qwen3's hybrid-mode architecture made the standalone Reasoning family redundant; no further QwQ releases have shipped in the year since, and the page's convergence-policy reading (documented in `docs/qwen-versions.md`) is that no further Reasoning rows are expected unless Alibaba revives the standalone track.

Qwen3 was trained on 36 trillion tokens — double Qwen2.5's pretraining corpus — with native multilingual support across 119 languages and dialects, the broadest language coverage of any frontier-AI line at the time. The Qwen3.5 family released ten months later (February 2026) extended that to 201 languages and added native multimodality across text + image + video; Qwen3.6-27B (April 2026) added the Gated DeltaNet hybrid architecture and Thinking Preservation. The Qwen3 architecture and its descendants remain the load-bearing recipe for everything Alibaba has shipped since.

The HuggingFace-leaderboard dominance

Across late 2024 and most of 2025, Qwen-derived models held a majority of the top positions on the HuggingFace Open LLM Leaderboard and the various community-derived benchmark trackers. The pattern was driven by three factors: the Apache 2.0 license (which let community fine-tuners commercially redistribute their derivatives), the breadth of base sizes (0.6B through 235B in the Qwen3 family alone), and the strength of Qwen2.5-Coder / QwQ / Qwen3 on math + coding benchmarks specifically (which weighted heavily in the leaderboard rankings). At various points in 2025, more than half of the top 20 trending HuggingFace models were Qwen-derived. The dominance shifted slightly in 2026 as DeepSeek-V4 / Llama 5-equivalents / Mistral 3 absorbed share, but Qwen-derivatives have remained a large fraction of the top-trending HuggingFace models through April 2026.

The U.S. chip export-control context

Qwen training has been constrained by the same U.S. chip export-control regime that shapes DeepSeek's training environment. The October 7, 2022 Department of Commerce export controls restricted top-tier Nvidia AI-GPU exports to China; Alibaba Cloud was already on various U.S. Entity List adjacencies prior to that, with subsequent expansions through 2023–2025 tightening the procurement environment. Like DeepSeek, Alibaba Cloud built Qwen's training infrastructure on a mix of Nvidia H800 chips procured during the gap before the October 2023 H800 ban, and on Chinese domestic alternatives (Huawei Ascend, Cambricon).

The empirical record (the trillion-parameter Qwen3-Max trained on 36T tokens, Qwen3.6-27B with state-of-the-art coding benchmarks) shows that the Qwen team has continued training at frontier scale despite the controls. Like DeepSeek, the team has been the subject of Department of Commerce inquiries about the chip procurement that supported specific runs. CSIS and the South China Morning Post cover the broader policy environment.

Where to run Qwen

Qwen is among the most widely deployed AI lines because the open-weights releases are Apache 2.0 across nearly every size and the proprietary releases are available through Alibaba Cloud's Model Studio with OpenAI- and Anthropic-compatible APIs. Inference paths through 2025–2026 break into four categories.

Alibaba Cloud first-party. Qwen Chat is the consumer chat surface. Model Studio (formerly DashScope) is the developer API endpoint, OpenAI-API-compatible and serving both the open-weights and the proprietary Max-line models. The proprietary Max-line is exclusive to this surface.

Self-host from HuggingFace. Download from the Qwen org and run with vLLM, SGLang, llama.cpp, or Ollama. The open-weights flagships (Qwen3.6-27B, Qwen3.5 family, Qwen3 family, QwQ-32B) self-host without commercial restriction; the small minority of Qwen License variants (Qwen2-72B, Qwen2.5-72B, Qwen2.5-3B) require attestation of the bespoke license terms.

Hyperscalers. AWS Bedrock and Azure AI Foundry have added Qwen SKUs across 2025–2026; ModelScope (Alibaba's own model-hub) hosts the broadest set. NVIDIA NIM has Qwen variants for the most-served sizes.

Hosted-inference providers. Together AI, Fireworks, OpenRouter, SiliconFlow, Groq. Most providers serve the Apache 2.0 lineage with similar latency / cost characteristics; the Tongyi-Qianwen-License variants (mostly the 72B and 3B sizes) are typically not carried by Western inference providers due to license-attestation overhead.

People who shaped Qwen

The Qwen / Tongyi Lab team is structured inside Alibaba Cloud rather than as a standalone lab. The team has been led at the operational level by Junyang Lin (Tongyi Lab), with broader Alibaba Cloud AI strategy under CTO Jingren Zhou. The team has been notable for its consistency through Alibaba's broader 2024 corporate-restructuring waves — the Qwen release cadence accelerated rather than slowed across the period in which Alibaba was reorganizing its other business units.

Eddie Wu — CEO of Alibaba Group since September 2023; has publicly framed AI as Alibaba's strategic priority above e-commerce / cloud / logistics, with multi-year capex commitments to Qwen training infrastructure. Joseph Tsai — Chairman of Alibaba Group; has been the public spokesperson for Alibaba's AI strategy in international forums (the Davos annual meetings, the Bloomberg Tech Summit).

Qwen has no publicly named CTO or founding team in the Western-lab sense. Unlike OpenAI / Anthropic / Mistral / xAI / DeepSeek, Qwen is not structured as a startup-style lab with named founders and a public-facing leadership roster. The team operates under Alibaba Cloud's organizational umbrella, and the per-paper author lists on the Qwen / Qwen2 / Qwen2.5 / Qwen3 technical reports are the closest available roster of named contributors.

The competitive landscape

Qwen is, alongside DeepSeek, one of the two dominant Chinese open-weights AI families through 2024–2026. The closest direct comparators on the open-weights axis are DeepSeek (Chinese, MIT-licensed for the V3 / R1 line and onward, the December 2024 / January 2025 inflection — see DeepSeek Versions), Mistral (French; Apache 2.0 for the Mistral 3 family with a parallel proprietary tier — see Mistral Versions), Meta's Llama (custom Llama Community License, see Llama Versions), and the other Chinese frontier labs (Baidu Ernie, Zhipu GLM, MiniMax, Moonshot Kimi). The closed-weights frontier competitors — ChatGPT, Claude, Gemini, Grok — are the practical benchmark for “is Qwen competitive at frontier scale,” which the Qwen3 / Qwen3-Max / Qwen3.5 / Qwen3.6 release cycle has been answering in the affirmative since April 2025. Qwen's distinguishing variable is the breadth of its specialized track (Coder / VL / Audio / Math / Omni / Image) and the consistency of its Apache 2.0 commitment on the open-weights flagships, both of which continue to underwrite the line's HuggingFace-leaderboard dominance. This page does not attempt a benchmark roundup or a ranking.

Use Qwen

The browser cannot detect which Qwen model you've used — there's no fingerprint or header that exposes it. The block below carries the practical information instead: the current model identifiers, a copy-paste API call, the surfaces where Qwen is available, and the licensing summary.

Current model identifiers

HuggingFace ids on the Qwen org come first; the DashScope / Model Studio model strings follow in the last group. Verify against alibabacloud.com/help/en/model-studio and huggingface.co/Qwen for the freshest list.

# Open-weights flagship line (Apache 2.0)
Qwen/Qwen3.6-27B
Qwen/Qwen3.6-35B-A3B            # MoE companion to the dense 27B
Qwen/Qwen3.5-{27B, 35B-A3B, 122B-A10B, 397B}
Qwen/Qwen3-Next-80B-A3B-Instruct  # ultra-sparse MoE, 3B active of 80B
Qwen/Qwen3-Next-80B-A3B-Thinking

# Open-weights specialized (Apache 2.0)
Qwen/Qwen3-Coder-Next           # coding agent, 80B/3B active
Qwen/Qwen3-Coder-{30B-A3B, 480B-A35B}-Instruct
Qwen/Qwen3-VL-{2B, 4B, 8B, 32B, 30B-A3B, 235B-A22B}-Instruct
Qwen/Qwen2.5-Omni-7B            # end-to-end multimodal
Qwen/QwQ-32B                    # reasoning
Qwen/Qwen-Image-2512            # text-to-image

# Proprietary DashScope model strings (per docs: dots, not dashes)
qwen3.7-max                     # current Max-line flagship (May 20, 2026)
qwen3.7-plus                    # multimodal agent flagship (May 31, 2026)
qwen3.6-max-preview
qwen3.6-plus
qwen3.6-flash
qwen3.5-plus
qwen3-max
qwen-max-2025-01-25             # Qwen2.5-Max alias
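
The MoE ids in the list above encode two sizes in one tag. As a rough sketch, assuming the `<total>B-A<active>B` reading implied by the inline comments ("3B active of 80B"), the tag can be parsed mechanically. `moe_sizes` is a hypothetical helper for illustration, not part of any Qwen tooling.

```python
import re

# Assumption: "<total>B-A<active>B" means total and active parameters in
# billions, following the comments in the identifier list. Not an official spec.
MOE_TAG = re.compile(r"(\d+(?:\.\d+)?)B-A(\d+(?:\.\d+)?)B")


def moe_sizes(model_id: str):
    """Return (total_b, active_b) in billions, or None when the id has no MoE tag."""
    match = MOE_TAG.search(model_id)
    if match is None:
        return None  # dense model, or an id that does not follow the convention
    return float(match.group(1)), float(match.group(2))


for model_id in ("Qwen/Qwen3-Next-80B-A3B-Instruct", "Qwen/Qwen3.6-27B"):
    print(model_id, moe_sizes(model_id))
```

Dense ids such as Qwen/Qwen3.6-27B carry no `-A` segment, so the helper returns None for them rather than guessing.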

Quick API call (OpenAI-compatible)

DashScope is OpenAI-API-compatible — point any OpenAI SDK at the Alibaba Cloud Model Studio base URL with a DashScope API key. Replace the placeholder values before running.

$ curl https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions \
    -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model":    "qwen3.7-max",
      "messages": [{ "role": "user", "content": "Hello, Qwen." }]
    }'
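
The same call can be sketched in Python with only the standard library. The endpoint, model string, and `DASHSCOPE_API_KEY` variable are taken from the curl example above; the function names `build_chat_request` and `send` are hypothetical, and the response parsing assumes the usual OpenAI chat-completions shape rather than anything confirmed here.

```python
import json
import os
import urllib.request

# Endpoint copied from the curl example above.
BASE_URL = "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"


def build_chat_request(prompt, model="qwen3.7-max", api_key=None, base_url=BASE_URL):
    """Build (but do not send) a chat-completions POST request."""
    key = api_key or os.environ.get("DASHSCOPE_API_KEY", "")
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the first choice's text (needs network and a valid key)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # Assumption: OpenAI-style response with choices[0].message.content.
    return payload["choices"][0]["message"]["content"]
```

Calling `send(build_chat_request("Hello, Qwen."))` performs the request. Passing a different `base_url` and a HuggingFace id as `model` should let the same builder target a self-hosted OpenAI-compatible server, for example a local vLLM instance, though the exact address depends on how that server was started.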

Where to run Qwen

Four categories — Alibaba Cloud first-party, self-host from HuggingFace, hyperscalers, and hosted-inference providers. Pricing varies by provider; the open weights are the same across all of them.

# Alibaba Cloud first-party
https://qwen.ai/                            # Qwen Chat consumer chat
https://www.alibabacloud.com/help/en/model-studio/   # OpenAI-compatible API

# Self-host from HuggingFace
https://huggingface.co/Qwen                 # every model card lives here
https://github.com/QwenLM                   # GitHub org, technical READMEs
https://github.com/vllm-project/vllm        # production-grade throughput
https://ollama.com/                         # single-binary, easiest entry

# Hyperscalers
AWS Bedrock, Azure AI Foundry, ModelScope, NVIDIA NIM

# Hosted-inference providers
https://www.together.ai/
https://fireworks.ai/
https://openrouter.ai/
https://groq.com/
https://www.siliconflow.com/

Licensing

Three license tiers across the line. Read the model card on HuggingFace for the relevant model before shipping at scale.

# Apache 2.0 — commercial use unrestricted
Qwen3.6-27B, Qwen3.5 (most sizes), Qwen3 (entire family)
Qwen3-Coder-Next, Qwen3-Next-80B-A3B (Instruct + Thinking)
Qwen3-VL (all sizes), Qwen3-Coder, Qwen-Image-2512
QwQ-32B, QwQ-32B-Preview
Qwen2.5-Omni, Qwen2.5-VL (smaller sizes), Qwen2.5-Coder (smaller sizes)
Qwen2.5 (most sizes), Qwen2 (most sizes)

# Tongyi-Qianwen License — commercial use with restrictions
Qwen2-72B, Qwen2.5-72B, Qwen2.5-3B
Qwen 1 family (1.8B / 7B / 14B / 72B)
Qwen 1.5 family

# Proprietary — closed-weights, DashScope only
Qwen3.7-Max, Qwen3.7-Plus
Qwen3.6-Max-Preview, Qwen3.6-Plus
Qwen3.5-Plus, Qwen3-Max
Qwen2.5-Max
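
For a deployment script that should refuse to ship a model before someone has read its license, the tiers above can be captured in a small lookup. This is an illustrative sketch only: the table is a partial transcription of the list above (names written without spaces), the tier labels and `license_tier` are hypothetical, and the model card remains the authority.

```python
# Partial transcription of the tiers listed above; extend as needed.
LICENSE_TIERS = {
    "apache-2.0": ("Qwen3.6-27B", "Qwen3-Coder-Next", "QwQ-32B", "Qwen-Image-2512"),
    "tongyi-qianwen": ("Qwen2-72B", "Qwen2.5-72B", "Qwen2.5-3B"),
    "proprietary": ("Qwen3.6-Max-Preview", "Qwen3.6-Plus", "Qwen3.5-Plus",
                    "Qwen3-Max", "Qwen2.5-Max"),
}


def license_tier(model_name: str) -> str:
    """Return the tier label for an exactly named model, or 'unknown'."""
    wanted = model_name.strip().lower()
    for tier, models in LICENSE_TIERS.items():
        if wanted in (name.lower() for name in models):
            return tier
    # Anything not transcribed here needs a manual model-card check.
    return "unknown"
```

Treating "unknown" as a hard stop keeps the default conservative: a model missing from the table is never assumed to be Apache 2.0.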

Sources: Qwen blog; github.com/QwenLM; huggingface.co/Qwen; Alibaba Cloud Model Studio docs; research papers on arXiv (Qwen2, Qwen2.5, Qwen3 technical reports); contemporaneous reporting in NYT, FT, Bloomberg, CNBC, South China Morning Post, TechCrunch, VentureBeat, MarkTechPost, Simon Willison. Last updated June 10, 2026.

Mungomash LLC

Last refreshed 2026-06-10 by Triton — added Qwen3-Omni row (Sep 22, 2025).