Alibaba

Qwen

Alibaba's Qwen series. Qwen2.5 delivers competitive frontier-class performance across sizes from 0.5B to 72B with permissive licensing.

Visit homepage ↗

History & context

Alibaba's Qwen ('通义千问' / Tongyi Qianwen) family quickly became the strongest non-Meta open-weights family. Qwen2 (June 2024) delivered competitive quality at the 7B and 72B sizes. Qwen2.5 (September 2024) was the breakthrough — across sizes from 0.5B to 72B, with specialised variants for code and math.

Qwen2.5 sub-72B sizes are released under Apache 2.0 with no usage cap, which makes them the most permissively-licensed strong-quality models available. The 72B uses the Qwen License (100M MAU cap, similar to Llama's pattern).

Qwen2.5 Coder 32B (November 2024) reached GPT-4o-class HumanEval scores at a size that fits on a single H100. QwQ 32B Preview (also November 2024) was Alibaba's first reasoning-focused 'thinking' model.

Flagship model

Qwen 3 235B (A22B)

235B

The flagship Qwen 3 release: a 235B-total MoE with 22B active parameters per token. Competitive with DeepSeek V3 and Llama 4 Maverick on reasoning benchmarks while being smaller total. Apache 2.0 — one of the most permissively licenced frontier-class models.

Context: 128K
License: apache-2-0
VRAM Q4: 141 GB

9 models in this family

Qwen 3 235B (A22B)

235B

Context: 128K
License: apache-2-0
VRAM Q4: 141 GB

Qwen2.5 72B Instruct

72B

The flagship Qwen 2.5 release. Competes with Llama 3.1 405B on many benchmarks at one-fifth the parameter count. Note the 72B specifically uses the Qwen License (commercial use up to 100M MAU) — the smaller Qwen2.5 sizes are Apache 2.0.

Context: 128K
License: qwen
VRAM Q4: 43.2 GB

Qwen2.5 Coder 32B

32B

Coding-specialised Qwen2.5 32B fine-tune. GPT-4o-class on HumanEval and BigCodeBench at the time of release. Trained on additional code-heavy data with extended pre-training. Apache 2.0. Natural pick for self-hosted coding assistants, code-review automation, and any agent loop that primarily writes code.

Context: 128K
License: apache-2-0
VRAM Q4: 19.2 GB

Qwen2.5 32B Instruct

32B

32B sweet-spot model: strong reasoning, fits on one H100 in fp16, on a 4090 at Q4. The 32B size in particular hits a quality/cost knee — quality scales with parameters faster than cost up to ~32B, and slower afterwards. Favoured for production chat where 7B isn't sharp enough and where 70B+ would over-spec the hardware budget. Apache 2.0 licence.

Context: 128K
License: apache-2-0
VRAM Q4: 19.2 GB

QwQ 32B Preview

32B

Qwen's reasoning-focused 'thinking' model. Generates long chains-of-thought before answering, similar to OpenAI's o1 and DeepSeek R1 lineage. Optimised for math and competition-style problem solving. The Preview tag means Qwen is iterating quickly; later versions may obsolete this one. Useful today for math-heavy workloads where a slow, careful answer is preferred to a fast wrong one.

Context: 33K
License: apache-2-0
VRAM Q4: 19.2 GB

Qwen 3 32B

32B

32B sweet-spot Qwen 3, Apache 2.0. Reasoning-mode toggle inherited from smaller siblings; strong on math, code and agentic tool use. Fits on a single H100 in fp16 and on a 4090 at Q4.

Context: 33K
License: apache-2-0
VRAM Q4: 19.2 GB

Qwen2.5 14B Instruct

14B

Mid-size Qwen2.5 with broad task coverage. The sweet spot for users who want noticeably better quality than 7B but can't justify the hardware footprint of 32B or 72B.

Context: 128K
License: apache-2-0
VRAM Q4: 8.4 GB

Qwen 3 8B

The April 2025 refresh of Qwen at 8B. Native mixed-mode reasoning: the model can 'think' before answering when triggered, or answer directly for simple queries — configurable per request. Apache 2.0. A strong upgrade over Qwen 2.5 7B on math and code, with much better instruction following.

Context: 33K
License: apache-2-0
VRAM Q4: 4.8 GB

Qwen2.5 7B Instruct

Apache-2.0-licensed 7B model with surprisingly strong reasoning and multilingual chops. Qwen 2.5 trains on a larger and more carefully filtered corpus than the original Qwen series, and the 7B variant punches well above its weight on coding and math benchmarks. A strong default for cost-sensitive chat workloads and for fine-tuning experiments where the Apache licence simplifies downstream redistribution.

Context: 128K
License: apache-2-0
VRAM Q4: 4.2 GB

Comparing Qwen against another family? Try the side-by-side comparator or browse all leaderboards.