Alibaba
Qwen
Alibaba's Qwen series of open-weights models. Qwen2.5 spans sizes from 0.5B to 72B, performs competitively with other leading open-weights models, and ships under permissive licensing.
History & context
Alibaba's Qwen ('通义千问' / Tongyi Qianwen) family quickly became the strongest open-weights family outside Meta's Llama. Qwen2 (June 2024) delivered competitive quality at the 7B and 72B sizes. Qwen2.5 (September 2024) was the breakthrough: it covered sizes from 0.5B to 72B and added specialised variants for code and math.
Most Qwen2.5 sizes below 72B are released under Apache 2.0 with no usage cap (the 3B, which ships under a research licence, is the exception), which makes them among the most permissively licensed strong models available. The 72B uses the Qwen License, which caps commercial use at 100M MAU, similar to Llama's pattern.
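The licensing split can be expressed as a small lookup. This is a minimal sketch, not legal guidance: `LICENSES`, `MAU_CAP`, and `within_licence_terms` are hypothetical names introduced for illustration, the table covers only the general-purpose sizes listed on this page, and the 100M MAU threshold is taken from the description here rather than from the licence text itself.

```python
# Hypothetical lookup built from the licence labels shown on this page.
LICENSES = {
    "7B": "apache-2-0",
    "14B": "apache-2-0",
    "32B": "apache-2-0",
    "72B": "qwen",
}

# Monthly-active-user cap for the Qwen License, as described on this page.
MAU_CAP = 100_000_000


def within_licence_terms(size: str, monthly_active_users: int) -> bool:
    """Return True if commercial use at this scale needs no separate agreement.

    Apache 2.0 sizes have no usage cap; the Qwen License is treated here
    as permitting commercial use up to and including MAU_CAP.
    """
    licence = LICENSES[size]
    if licence == "apache-2-0":
        return True
    return monthly_active_users <= MAU_CAP
```

For example, `within_licence_terms("72B", 250_000_000)` returns `False`, while the same user count on the 32B returns `True`.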
Qwen2.5 Coder 32B (November 2024) reached GPT-4o-class HumanEval scores at a size that fits on a single H100. QwQ 32B Preview (also November 2024) was Alibaba's first reasoning-focused 'thinking' model.
Flagship model
6 models in this family
The flagship Qwen2.5 release. Competes with Llama 3.1 405B on many benchmarks at under one-fifth the parameter count (72B vs 405B). Note that the 72B specifically uses the Qwen License (commercial use up to 100M MAU); the smaller Qwen2.5 sizes listed here are Apache 2.0.
- Context: 128K
- License: qwen
- VRAM Q4: 43.2 GB
The 32B sweet-spot model: strong reasoning, fits on one H100 in fp16 and on a 4090 at Q4. This size sits at a quality/cost knee: up to roughly 32B, quality improves faster than cost rises, and beyond it the returns diminish. Favoured for production chat where 7B isn't sharp enough and 70B+ would over-spec the hardware budget. Apache 2.0 licence.
- Context: 128K
- License: apache-2-0
- VRAM Q4: 19.2 GB
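The hardware claims for the 32B follow from simple arithmetic on weight size. The sketch below is a rough estimate under stated assumptions: 2 bytes per parameter for fp16, and about 0.6 bytes per parameter for Q4, a figure inferred from the VRAM Q4 values on this page (19.2 GB for 32B parameters) rather than taken from any quantisation specification. It counts weights only, so KV cache and activations need extra headroom. `weight_memory_gb` and `fits_on_gpu` are hypothetical helper names; 80 GB and 24 GB are the memory sizes of the common H100 and RTX 4090 configurations.

```python
# Approximate bytes per parameter. The "q4" value is an assumption inferred
# from this page's figures (19.2 GB / 32B params = 0.6 bytes per parameter).
BYTES_PER_PARAM = {"fp16": 2.0, "q4": 0.6}


def weight_memory_gb(params_billion: float, precision: str) -> float:
    """Estimate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return round(params_billion * BYTES_PER_PARAM[precision], 1)


def fits_on_gpu(params_billion: float, precision: str, gpu_vram_gb: float) -> bool:
    """Weights-only check; real deployments also need room for the KV cache."""
    return weight_memory_gb(params_billion, precision) <= gpu_vram_gb


if __name__ == "__main__":
    H100_GB, RTX_4090_GB = 80, 24
    print("32B fp16 on H100:", fits_on_gpu(32, "fp16", H100_GB))
    print("32B Q4 on 4090:", fits_on_gpu(32, "q4", RTX_4090_GB))
    print("72B fp16 on H100:", fits_on_gpu(72, "fp16", H100_GB))
```

Under these assumptions the 32B needs about 64 GB in fp16 and 19.2 GB at Q4, while the 72B in fp16 (about 144 GB) does not fit on a single 80 GB card.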
Coding-specialised Qwen2.5 32B fine-tune. GPT-4o-class on HumanEval and BigCodeBench at the time of release. Trained on additional code-heavy data with extended pre-training. Apache 2.0. Natural pick for self-hosted coding assistants, code-review automation, and any agent loop that primarily writes code.
- Context: 128K
- License: apache-2-0
- VRAM Q4: 19.2 GB
Qwen's reasoning-focused 'thinking' model. Generates long chains of thought before answering, in the same lineage as OpenAI's o1 and DeepSeek R1. Optimised for math and competition-style problem solving. The Preview tag means Qwen is iterating quickly, so later versions may supersede this one. Useful today for math-heavy workloads where a slow, careful answer is preferred to a fast wrong one.
- Context: 32K
- License: apache-2-0
- VRAM Q4: 19.2 GB
Mid-size Qwen2.5 with broad task coverage. The sweet spot for users who want noticeably better quality than 7B but can't justify the hardware footprint of 32B or 72B.
- Context: 128K
- License: apache-2-0
- VRAM Q4: 8.4 GB
Apache-2.0-licensed 7B model with surprisingly strong reasoning and multilingual chops. Qwen2.5 is trained on a larger and more carefully filtered corpus than the original Qwen series, and the 7B variant punches well above its weight on coding and math benchmarks. A strong default for cost-sensitive chat workloads and for fine-tuning experiments where the Apache licence simplifies downstream redistribution.
- Context: 128K
- License: apache-2-0
- VRAM Q4: 4.2 GB
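Choosing among the general-purpose sizes often comes down to the VRAM available. As a hedged sketch, the helper below picks the largest general-purpose model whose listed VRAM Q4 figure fits a given budget. The labels are informal size names rather than official model identifiers, `largest_model_that_fits` is a hypothetical function, and the 2 GB default headroom for KV cache and runtime overhead is an arbitrary assumption that real workloads may need to raise.

```python
from typing import Optional

# (size label, VRAM Q4 in GB) as listed on this page, largest first.
GENERAL_MODELS = [("72B", 43.2), ("32B", 19.2), ("14B", 8.4), ("7B", 4.2)]


def largest_model_that_fits(vram_gb: float, headroom_gb: float = 2.0) -> Optional[str]:
    """Return the largest size whose Q4 weights fit within vram_gb minus headroom.

    Returns None when even the smallest listed model does not fit.
    """
    budget = vram_gb - headroom_gb
    for label, q4_gb in GENERAL_MODELS:
        if q4_gb <= budget:
            return label
    return None
```

With the default headroom, a 24 GB card maps to the 32B, a 16 GB card to the 14B, and an 8 GB card to the 7B.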