Search

Type a model name, family, license, or keyword.

Models (7)

Coding-focused MoE model with 21B active parameters out of 236B total. Supports 338 programming languages with strong performance across mainstream stacks (Python, TypeScript, Go, Rust, Java, C++) and competent results on niche languages where most open models falter. The DeepSeek licence applies — commercial use permitted with some application restrictions.

Context: 128K
License: deepseek
VRAM Q4: 141.6 GB

Qwen2.5 Coder 32B

32B

Coding-specialised Qwen2.5 32B fine-tune. GPT-4o-class on HumanEval and BigCodeBench at the time of release. Trained on additional code-heavy data with extended pre-training. Apache 2.0. Natural pick for self-hosted coding assistants, code-review automation, and any agent loop that primarily writes code.

Context: 128K
License: apache-2-0
VRAM Q4: 19.2 GB

DeepSeek R1

671B

Reasoning model trained with reinforcement learning on top of DeepSeek V3-Base. MIT licence — even the weights are unrestricted, making R1 the most permissively-licensed frontier reasoning model. Generates long internal chains-of-thought before answering, trading latency for accuracy on math, code, and reasoning benchmarks. Distilled variants (e.g. R1 Distill Llama 70B) recover most of the quality at much smaller scales.

Context: 128K
License: mit
VRAM Q4: 402.6 GB

Yi VL 34B

34B

Vision-language variant of Yi 34B. Image-text reasoning via an MLP adapter on a CLIP encoder. Useful for bilingual EN/中 multimodal workloads where the major Western vision-language models underperform on Chinese text in images.

Context: 4K
License: apache-2-0
VRAM Q4: 20.4 GB

Qwen 3 32B

32B

32B sweet-spot Qwen 3, Apache 2.0. Reasoning-mode toggle inherited from smaller siblings; strong on math, code and agentic tool use. Fits on a single H100 in fp16 and on a 4090 at Q4.

Context: 33K
License: apache-2-0
VRAM Q4: 19.2 GB

Qwen 3 8B

The April 2025 refresh of Qwen at 8B. Native mixed-mode reasoning: the model can 'think' before answering when triggered, or answer directly for simple queries — configurable per request. Apache 2.0. A strong upgrade over Qwen 2.5 7B on math and code, with much better instruction following.

Context: 33K
License: apache-2-0
VRAM Q4: 4.8 GB

OLMo 2 7B

Fully-open 7B model: weights, training data and code all released under permissive licences. Useful as a reference for reproducibility research and for teams that need full transparency on training data provenance.

Context: 4K
License: apache-2-0
VRAM Q4: 4.2 GB

Tip: search is over name, slug, family, license and description. For structured filters (family + license + size), use the all-models page.