OSAIM
Open Source AI Models

All models

4 of 40 open-source models (filtered).

Mixtral 8×22B Instruct
141B

Scaled-up Mixtral with 8 experts of nominally 22B parameters each, 2 active per token. ~39B active parameters out of 141B total. Strong long-context performance and competitive coding scores. Apache 2.0 licensing makes it attractive for self-hosting where the Llama 3 community licence terms are a non-starter.

Context: 64K
License: apache-2-0
VRAM Q4: 84.6 GB

Mixtral 8×7B Instruct
46.7B

The mixture-of-experts release that introduced 8 experts per layer (nominally 7B each), with 2 active per token. Because the attention layers are shared across experts, the total is ~47B parameters rather than 56B, with ~13B active per token. Per-token compute is therefore roughly that of a 13B dense model, while quality approaches that of a 70B dense model. Apache 2.0 weights mean it's still a popular self-hosting choice. Memory footprint is the main constraint: all 47B parameters must be loaded even though only about a quarter are active per token.

Context: 32K
License: apache-2-0
VRAM Q4: 28 GB

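The gap between "8 experts of 7B" and the listed 46.7B total can be checked with a rough parameter count. The sketch below is illustrative only: the layer sizes in `MoEConfig` are assumed from the publicly released Mixtral 8×7B configuration rather than taken from this page, and the helper name `count_params` is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MoEConfig:
    # Assumed values from the public Mixtral 8x7B config (not stated on this page).
    hidden: int = 4096
    ffn: int = 14336
    layers: int = 32
    heads: int = 32
    kv_heads: int = 8
    vocab: int = 32000
    experts: int = 8
    experts_per_token: int = 2


def count_params(cfg: MoEConfig, experts_used: int) -> int:
    """Count weights when `experts_used` experts per layer are included."""
    head_dim = cfg.hidden // cfg.heads
    kv_dim = cfg.kv_heads * head_dim
    # Q and O projections are hidden x hidden; K and V use grouped-query heads.
    attention = 2 * cfg.hidden * cfg.hidden + 2 * cfg.hidden * kv_dim
    # Each expert is a gated feed-forward block with three weight matrices.
    expert = 3 * cfg.hidden * cfg.ffn
    router = cfg.hidden * cfg.experts
    norms = 2 * cfg.hidden
    per_layer = attention + router + norms + experts_used * expert
    # Input embedding plus a separate output projection, then the final norm.
    embeddings = 2 * cfg.vocab * cfg.hidden
    return cfg.layers * per_layer + embeddings + cfg.hidden


cfg = MoEConfig()
total = count_params(cfg, cfg.experts)             # everything that must sit in memory
active = count_params(cfg, cfg.experts_per_token)  # weights touched per token

print(f"total:  {total / 1e9:.1f}B")
print(f"active: {active / 1e9:.1f}B")
print(f"active share: {active / total:.0%}")
```

Under these assumptions the count comes to about 46.7B total and 12.9B active, in line with the figures quoted in the card. Only the expert feed-forward blocks are replicated eight times; attention, embeddings, and norms are shared, which is why the total is well below 56B.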
Mistral Small 3
24B

24B dense model from Mistral's January 2025 release that competes with Llama 3.3 70B on many tasks at about a third of the parameter count. Apache 2.0 licensed and small enough to run on a single RTX 4090 (24 GB) at Q4. Good pick when you want Llama-3.3-70B-class chat quality on a friendlier hardware budget, or when the licence matters and Llama's community terms don't fit.

Context: 32K
License: apache-2-0
VRAM Q4: 14.4 GB

Mistral Nemo 12B
12B

Joint Mistral × NVIDIA model with 128K context, designed as a drop-in upgrade to Mistral 7B. Trained with NVIDIA's Megatron stack and released under Apache 2.0. Strong multilingual coverage thanks to the Tekken tokenizer.

Context: 128K
License: apache-2-0
VRAM Q4: 7.2 GB
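
The VRAM Q4 figures in the four cards above are all consistent with roughly 0.6 GB per billion parameters (about 4.8 bits per weight). That ratio is inferred from the listed numbers, not a formula documented by the page, so the sketch below is a rule of thumb only; the constant and the function names are hypothetical.

```python
# Assumption inferred from this page's figures: ~0.6 GB per billion parameters at Q4.
GB_PER_BILLION_PARAMS_Q4 = 0.6

# Parameter counts in billions, as listed in the cards.
MODELS = {
    "Mixtral 8×22B Instruct": 141.0,
    "Mixtral 8×7B Instruct": 46.7,
    "Mistral Small 3": 24.0,
    "Mistral Nemo 12B": 12.0,
}


def estimate_vram_q4_gb(params_billion: float) -> float:
    """Rough weight-only memory estimate in GB, rounded to one decimal."""
    return round(params_billion * GB_PER_BILLION_PARAMS_Q4, 1)


def fits_on_gpu(params_billion: float, gpu_vram_gb: float) -> bool:
    """True if the estimated weights fit; ignores KV cache and activations."""
    return estimate_vram_q4_gb(params_billion) <= gpu_vram_gb


for name, params in MODELS.items():
    estimate = estimate_vram_q4_gb(params)
    verdict = "fits" if fits_on_gpu(params, 24.0) else "does not fit"
    print(f"{name}: ~{estimate} GB at Q4, {verdict} on a 24 GB card")
```

The estimate covers weights only. Real usage also depends on context length (KV cache), the exact quantization variant, and runtime overhead, so treat a close fit as a warning rather than a guarantee.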