All models

Hybrid Mamba-Transformer mixture-of-experts (MoE) model with a native 256K-token context window (effective context reported beyond 140K tokens). It activates 94B of its 398B total parameters per token. The state-space (Mamba) layers scale linearly with sequence length, which makes the model interesting for very long contexts, although the remaining attention layers still scale quadratically. Licensed under AI21's open model licence, which permits most commercial use.

Context: 256K
License: jamba-open
VRAM Q4: 238.8 GB
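
The listing does not say how the Q4 VRAM figure is derived. As a rough sketch, the number is reproduced by assuming about 0.6 bytes per parameter (roughly 4.8 bits per weight, i.e. 4-bit weights plus quantization overhead) applied to all 398B parameters, counting weights only. The 0.6 factor and the function names below are assumptions for illustration, not something the listing states.

```python
# Back-of-the-envelope memory estimate for a quantized MoE model.
# ASSUMPTION: 0.6 bytes per parameter (~4.8 bits/weight) for a Q4-style
# quantization; the listing does not state the factor it uses.
# Weights only: KV cache, SSM state, and activations are not included.

def q4_weight_gb(total_params_billion: float, bytes_per_param: float = 0.6) -> float:
    """Estimated weight memory in GB (1 GB = 10^9 bytes)."""
    return total_params_billion * bytes_per_param


def active_fraction(active_params_billion: float, total_params_billion: float) -> float:
    """Share of parameters used per token in a mixture-of-experts model."""
    return active_params_billion / total_params_billion


total_b = 398.0   # total parameters, in billions (from the listing)
active_b = 94.0   # active parameters per token, in billions (from the listing)

print(f"Estimated Q4 weights: {q4_weight_gb(total_b):.1f} GB")
print(f"Active per token: {active_fraction(active_b, total_b):.1%}")
```

All experts must be resident in memory even though only about a quarter of the parameters are used per token, which is why the estimate is based on the 398B total rather than the 94B active count.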