Google DeepMind

Gemma

Google's open-weights model family derived from the same research as Gemini. Strong performance at small scales.

History & context

Gemma is Google DeepMind's open-weights model family, built on the same research lineage as Gemini. Gemma 1 (February 2024) introduced the family; Gemma 2 (June 2024) delivered the major quality jump, particularly at 9B and 27B, closing much of the gap with closed frontier models. According to the Gemma 2 technical report, the 2B and 9B models were trained with logit distillation from a larger teacher, while the 27B was trained from scratch.

The Gemma Terms of Use permit commercial use subject to Google's Prohibited Use Policy. The licence is source-available rather than OSI open source: there are field-of-use restrictions, but no monthly-active-user cap, which makes Gemma the preferred choice for some commercial users who can't meet Llama's terms.

Gemma 2 2B is one of the strongest on-device models in its size class, helped by distillation from a larger teacher.
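As a rough illustration of the logit distillation mentioned above, the sketch below computes a per-token loss that pushes a student's next-token distribution toward a teacher's. It is a toy formulation (KL divergence over a three-token vocabulary), not Google's training code; the function names, the temperature parameter, and the example logits are all assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw logits to probabilities (numerically stable)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """KL(teacher || student) over the vocabulary for one token position.

    Hypothetical helper: the teacher's full distribution is the target,
    rather than a single one-hot "correct" token.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return sum(
        p * (math.log(p) - math.log(q))
        for p, q in zip(teacher_probs, student_probs)
        if p > 0
    )


# Toy logits over a three-token vocabulary (illustrative values only).
teacher = [2.0, 1.0, 0.1]
matched = distillation_loss(teacher, teacher)             # student agrees with teacher
mismatched = distillation_loss([0.1, 1.0, 2.0], teacher)  # student disagrees

print(f"matched loss:    {matched:.4f}")
print(f"mismatched loss: {mismatched:.4f}")
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which gives a small student a richer training signal per token than a one-hot label would.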

Flagship model

Gemma 2 27B

27B

Flagship Gemma 2 release. Per the Gemma 2 technical report, the 27B was trained from scratch rather than distilled; Google positions it as competitive with models more than twice its size. A solid choice when the Llama community licence doesn't fit and you need quality in the 27B–40B size range.

Context: 8K
License: gemma
VRAM Q4: 16.2 GB

3 models in this family

Gemma 2 27B

27B

Context: 8K
License: gemma
VRAM Q4: 16.2 GB

Gemma 2 9B

Mid-tier Gemma. Strong general-purpose chat model at small scale. The Gemma Terms of Use permit commercial use subject to Google's Prohibited Use Policy.

Context: 8K
License: gemma
VRAM Q4: 5.4 GB

Gemma 2 2B

2.6B

Compact Gemma variant designed for on-device inference. Trained with knowledge distillation from larger Gemma 2 teachers. Runs comfortably on a phone at Q4.

Context: 8K
License: gemma
VRAM Q4: 1.6 GB
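
The VRAM Q4 figures listed above are consistent with a simple rule of thumb of about 0.6 bytes per parameter (roughly 4.8 bits per weight). The page does not state how its figures were derived, so that constant is an inferred assumption, as are the function name and the nominal 9.0B size used here for Gemma 2 9B. A minimal sketch of the arithmetic:

```python
# Assumption: ~0.6 bytes per parameter at Q4 (about 4.8 bits per weight,
# allowing for quantization scales). Inferred from the listed figures,
# not a documented formula.
BYTES_PER_PARAM_Q4 = 0.6


def estimate_q4_vram_gb(params_billions, bytes_per_param=BYTES_PER_PARAM_Q4):
    """Estimate weight memory in GB for a model of the given size (in billions)."""
    return round(params_billions * bytes_per_param, 1)


# Parameter counts in billions; 9.0 is the nominal size taken from the model name.
models = {"Gemma 2 27B": 27.0, "Gemma 2 9B": 9.0, "Gemma 2 2B": 2.6}
estimates = {name: estimate_q4_vram_gb(size) for name, size in models.items()}

for name, gb in estimates.items():
    print(f"{name}: {gb} GB")
```

This counts weights only. The KV cache and runtime buffers need additional memory, so treat the result as a lower bound when sizing a GPU or phone.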
