OSAIM
Open Source AI Models

Comparison

DeepSeek V3 vs DeepSeek R1

Side-by-side specs, benchmarks and hosted-inference pricing.

Side A
DeepSeek V3
DeepSeek · DeepSeek

671B-parameter MoE model with 37B parameters active per token. Trained for roughly $5.6M of compute, a landmark in cost-efficient frontier training, it delivers frontier-class quality at a fraction of the cost of closed proprietary models. The DeepSeek licence permits commercial use, subject to restrictions on military and unlawful applications. Running V3 yourself requires serious hardware (at least an 8× H100 node at fp8); most teams will use it via the DeepSeek API or providers like Together.

Side B
DeepSeek R1
DeepSeek · DeepSeek

Reasoning model trained with reinforcement learning on top of DeepSeek V3-Base. It is released under the MIT licence, which covers the weights as well as the code, making R1 one of the most permissively licensed frontier reasoning models. R1 generates a long internal chain of thought before answering, trading latency for accuracy on math, code, and reasoning benchmarks. Distilled variants (e.g. R1 Distill Llama 70B) recover most of the quality at much smaller scales.

Specs

                  DeepSeek V3         DeepSeek R1
Parameters        671B                671B
Context length    128K                128K
Modality          text                text
Released          2024-12-26          2025-01-20
License           DeepSeek License    MIT
Commercial use    Yes                 Yes
VRAM fp16         1342 GB             1342 GB
VRAM Q4           402.6 GB            402.6 GB
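
The VRAM figures above follow from simple weights-only arithmetic, sketched below. The 2 bytes per parameter for fp16 and 1 byte for fp8 are standard; the 0.6 bytes per parameter for Q4 is an assumption inferred from the table (402.6 GB / 671B), roughly 4.8 bits per weight once quantization scales are included. The function name is illustrative, and the estimate ignores KV cache and activation memory, so real deployments need headroom on top of it.

```python
# Weights-only memory estimate; KV cache and activations are NOT included.
PARAMS_BILLION = 671.0  # total parameters, from the specs table

# Bytes per parameter. fp16 and fp8 are standard widths; the q4 value is an
# assumption back-derived from the table (402.6 GB / 671B parameters).
BYTES_PER_PARAM = {
    "fp16": 2.0,
    "fp8": 1.0,
    "q4": 0.6,
}


def weight_memory_gb(params_billion: float, precision: str) -> float:
    """Return weights-only memory in GB (1 GB = 1e9 bytes), rounded to 0.1 GB."""
    return round(params_billion * BYTES_PER_PARAM[precision], 1)


for precision in BYTES_PER_PARAM:
    print(f"{precision}: {weight_memory_gb(PARAMS_BILLION, precision)} GB")
```

Both models share the same parameter count, which is why the two columns are identical: the memory footprint depends on architecture and precision, not on how the weights were post-trained.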

Benchmarks

             DeepSeek V3    DeepSeek R1
GPQA         n/a            71.5
HumanEval    82.6           n/a
MATH         84.0           97.3
MMLU         88.5           90.8
MMLU-Pro     75.9           84.0

Cheapest hosted pricing

DeepSeek V3: $0.49 input / $0.89 output per 1M tokens (deepinfra)
DeepSeek R1: $0.55 input / $2.19 output per 1M tokens (deepinfra)

Highlighted cells indicate the better value for that row (higher score, larger context, lower VRAM).
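
To see how the per-token prices listed above turn into a per-request cost, here is a minimal sketch. The `Pricing` class and the token counts are hypothetical; in particular, the assumption that R1 emits about four times as many output tokens as V3 (because its chain of thought is billed as output) is illustrative, not a measured figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hosted price in USD per 1M tokens (values copied from the pricing list)."""
    input_per_million: float
    output_per_million: float

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        """Return the USD cost of one request."""
        total = (input_tokens * self.input_per_million
                 + output_tokens * self.output_per_million)
        return total / 1_000_000


V3 = Pricing(input_per_million=0.49, output_per_million=0.89)
R1 = Pricing(input_per_million=0.55, output_per_million=2.19)

# Hypothetical workload: the same 2,000-token prompt for both models.
# Assumption: R1 produces 4x the output tokens because of its reasoning trace.
v3_cost = V3.cost(input_tokens=2_000, output_tokens=500)
r1_cost = R1.cost(input_tokens=2_000, output_tokens=2_000)

print(f"V3: ${v3_cost:.6f} per request")
print(f"R1: ${r1_cost:.6f} per request")
print(f"R1 / V3 cost ratio: {r1_cost / v3_cost:.2f}x")
```

Under these assumptions the gap between the models is driven mostly by output volume rather than the input price, so it is worth measuring typical R1 response lengths on your own prompts before budgeting.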