LLM leaderboard
Compare large language models for performance, price and more, to find the best match for your needs.
Leaderboard
Largest context
- Gemini 2.5 Pro
- Gemini 2.0 Flash
- Gemini 2.0 Flash-Lite
Highest output tokens
- o1-pro
- o1
- o3-mini
Least expensive
- R1 Distill Llama 8B
- Ministral 3B
- Gemini 1.5 Flash-8B
Model comparison
Model | Input price / 1M tokens | Output price / 1M tokens | Context window | Output token limit | Reasoning model | Open source |
|---|---|---|---|---|---|---|
Gemini 1.5 Flash-8B | $0.04 | $0.15 | 1000000 | 8192 | ||
Ministral 3B | $0.04 | $0.04 | 128000 | 4096 | ||
R1 Distill Llama 8B | $0.04 | $0.04 | 128000 | 8000 | ||
Qwen Turbo | $0.05 | $0.20 | 1000000 | 8192 | ||
GPT-5 Nano | $0.05 | $0.40 | 128000 | 16384 | ||
Coder V2 Lite | $0.06 | $0.18 | 128000 | 8000 | ||
Gemini 2.0 Flash-Lite | $0.07 | $0.30 | 1000000 | 8192 | ||
Gemini 1.5 Flash | $0.07 | $0.30 | 1000000 | 8192 | ||
Gemini 2.0 Flash | $0.10 | $0.40 | 1000000 | 8192 | ||
Llama 3.1 8B | $0.10 | $0.10 | 128000 | 2048 | ||
Ministral 8B | $0.10 | $0.10 | 128000 | 4096 | ||
GPT-4.1 Nano | $0.10 | $0.40 | 128000 | 16384 | ||
Gemma 2 9B | $0.12 | $0.15 | 8000 | 8192 | ||
Coder V2 | $0.14 | $0.28 | 128000 | 8000 | ||
GPT-4o mini | $0.15 | $0.60 | 128000 | 16384 | ||
GPT-4o mini Audio | $0.15 | $0.60 | 128000 | 16384 | ||
Gemma 2 27B | $0.17 | $0.51 | 8000 | 8192 | ||
Mistral Saba | $0.20 | $0.60 | 32000 | 4096 | ||
Grok 4 Fast | $0.20 | $0.50 | 128000 | 8192 | ||
Claude 3 Haiku | $0.25 | $1.25 | 200000 | 4096 | ||
GPT-5 Mini | $0.25 | $2.00 | 128000 | 16384 | ||
V3 | $0.27 | $1.10 | 128000 | 8000 | ||
Codestral | $0.30 | $0.90 | 128000 | 4096 | ||
R1 Distill Qwen 32B | $0.30 | $0.30 | 128000 | 8000 | ||
Gemini 2.5 Flash | $0.30 | $2.50 | 1000000 | 64000 | ||
Grok 3 Mini | $0.30 | $0.50 | 128000 | 8192 | ||
GPT-4.1 Mini | $0.40 | $1.60 | 128000 | 16384 | ||
GPT-3.5 Turbo | $0.50 | $1.50 | 16385 | 4096 | ||
Llama 2 Chat | $0.50 | $0.25 | 4096 | 2048 | ||
QwQ 32B | $0.55 | $0.75 | 131000 | 8192 | ||
DeepSeek Reasoner | $0.55 | $2.19 | 64000 | 8000 | ||
Llama 3.3 70B | $0.59 | $0.70 | 128000 | 2048 | ||
GPT-4o mini Realtime | $0.60 | $2.40 | 128000 | 4096 | ||
Llama 3.2 | $0.60 | $0.60 | 128000 | 2048 | ||
R1 Distill Llama 70B | $0.72 | $0.99 | 128000 | 8000 | ||
Claude 3.5 Haiku | $0.80 | $4.00 | 200000 | 8192 | ||
Qwen 2.5 Coder 32B | $0.80 | $0.80 | 131000 | 8192 | ||
R1 Distill Qwen 14B | $0.88 | $0.88 | 128000 | 8000 | ||
Sonar Reasoning | $1.00 | $5.00 | 127000 | N/A | ||
Sonar | $1.00 | $1.00 | 127000 | N/A | ||
Claude 4.5 Haiku | $1.00 | $5.00 | 200000 | 8192 | ||
o3-mini | $1.10 | $4.40 | 200000 | 100000 | ||
o1-mini | $1.10 | $4.40 | 128000 | 65536 | ||
o4-mini | $1.10 | $4.40 | 200000 | 65536 | ||
Gemini 2.5 Pro | $1.25 | $10.00 | 2000000 | 64000 | ||
GPT-5 | $1.25 | $10.00 | 128000 | 16384 | ||
GPT-5.1 | $1.25 | $10.00 | 128000 | 16384 | ||
GPT-5.1 Codex | $1.25 | $10.00 | 128000 | 16384 | ||
Qwen 2.5 Max | $1.60 | $6.40 | 32000 | 8192 | ||
Mistral Large | $2.00 | $6.00 | 128000 | 4096 | ||
Pixtral Large | $2.00 | $6.00 | 128000 | 4096 | ||
Sonar Reasoning Pro | $2.00 | $8.00 | 128000 | N/A | ||
Sonar Deep Research | $2.00 | $8.00 | 200000 | N/A | ||
Gemini 3 Pro | $2.00 | $12.00 | 1000000 | 64000 | ||
GPT-4.1 | $2.00 | $8.00 | 128000 | 16384 | ||
GPT-4o | $2.50 | $10.00 | 128000 | 16384 | ||
GPT-4o Audio | $2.50 | $10.00 | 128000 | 16384 | ||
Claude 3.7 Sonnet | $3.00 | $15.00 | 200000 | 8192 | ||
Claude 3.5 Sonnet | $3.00 | $15.00 | 200000 | 8192 | ||
Sonar Pro | $3.00 | $15.00 | 200000 | N/A | ||
Claude Sonnet 4.5 | $3.00 | $15.00 | 200000 | 8192 | ||
Grok 3 | $3.00 | $15.00 | 128000 | 8192 | ||
Grok 4 | $3.00 | $15.00 | 128000 | 8192 | ||
Llama 3.1 405B | $3.50 | $3.50 | 128000 | 2048 | ||
GPT-4o Realtime | $5.00 | $20.00 | 128000 | 4096 | ||
GPT-4 Turbo | $10.00 | $30.00 | 128000 | 4096 | ||
o3 | $10.00 | $40.00 | 200000 | 100000 | ||
o3 Deep Research | $10.00 | $40.00 | 200000 | 100000 | ||
o1 | $15.00 | $60.00 | 200000 | 100000 | ||
Claude 3 Opus | $15.00 | $75.00 | 200000 | 4096 | ||
Claude Opus 4 | $15.00 | $75.00 | 200000 | 4096 | ||
GPT-5 Pro | $15.00 | $120.00 | 128000 | 16384 | ||
o3 Pro | $20.00 | $80.00 | 200000 | 100000 | ||
GPT-4 | $30.00 | $60.00 | 8192 | 8192 | ||
GPT-4.5 | $75.00 | $150.00 | 128000 | 16384 | ||
o1-pro | $150.00 | $600.00 | 200000 | 100000 | ||
Gemma 3 1B | N/A | N/A | 32000 | 8192 | ||
Gemma 3 27B | N/A | N/A | 128000 | 8192 | ||
Qwen 2.5 72B | N/A | N/A | 131000 | 8192 | ||
Key definitions
Price: Input price per token is the cost of processing each token in the prompt sent to an LLM, while output price per token is the cost of each token the model generates in response. The price shown in the leaderboard section is a blended price, assuming a typical 3:1 ratio of input to output usage. Some models show a price of 0, which can be the case during a limited free trial.
Context window: The maximum amount of text (in tokens) the model can process at once, including both input and generated output. It determines how much prior conversation or document history the model can "remember" within a single interaction.
Output token limit: The maximum number of tokens an LLM can generate in a single response. This limit is constrained by the model's context window and by provider policies, and it caps the length of the output.
Reasoning model: A reasoning LLM is a model capable of going beyond pattern recognition to perform logical inference and problem-solving. This involves tasks like complex mathematics, planning, and generating "chain of thought" explanations, mimicking human-like cognitive processes. Essentially, it aims to understand and solve problems, not just reproduce text.
Open source: Some LLMs are published under an open-source license, allowing developers to access and modify the code and to host the models themselves, on premises or in the cloud. Others, such as Mistral, are available for self-hosting under license.
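The price and context window definitions above can be made concrete with a little arithmetic. The sketch below is illustrative only. It assumes the blended price is a simple weighted average of the input and output prices at the stated 3:1 ratio (the exact formula is not given here), and that a response is capped by both the output token limit and whatever room the prompt leaves in the context window. The function names `blended_price` and `max_response_tokens` are hypothetical.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: float = 3.0, output_parts: float = 1.0) -> float:
    """Weighted average price per 1M tokens for an input:output usage ratio.

    Assumption: "blended" means a plain weighted average (3:1 by default).
    """
    total_parts = input_parts + output_parts
    return (input_price * input_parts + output_price * output_parts) / total_parts


def max_response_tokens(context_window: int, output_limit: int,
                        prompt_tokens: int) -> int:
    """Tokens the model could still generate for a prompt of a given size.

    Assumption: input and output share the context window, so the response
    is bounded by the smaller of the output limit and the remaining window.
    """
    remaining = max(context_window - prompt_tokens, 0)
    return min(output_limit, remaining)


# GPT-4o mini row: $0.15 input and $0.60 output per 1M tokens
print(round(blended_price(0.15, 0.60), 4))    # 0.2625

# GPT-4 row: 8192-token context window, 8192 output limit, 6000-token prompt
print(max_response_tokens(8192, 8192, 6000))  # 2192
```

The second result shows why the output token limit alone can be misleading: with a long prompt, the context window becomes the binding constraint well before the nominal output limit is reached. Providers may count tokens differently, so treat these figures as estimates.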