# Ollama Model Selection Guide: Best Bang for Buck by Task Type
*Optimizing for model size → effectiveness: smaller models are ranked higher when their performance is competitive.*
## Quick Reference: Top Picks by Category
| Task Type | 🥇 Best Value | 🥈 Mid-Tier | 🏆 Best Performance |
|---|---|---|---|
| Thinking/Reasoning | DeepSeek-R1 (8B distill) | QwQ (32B) | DeepSeek-R1 (671B) |
| Coding | DeepCoder (1.5B/14B) | RNJ-1 (8B) | Qwen3-Coder (30B) |
| Vision | MiniCPM-V (8B) | Qwen3-VL (4B) | Qwen3-VL (32B/235B) |
| Embedding | Nomic-Embed-Text (~137M) | EmbeddingGemma (300M) | Nomic-Embed-V2-MoE (305M active) |
| Tools/Agentic | Ministral 3 (3B) | Mistral Small 3.2 (24B) | Qwen3 (30B-A3B) |
| General Purpose | Gemma3n (e2b) | Gemma3 (4B) | Qwen3 (235B) |
## 🧠 Thinking / Reasoning Models

*Chain-of-thought, math, logic, complex problem solving*

| Model | Size | Active Params | Key Strengths | Value Rating |
|---|---|---|---|---|
| DeepSeek-R1:8b | 8B | 8B | Distilled from 671B, excellent reasoning | ⭐⭐⭐⭐⭐ |
| DeepSeek-R1:1.5b | 1.5B | 1.5B | Tiny but capable for basic reasoning | ⭐⭐⭐⭐ |
| Nemotron-3-Nano | 30B | 3.5B | Hybrid MoE, configurable reasoning | ⭐⭐⭐⭐ |
| QwQ | 32B | 32B | Competitive with o1-mini, DeepSeek-R1 | ⭐⭐⭐ |
| Olmo 3.1 Think | 32B | 32B | Open weights/data, MATH 96.2% | ⭐⭐⭐ |
| DeepSeek-R1:32b | 32B | 32B | Distilled, strong benchmarks | ⭐⭐⭐ |
| Qwen3:30b | 30B | 3B | MoE, thinking mode available | ⭐⭐⭐ |
| DeepSeek-R1:70b | 70B | 70B | Near full-model performance | ⭐⭐ |
| DeepSeek-R1:671b | 671B | 671B | SOTA, approaching o3/Gemini 2.5 Pro | ⭐ |
Recommendation: Start with deepseek-r1:8b - it's the sweet spot for reasoning on limited hardware.
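As a quick sanity check of that pick, here is a minimal sketch using the Ollama Python client (assumes `pip install ollama`, `ollama pull deepseek-r1:8b`, and a running Ollama server; newer client versions also allow attribute access like `response.message.content`):

```python
# Minimal reasoning call via the Ollama Python client (sketch, assumptions above).
import ollama

response = ollama.chat(
    model="deepseek-r1:8b",
    messages=[{
        "role": "user",
        "content": "A bat and a ball cost $1.10 together. The bat costs $1.00 "
                   "more than the ball. How much does the ball cost?",
    }],
)

# The R1 distills include their chain of thought (typically inside <think> tags)
# before the final answer, so expect a long response.
print(response["message"]["content"])
```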
## 💻 Coding Models

*Code generation, repair, completion, agentic coding*

| Model | Size | Active Params | Specialties | Value Rating |
|---|---|---|---|---|
| DeepCoder:1.5b | 1.5B | 1.5B | Code reasoning, RL-tuned | ⭐⭐⭐⭐⭐ |
| Qwen2.5-Coder:3b | 3B | 3B | 40+ languages, code repair | ⭐⭐⭐⭐⭐ |
| Granite3.3:2b | 2B | 2B | FIM support, 128K context | ⭐⭐⭐⭐ |
| RNJ-1 | 8B | 8B | SWE-bench 20.8%, tool use, STEM | ⭐⭐⭐⭐ |
| Qwen2.5-Coder:7b | 7B | 7B | Strong code reasoning | ⭐⭐⭐⭐ |
| DeepCoder:14b | 14B | 14B | o3-mini level (60.6% LiveCodeBench) | ⭐⭐⭐⭐ |
| Devstral | 24B | 24B | #1 open source SWE-bench (46.8%) | ⭐⭐⭐ |
| Devstral-Small-2 | 24B | 24B | SWE-bench 65.8%, agentic | ⭐⭐⭐ |
| Qwen3-Coder:30b | 30B | 3.3B | MoE, 256K native context | ⭐⭐⭐ |
| Qwen2.5-Coder:32b | 32B | 32B | GPT-4o competitive | ⭐⭐ |
| Qwen3-Coder:480b | 480B | 35B | Top agentic coding model | ⭐ |
Recommendation: qwen2.5-coder:7b or rnj-1 for best balance. deepcoder:14b if you need reasoning-heavy code.
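If you prefer the raw REST API to the client library, a code-generation request against the default local server looks roughly like this (model tag and prompt are just examples; the model must already be pulled):

```python
# Code-generation sketch against the local Ollama REST API (default port 11434).
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "qwen2.5-coder:7b",
        "prompt": "Write a Python function that checks whether a string is a palindrome.",
        "stream": False,                  # return a single JSON object, not a token stream
        "options": {"temperature": 0.2},  # lower temperature for more deterministic code
    },
    timeout=300,
)
print(resp.json()["response"])
```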
## 👁️ Vision Models

*Image understanding, OCR, video, multimodal*

| Model | Size | Input Types | Key Features | Value Rating |
|---|---|---|---|---|
| MiniCPM-V | 8B | Image/Video | Beats GPT-4o mini; handles images up to 1.8M pixels | ⭐⭐⭐⭐⭐ |
| Gemma3:4b | 4B | Image | 128K context, multimodal | ⭐⭐⭐⭐⭐ |
| Qwen3-VL:2b | 2B | Image/Video | 256K context, OCR | ⭐⭐⭐⭐ |
| Qwen3-VL:4b | 4B | Image/Video | Visual coding, spatial | ⭐⭐⭐⭐ |
| DeepSeek-OCR | 8B | Image | Specialized OCR, token-efficient | ⭐⭐⭐⭐ |
| Ministral-3:3b | 3B | Image | Edge deployment, 256K context | ⭐⭐⭐⭐ |
| Qwen2.5-VL:3b | 3B | Image | Edge AI, structured outputs | ⭐⭐⭐⭐ |
| Llama3.2-Vision:11b | 11B | Image | Visual recognition, captioning | ⭐⭐⭐ |
| Qwen3-VL:8b | 8B | Image/Video | Balanced performance | ⭐⭐⭐ |
| Mistral-Small-3.1 | 24B | Image | 128K context, fast inference | ⭐⭐⭐ |
| Gemma3:27b | 27B | Image | 128K context, 140+ languages | ⭐⭐ |
| Qwen3-VL:32b | 32B | Image/Video | Visual agent, top on OSWorld | ⭐⭐ |
| Llama3.2-Vision:90b | 90B | Image | Top Llama vision model | ⭐ |
| Qwen3-VL:235b | 235B | Image/Video | SOTA multimodal | ⭐ |
Recommendation: minicpm-v is exceptional value. qwen3-vl:4b for edge deployment with modern features.
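A minimal image-query sketch with the Python client (assumes the model is pulled; `photo.jpg` is a placeholder path, and the client accepts file paths or raw bytes in the `images` field):

```python
# Vision sketch: describe a local image with minicpm-v via the Ollama Python client.
import ollama

response = ollama.chat(
    model="minicpm-v",
    messages=[{
        "role": "user",
        "content": "Describe this image and transcribe any visible text.",
        "images": ["photo.jpg"],  # placeholder path; raw bytes also work
    }],
)
print(response["message"]["content"])
```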
## 🔍 Embedding Models

*Vector search, RAG, semantic similarity, clustering*

| Model | Params | Active | Dimensions | Languages | Value Rating |
|---|---|---|---|---|---|
| Nomic-Embed-Text | ~137M | ~137M | 768 | English | ⭐⭐⭐⭐⭐ |
| EmbeddingGemma | 300M | 300M | Flexible | 100+ | ⭐⭐⭐⭐ |
| Nomic-Embed-V2-MoE | 475M | 305M | 768/256 | ~100 | ⭐⭐⭐⭐ |

**Benchmark comparison**

| Model | BEIR | MIRACL |
|---|---|---|
| Nomic Embed v2 | 52.86 | 65.80 |
| Arctic Embed v2 Base | 55.40 | 59.90 |
| BGE M3 (568M) | 48.80 | 69.20 |
Recommendation: nomic-embed-text for English-only. nomic-embed-text-v2-moe for multilingual with Matryoshka support (256-dim for efficiency).
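A minimal embedding sketch against the batch `/api/embed` endpoint (present in recent Ollama releases; older builds only expose `/api/embeddings` with a single `prompt`). The sentences are placeholders:

```python
# Embed two sentences with nomic-embed-text and compare them with cosine similarity.
import requests

resp = requests.post(
    "http://localhost:11434/api/embed",
    json={"model": "nomic-embed-text",
          "input": ["Ollama runs models locally.", "The cat sat on the mat."]},
    timeout=120,
)
vec_a, vec_b = resp.json()["embeddings"]
print(len(vec_a))  # 768 dimensions for nomic-embed-text

# Full cosine similarity (no assumption that the vectors come back normalized).
dot = sum(x * y for x, y in zip(vec_a, vec_b))
norm = sum(x * x for x in vec_a) ** 0.5 * sum(y * y for y in vec_b) ** 0.5
print(dot / norm)
```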
## 🔧 Tools / Function Calling / Agentic

*Tool use, agents, structured outputs, API integration*

| Model | Size | Active | Key Capabilities | Value Rating |
|---|---|---|---|---|
| Ministral-3:3b | 3B | 3B | Native function calling, JSON | ⭐⭐⭐⭐⭐ |
| Granite3.3:2b | 2B | 2B | Function calling, RAG | ⭐⭐⭐⭐ |
| Ministral-3:8b | 8B | 8B | Best-in-class edge agentic | ⭐⭐⭐⭐ |
| RNJ-1 | 8B | 8B | BFCL leader, SWE-bench strong | ⭐⭐⭐⭐ |
| Qwen3:4b | 4B | 4B | Tool use in thinking/non-thinking | ⭐⭐⭐⭐ |
| Ministral-3:14b | 14B | 14B | Balanced agent performance | ⭐⭐⭐ |
| Mistral-Small-3.2 | 24B | 24B | Improved function calling | ⭐⭐⭐ |
| Qwen3:30b | 30B | 3B | MoE, top open-source agent | ⭐⭐⭐ |
| Qwen3:235b | 235B | 22B | Leading complex agent tasks | ⭐⭐ |
Recommendation: ministral-3:3b or ministral-3:8b for edge agents. qwen3:30b (MoE) for complex workflows.
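A function-calling sketch against `/api/chat` (the tool schema follows the OpenAI-style format Ollama accepts; `get_weather` is a made-up stand-in, and the model tag can be any tools-capable pick from the table):

```python
# Tool-use sketch: let the model decide to call a local Python function.
import requests

def get_weather(city: str) -> str:
    return f"Sunny and 22 °C in {city}"  # placeholder for a real weather lookup

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "ministral-3:3b",
        "messages": [{"role": "user", "content": "What's the weather in Lisbon right now?"}],
        "tools": tools,
        "stream": False,
    },
    timeout=300,
)
message = resp.json()["message"]

# Ollama returns tool arguments as a JSON object, so they can be passed straight through.
for call in message.get("tool_calls", []):
    if call["function"]["name"] == "get_weather":
        print(get_weather(**call["function"]["arguments"]))
```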
## 🎯 General Purpose / Lightweight

*Everyday tasks, chat, summarization, Q&A*

| Model | Size | Active | Context | Features | Value Rating |
|---|---|---|---|---|---|
| Gemma3:270m | 270M | 270M | 32K | Text only, tiny | ⭐⭐⭐⭐⭐ |
| Gemma3n:e2b | ~5B | 2B | - | Selective activation | ⭐⭐⭐⭐⭐ |
| Gemma3:1b | 1B | 1B | 32K | Text, QAT available | ⭐⭐⭐⭐⭐ |
| Granite3.3:2b | 2B | 2B | 128K | Thinking, multilingual | ⭐⭐⭐⭐ |
| Gemma3n:e4b | ~8B | 4B | - | Phones/tablets/laptops | ⭐⭐⭐⭐ |
| Gemma3:4b | 4B | 4B | 128K | Multimodal, versatile | ⭐⭐⭐⭐ |
| Qwen3:4b | 4B | 4B | - | Rivals Qwen2.5-72B | ⭐⭐⭐⭐ |
| Granite3.3:8b | 8B | 8B | 128K | 12 languages, FIM | ⭐⭐⭐ |
| Gemma3:12b | 12B | 12B | 128K | Multimodal, QAT | ⭐⭐⭐ |
| Olmo 3.1:32b-instruct | 32B | 32B | 64K | Fully open, tool support | ⭐⭐ |
| Gemma3:27b | 27B | 27B | 128K | 140+ languages | ⭐⭐ |
Recommendation: gemma3:4b is the best all-around pick; gemma3n:e2b for extreme resource constraints.
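For everyday chat, streaming keeps things feeling responsive even on small models; a minimal sketch with the all-around pick (assumes the model is pulled and the `ollama` package is installed):

```python
# Streaming chat sketch: print tokens as they arrive instead of waiting for the full reply.
import ollama

stream = ollama.chat(
    model="gemma3:4b",
    messages=[{"role": "user",
               "content": "Summarize the benefits of running LLMs locally in three bullets."}],
    stream=True,
)
for chunk in stream:
    print(chunk["message"]["content"], end="", flush=True)
print()
```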
## 📊 Size Tiers Summary

### Tiny (< 3B active params) - Great for edge/mobile
| Model | Task | Why |
|---|---|---|
| Gemma3:270m | Basic chat | Smallest viable |
| DeepCoder:1.5b | Code reasoning | Punches above its weight |
| Granite3.3:2b | General + tools | 128K context |
| Gemma3n:e2b | General | Device-optimized |
| Qwen3-VL:2b | Vision | 256K context |
### Small (3-8B) - Sweet spot for most users
| Model | Task | Why |
|---|---|---|
| DeepSeek-R1:8b | Reasoning | Best small reasoning |
| RNJ-1 | Code + Tools | SWE-bench killer |
| MiniCPM-V | Vision | Beats GPT-4o mini |
| Ministral-3:8b | Agents | Edge-optimized |
| Qwen2.5-Coder:7b | Coding | 40+ languages |
### Medium (14-32B) - Power user territory
| Model | Task | Why |
|---|---|---|
| DeepCoder:14b | Code reasoning | o3-mini level |
| Devstral | Agentic coding | #1 open SWE-bench |
| QwQ | Reasoning | o1-mini competitive |
| Qwen3:30b | General + agents | 3B active (MoE) |
| Olmo 3.1 Think | Reasoning | Fully open |
### Large (> 32B) - Maximum capability
| Model | Task | Why |
|---|---|---|
| DeepSeek-R1:671b | Reasoning | Approaching o3 |
| Qwen3:235b | General | Top-tier all tasks |
| Qwen3-VL:235b | Vision | SOTA multimodal |
| Qwen3-Coder:480b | Coding | Ultimate code agent |
## 🔥 Hot Takes: Personal Recommendations

### "I have 8GB VRAM"
→ Gemma3:4b (general) or DeepSeek-R1:8b (reasoning) or MiniCPM-V (vision)
"I have 16GB VRAM"¶
→ RNJ-1 (code/tools) or DeepCoder:14b (code reasoning) or Qwen2.5-Coder:14b
"I have 24-32GB VRAM"¶
→ Devstral (agentic coding) or QwQ (reasoning) or Qwen3:30b (general MoE)
"I'm building RAG pipelines"¶
→ Nomic-Embed-Text + Qwen3:4b retriever/generator combo
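A compact sketch of that retrieve-then-generate combo (the documents, question, and model tags are placeholders; assumes both models are pulled and the server is on the default port):

```python
# Tiny RAG loop: nomic-embed-text picks the most relevant snippet, qwen3:4b answers from it.
import math
import ollama
import requests

def embed(texts):
    r = requests.post("http://localhost:11434/api/embed",
                      json={"model": "nomic-embed-text", "input": texts}, timeout=120)
    return r.json()["embeddings"]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

docs = [
    "Ollama serves its REST API on port 11434 by default.",
    "Nomic Embed Text produces 768-dimensional embeddings.",
    "Gemma 3 models support a 128K context window.",
]
question = "Which port does the local Ollama API listen on?"

# Retrieve: embed everything and keep the best-scoring document.
doc_vecs = embed(docs)
q_vec = embed([question])[0]
best_doc = max(zip(docs, doc_vecs), key=lambda pair: cosine(q_vec, pair[1]))[0]

# Generate: answer grounded in the retrieved context only.
reply = ollama.chat(
    model="qwen3:4b",
    messages=[{"role": "user",
               "content": f"Answer using only this context:\n{best_doc}\n\nQuestion: {question}"}],
)
print(reply["message"]["content"])
```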
"I need OCR/document processing"¶
→ DeepSeek-OCR (specialized) or MiniCPM-V (general vision)
"I'm doing local AI dev with limited resources"¶
→ Gemma3n:e2b + Nomic-Embed-Text + tool model of choice
*Last updated: based on Ollama model cards as of late 2025.*