Step 6
Choose Your AI Models
Select models based on your RAM. Always install nomic-embed-text first — it's required for document search.
bash — always install this first
# Required for all setups — enables document search (RAG)
ollama pull nomic-embed-text

Pick Your RAM Tier
8 GB RAM
Basic
Entry level — light daily use
llama3.2 — Primary model — 2 GB
mistral — EU alternative — 4 GB
✅ Basic AI chat
✅ Document search
⚡ Limited reasoning
⚡ 10–30 sec responses
💡 Use cloud AI for heavy tasks
16 GB RAM
Standard
Good for daily use
llama3.1:8b — Chief of Staff — 5 GB
llama3.2 — Fast tasks — 2 GB
✅ Good reasoning
✅ Solid document handling
✅ 5–15 sec responses
32 GB RAM ⭐ Recommended
Sweet Spot
Best balance of quality & speed
qwen2.5:14b — Chief of Staff — 9 GB
llama3.1:8b — Mid-tier — 5 GB
llama3.2 — Fast tasks — 2 GB
✅ Excellent orchestration
✅ Strong function calling
✅ 3–10 sec responses
✅ ~15 GB free for OS + app
64 GB RAM
Power
Near GPT-4 quality, fully local
qwen2.5:14b — Chief of Staff — 9 GB
llama3.3:70b — Heavy tasks — 40 GB
llama3.2 — Fast tasks — 2 GB
✅ GPT-4-class reasoning locally
✅ 100% private — nothing leaves your machine
⚡ 70B is slow on CPU — needs a GPU
Install Commands by Tier
bash — 8 GB RAM setup
ollama pull nomic-embed-text # always required
ollama pull llama3.2 # 2 GB — fast small model
ollama pull mistral # 4 GB — EU alternative (optional)
# .env settings:
OLLAMA_MODEL=llama3.2
OLLAMA_FAST_MODEL=llama3.2
OLLAMA_EMBED_MODEL=nomic-embed-text

bash — 16 GB RAM setup
ollama pull nomic-embed-text # always required
ollama pull llama3.1:8b # 5 GB — Chief of Staff
ollama pull llama3.2 # 2 GB — fast tasks
# .env settings:
OLLAMA_MODEL=llama3.1:8b
OLLAMA_FAST_MODEL=llama3.2
OLLAMA_EMBED_MODEL=nomic-embed-text

bash — 32 GB RAM setup (recommended)
ollama pull nomic-embed-text # always required
ollama pull qwen2.5:14b # 9 GB — Chief of Staff
ollama pull llama3.1:8b # 5 GB — mid-tier
ollama pull llama3.2 # 2 GB — fast tasks
# .env settings:
OLLAMA_MODEL=qwen2.5:14b
OLLAMA_FAST_MODEL=llama3.2
OLLAMA_EMBED_MODEL=nomic-embed-text

bash — 64 GB RAM setup
ollama pull nomic-embed-text # always required
ollama pull qwen2.5:14b # 9 GB — Chief of Staff
ollama pull llama3.3:70b # 40 GB — heavy reasoning (needs GPU)
ollama pull llama3.2 # 2 GB — fast tasks
# .env settings:
OLLAMA_MODEL=qwen2.5:14b
OLLAMA_FAST_MODEL=llama3.2
OLLAMA_EMBED_MODEL=nomic-embed-text
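After pulling your tier's models, it is worth checking that every model named in .env is actually installed. A minimal sketch, assuming the .env keys shown above; `check_env_models` is a hypothetical helper, while `ollama list` is the standard command that prints installed models:

```shell
#!/bin/sh
# Verify each OLLAMA_* model from .env appears in an `ollama list` listing.
# Usage: ollama list > /tmp/models.txt && check_env_models .env /tmp/models.txt
check_env_models() {
  envfile=$1
  listing=$2
  status=0
  for key in OLLAMA_MODEL OLLAMA_FAST_MODEL OLLAMA_EMBED_MODEL; do
    model=$(sed -n "s/^$key=//p" "$envfile")
    if [ -z "$model" ]; then
      echo "missing key in .env: $key"; status=1
    elif ! grep -q "^$model" "$listing"; then
      echo "not installed: $model (run: ollama pull $model)"; status=1
    fi
  done
  return $status
}
```

A non-zero exit status means at least one model is missing, and the output names the exact `ollama pull` command to run.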