Optional

NVIDIA GPU Drivers

Skip this section if you have no NVIDIA GPU. CPU-only setups work fine.

💡
Why Add a GPU?

An NVIDIA GPU dramatically accelerates AI inference. A 7B model that takes 15 seconds on CPU can respond in under 1 second with a GPU. Ollama automatically detects and uses NVIDIA GPUs via CUDA.

Step 1 — Verify GPU is Detected

bash
# Check if your NVIDIA card is visible
lspci | grep -i nvidia
# Example output: 01:00.0 VGA compatible controller: NVIDIA Corporation GA106 [GeForce RTX 3060] (rev a1)
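As a cross-check, Ubuntu ships the `ubuntu-drivers` tool (from the `ubuntu-drivers-common` package), which lists detected NVIDIA hardware along with the driver version it recommends. The exact output format varies by release, so treat the sample line below as illustrative:

```bash
# Cross-check: list detected NVIDIA hardware and the recommended driver
# (provided by the ubuntu-drivers-common package)
ubuntu-drivers devices
# Look for a line similar to:
#   driver : nvidia-driver-570 - third-party non-free recommended
```

If the recommended version differs from the one in Step 3, prefer the recommended one unless you have a specific reason not to.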

Step 2 — Clean Any Existing NVIDIA Packages

bash
# Only run if you had a previous failed installation
sudo apt-get remove --purge 'libnvidia-.*'
sudo apt-get remove --purge '^nvidia-.*'
sudo apt-get remove --purge '^cuda-.*'
sudo apt clean && sudo apt autoremove

Step 3 — Install NVIDIA Drivers

bash
# Add graphics drivers PPA
sudo add-apt-repository ppa:graphics-drivers/ppa --yes
sudo apt-get update
sudo update-pciids

# Install driver (570 is current stable — check nvidia.com for latest)
sudo apt-get install nvidia-driver-570 -y
sudo apt-get reinstall linux-headers-$(uname -r)
sudo update-initramfs -u

# Validate DKMS modules
sudo dkms status

# Reboot
sudo reboot

Step 4 — Verify GPU Driver

bash
# After reboot — check GPU status
nvidia-smi
# Should show your GPU name, VRAM, driver version, CUDA version
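If `nvidia-smi` instead reports "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver", the kernel module usually didn't load. The commands below are a minimal triage sequence; on desktop boards the most common cause is Secure Boot rejecting the unsigned DKMS module:

```bash
# Was the module actually built for the running kernel?
sudo dkms status

# Try loading it by hand and read any error it prints
sudo modprobe nvidia

# Check Secure Boot: if it reports "SecureBoot enabled", the unsigned
# module may be blocked; either enroll a MOK key when prompted during
# install, or disable Secure Boot in your firmware settings
mokutil --sb-state
```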

Step 5 — Install CUDA Toolkit

bash
# Download and install CUDA keyring
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update

# Install CUDA toolkit
sudo apt-get install cuda-toolkit -y
sudo apt-get install nvidia-gds -y

# Verify and reboot
sudo dkms status
sudo reboot
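After the reboot, it's worth confirming the toolkit itself. The Ubuntu CUDA packages install under /usr/local/cuda but do not put `nvcc` on your PATH automatically; the lines below are one common way to wire that up, assuming the default install location:

```bash
# Add the CUDA compiler and libraries to your environment
# (paths assume the default /usr/local/cuda install)
echo 'export PATH=/usr/local/cuda/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc

# Verify the compiler is visible
nvcc --version
```

Note that Ollama only needs the driver (Step 3) to use the GPU; the toolkit is for compiling CUDA code yourself.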

Step 6 — Verify Ollama Detects the GPU

bash
# After reboot — test a model with GPU
ollama run llama3.2 "Hello" --verbose
# --verbose prints timing stats after the reply — an eval rate of
# tens of tokens/s (vs. low single digits on CPU) means the GPU is working

# Check which GPU Ollama is using
ollama ps
# The PROCESSOR column should read "100% GPU", not "100% CPU"
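If `ollama ps` still shows CPU, Ollama's own logs say why the GPU was skipped. Assuming the standard systemd service created by the official install script, the logs live in the `ollama` unit:

```bash
# Inspect Ollama's startup logs for CUDA/GPU detection messages
# (assumes the systemd service from the official installer)
journalctl -u ollama --no-pager | grep -iE 'cuda|gpu' | tail -n 20

# Cross-check from the driver side: while a model is loaded,
# the ollama process should appear here with VRAM allocated
nvidia-smi
```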

GPU Compatibility Table

| GPU | VRAM | Models it Handles | Notes |
|---|---|---|---|
| RTX 3060 / 4060 | 12 GB | 7B–13B models fully in GPU | Great entry GPU for AI |
| RTX 3090 / 4090 | 24 GB | 13B–34B models in GPU | Excellent performance |
| RTX 4060 Ti | 16 GB | 13B–20B models in GPU | Recommended mid-range |
| Tesla P100 | 16 GB | 13B–20B via CUDA | Great used/refurb option |
| Tesla V100 32G | 32 GB | 34B models in GPU | Used servers — great value |
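As a rough rule of thumb for reading this table: a Q4-quantized model needs about half a byte per parameter, plus overhead for the KV cache and CUDA buffers. A quick back-of-the-envelope check (the 1.2 overhead factor here is a ballpark assumption, not a measured value):

```bash
# Rough VRAM estimate for a Q4-quantized model:
# params (billions) * ~0.5 bytes/param * ~1.2 overhead (assumed factor)
estimate_vram_gb() {
  awk -v p="$1" 'BEGIN { printf "%.1f\n", p * 0.5 * 1.2 }'
}

estimate_vram_gb 7    # 7B model  -> 4.2  (fits comfortably in 12 GB)
estimate_vram_gb 13   # 13B model -> 7.8
estimate_vram_gb 34   # 34B model -> 20.4 (needs a 24 GB+ card)
```

If the estimate exceeds your VRAM, Ollama will split layers between GPU and CPU, which still works but is much slower than a full GPU fit.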