Complete Guide to Ollama Model Management: Download, Switch, Delete & Version Control
Intro
Honestly, when I first started using Ollama, I didn't realize one thing: model files are huge.
One day my SSD space warning went off. I opened the ~/.ollama directory and found forty-plus GB of models I wasn't even using. How do I delete them? How do I switch versions? How do I avoid this trap again?
This article is the summary of climbing out of that pit. We'll cover Ollama's core model-management commands, from downloading and switching to deleting, plus version-control best practices. After reading, you should be able to manage your local LLM library calmly, with no more disk-space scares.
Core Command Reference Table
First, a table for quick reference. We’ll break down each one later.
| Command | Purpose | Example |
|---|---|---|
| ollama pull | Download a model (optionally a specific version) | ollama pull llama3.2:latest |
| ollama run | Run a model | ollama run llama3.2 |
| ollama list | List all local models | ollama list |
| ollama ps | Show running models | ollama ps |
| ollama stop | Stop a running model | ollama stop llama3.2 |
| ollama rm | Delete a model | ollama rm llama3.2 |
| ollama show | Show model details | ollama show llama3.2 |
| ollama create | Create a custom model | ollama create mymodel -f Modelfile |
| ollama serve | Start the API server | ollama serve |
This table covers 90% of daily management scenarios. Remember these commands; everything else is just combining them.
Model Download: Version Selection Matters
Download Specific Version
Use ollama pull to download models—most people know this. But one detail is easy to miss: version tags.
# Download latest version
ollama pull llama3.2:latest
# Download specific parameter size
ollama pull llama3.1:70b
# Download specific quantization version
ollama pull mistral:7b-q4_K_M
You might ask: what’s the difference between these tags?
Actually quite simple:
- latest: Default tag, the latest stable version
- 7b, 70b: Parameter size; a bigger number means a stronger but more resource-hungry model
- q4_K_M, q3_K_L: Quantization level; a smaller number means a smaller file at slightly lower precision
One sentence: choose version based on your hardware.
Hardware Requirements Reference Table
Don't download blindly. Check this table first and match it against your hardware:
| Model Size | Parameters | File Size | Min RAM | Recommended RAM | VRAM Needed |
|---|---|---|---|---|---|
| Small | 1-3B | 1-2GB | 4GB | 8GB | 2-4GB |
| Medium | 7-8B | 4-5GB | 8GB | 16GB | 6-8GB |
| Large | 13-14B | 7-8GB | 16GB | 32GB | 10-12GB |
| Extra Large | 70B | 40GB+ | 32GB | 64GB+ | 24GB+ |
My experience: an 8GB RAM machine can barely run a 7B model, but it's slow. 16GB feels much better. As for 70B models: honestly, home machines basically can't run them, unless you have something like dual RTX 4090s.
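Before pulling a large model, you can sanity-check the table above against your own machine. Here's a small sketch for Linux (it reads /proc/meminfo, so it won't work as-is on macOS or Windows; the script name and the 8 GB default, taken from the 7B row above, are my own choices):

```shell
#!/bin/bash
# check_ram.sh -- warn if total RAM is below what a model class needs
# Usage: ./check_ram.sh 16   (defaults to 8 GB, the 7B-class minimum above)
need_gb="${1:-8}"

# Total memory in kB from /proc/meminfo (Linux only)
total_kb=$(grep MemTotal /proc/meminfo | awk '{print $2}')
total_gb=$((total_kb / 1024 / 1024))

if [ "$total_gb" -ge "$need_gb" ]; then
  echo "OK: ${total_gb}GB RAM, ${need_gb}GB required"
else
  echo "WARNING: only ${total_gb}GB RAM, this model class wants ${need_gb}GB"
fi
```

Remember this only checks RAM; VRAM limits still apply if you want GPU inference.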
Batch Download Script
If you're downloading multiple models, typing the commands one by one gets tiresome. Write a script:
#!/bin/bash
# download_models.sh
MODELS=(
"llama3.2:latest"
"mistral:7b"
"qwen2:7b"
"deepseek-r1:7b"
)
for model in "${MODELS[@]}"; do
echo "Downloading: $model"
ollama pull "$model"
echo "---"
done
echo "All models downloaded!"
Save as download_models.sh, then execute:
chmod +x download_models.sh
./download_models.sh
This way you can pull all needed models at once.
Model Switching: Don’t Let GPU Idle
List Local Models
After downloading, first see what you have:
ollama list
Output looks like this:
NAME ID SIZE MODIFIED
llama3.2:latest a80c4f17acd5 2.0GB 2 days ago
mistral:7b f974a74358d6 4.1GB 5 days ago
qwen2:7b d53d04290064 2.3GB 10 days ago
Field meanings:
- NAME: Model name and tag
- ID: Unique identifier (you may need it when deleting)
- SIZE: File size, so you can see at a glance who's hogging space
- MODIFIED: Last modified time, useful for judging whether a model is stale
From this output you can quickly tell which models you actually use and which can be deleted.
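To spot the space hogs faster, you can sort that output by the SIZE column. A one-liner sketch (it assumes GNU sort and that SIZE is the third column, formatted like 2.0GB as in the output above; adjust if your Ollama version prints it differently):

```shell
# Largest models first: skip the header row, then sort the 3rd column
# in human-readable numeric order (-h understands K/M/G suffixes)
ollama list | tail -n +2 | sort -k3 -hr
```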
Run Specific Model
Switching models is simple, just ollama run:
# Basic run (GPU acceleration is used automatically when available)
ollama run llama3.2
# Verbose mode (see loading time and generation stats)
ollama run llama3.2 --verbose
One detail: if the model hasn't been downloaded yet, ollama run pulls it automatically first. So you can run it directly without a separate pull.
But note one trap: a model keeps occupying GPU/CPU memory after it starts until you stop it or it gets unloaded. So stop the previous model before switching to the next one.
Stop Running Model
First see which models are running:
ollama ps
Output similar to:
NAME ID SIZE PROCESSOR UNTIL
llama3.2 a80c4f17acd5 2.0GB 100% GPU 4 minutes from now
If model is running, stop it:
ollama stop llama3.2
This frees the GPU memory so you can switch to the next model.
Honestly, I used to forget to stop models, and my machine got slower and slower. Build the habit: before switching, run ps first, then stop.
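If you tend to forget what's still loaded, a small helper that stops everything ollama ps reports makes switching safer. A sketch (the script name is my own):

```shell
#!/bin/bash
# stop_all.sh -- stop every running model before switching to a new one
# Skip the header row of `ollama ps` and grab the first column (model names)
running=$(ollama ps | tail -n +2 | awk '{print $1}')

if [ -z "$running" ]; then
  echo "No models running."
  exit 0
fi

for model in $running; do
  echo "Stopping: $model"
  ollama stop "$model"
done
```

Run it once before every ollama run and the "machine getting slower and slower" problem goes away.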
Delete Models: Key to Freeing Disk Space
Delete Single Model
Finally the key part—deleting models.
ollama rm llama3.2
Output:
deleted 'llama3.2'
It's that simple. But one caution: confirm the model name before deleting so you don't remove the wrong one. It's worth running ollama list first to check.
Also, disk space may not free up immediately after deletion; blobs shared with other models are kept, and cleanup can take a moment. If space doesn't free up right away, wait a few minutes and check again.
Batch Deletion Script (Keep Whitelist)
If your model library is a mess and you want to clean it up without deleting the models you use frequently, write a script:
#!/bin/bash
# cleanup_models.sh
# Whitelist: these models won't be deleted
KEEP=(
"llama3.2:latest"
"mistral:7b"
)
# Get all models
ALL_MODELS=$(ollama list | tail -n +2 | awk '{print $1}')
for model in $ALL_MODELS; do
# Check if in whitelist
keep=false
for keeper in "${KEEP[@]}"; do
if [ "$model" == "$keeper" ]; then
keep=true
break
fi
done
# Not in whitelist then delete
if [ "$keep" = false ]; then
echo "Deleting: $model"
ollama rm "$model"
else
echo "Keeping: $model"
fi
done
echo "Cleanup done!"
This script's logic: list all models, then check each one; whitelisted models are kept, everything else is deleted.
Modify the KEEP array to match your needs and fill in the models you actually use.
Model Storage Path Management
Deleted models, but still not enough space? Maybe path configuration has issues.
By default, models are stored at:
- Windows: C:\Users\<username>\.ollama\models
- Linux/macOS: ~/.ollama/models (a Linux install running Ollama as a system service may use /usr/share/ollama/.ollama/models instead)
If your system drive is running out of space, you can move models to another drive by setting the OLLAMA_MODELS environment variable:
# Linux/macOS
export OLLAMA_MODELS=/path/to/your/models
# Windows (PowerShell)
$env:OLLAMA_MODELS="D:\ollama\models"
After setting it, newly downloaded models are saved to the new path. Old models stay at the original location and need to be moved manually.
My suggestion: set the path at the very beginning instead of waiting until space blows up. Migration is a hassle and easy to get wrong.
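One gotcha on Linux: if Ollama runs as a systemd service, exporting OLLAMA_MODELS in your shell does not affect the service; the variable has to be set on the unit itself. A sketch of the usual steps (the /data/ollama/models path is just an example, and the ollama service user assumes the official install script):

```shell
# Open a systemd override file for the service (this opens an editor)
sudo systemctl edit ollama.service

# In the editor, add:
#   [Service]
#   Environment="OLLAMA_MODELS=/data/ollama/models"

# Make sure the service user can write to the new location
sudo mkdir -p /data/ollama/models
sudo chown -R ollama:ollama /data/ollama/models

# Apply the change
sudo systemctl daemon-reload
sudo systemctl restart ollama
```

On macOS and Windows, where Ollama runs under your own user, the plain environment-variable approach above is enough.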
Version Control: Strategies to Avoid Chaos
Typical Version Chaos Scenario
You might have encountered this situation:
llama3.1:latest
llama3.1:8b
llama3.2:latest
llama3.2:1b
llama3.2:3b
A stack of versions piled together, and you no longer know which one you use daily, which was for testing, and which should be deleted.
Three Suggestions to Avoid Chaos
1. Use tags to distinguish purpose
   - latest: daily-use main version
   - test, dev: test versions
   - Don't download too many parameter sizes; pick the one that fits your hardware
2. Clean up regularly
   - Run ollama list once a week and look for extras
   - Use the cleanup script above to keep the model library lean
3. Use meaningful names
   - When creating custom models, use clear names
   - For example myproject-llama3.2, not mymodel1 or mymodel2
Update Models (Incremental Download)
The model was updated upstream and you want the new version?
Just pull again:
ollama pull llama3.2:latest
Good news: Ollama downloads only the layers that changed rather than re-pulling the entire file, so updates are fast and light on bandwidth.
Custom Model Variants
If you want to adjust model parameters (like change temperature, add system prompt), can use Modelfile:
# Create Modelfile
FROM llama3.2
PARAMETER temperature 0.7
SYSTEM """You are a professional tech assistant, answer questions concisely."""
# Create custom model
ollama create my-tech-assistant -f Modelfile
# Run
ollama run my-tech-assistant
This gives you a customized variant. If you stop needing it later, delete it with the same ollama rm.
Common Problems and Solutions
Download Stuck at 99%
I've hit this pit too: the download sits at the final 1% and suddenly stalls.
The cause is usually a network interruption. Solution:
# Ctrl+C cancel download
# Then re-pull
ollama pull llama3.2
Good news: progress is preserved, no need to download from scratch. Usually second time succeeds.
Space Not Freed After Deletion
Deleted model, df -h shows space still full.
Possible reasons:
- Cleanup process not finished, wait a few minutes
- Residual files exist, manually check path
Manual check:
# View model directory size
du -sh ~/.ollama/models
# If still large, go in and check
ls -lh ~/.ollama/models/blobs/
If you find residual files, delete them manually.
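When hunting for leftovers, sorting the blobs by size surfaces the biggest candidates first. A one-liner sketch, assuming the default storage path:

```shell
# Show the five largest blob files, biggest first
du -h ~/.ollama/models/blobs/* 2>/dev/null | sort -rh | head -5
```

Cross-check any suspiciously large blob against ollama list before deleting it; blobs can be shared between models.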
Model Switch Failed
ollama run error, possibly:
- Insufficient resources: RAM or VRAM not enough
- Configuration error: Modelfile has issues
Solutions:
- First run ollama ps to confirm no other model is occupying resources
- Use --verbose to see detailed error info
- For a custom model, check the Modelfile syntax
Final Words
After all this, the core is really just three actions: download, switch, delete. Master those commands, add a bit of planning (don't download versions blindly), and your local LLM library stays clean.
By the way, if you're an OpenClaw user, model management matters even more: OpenClaw depends on Ollama, and the model version directly affects the application experience. Check your model library regularly, delete unused models, and keep it lean.
If you have questions, check the reference table above or flip back to the relevant section. I hope this article helps you dodge a few pits.
FAQ
How to view all currently downloaded models?
Run ollama list. It shows each model's name, ID, file size, and last-modified time.
Disk space didn't increase immediately after deleting a model?
• Use du -sh ~/.ollama/models to check directory size
• Go into the blobs directory and look for residual files
• Manually delete residual model files
How to avoid chaos from downloading too many versions?
Use tags to distinguish purpose, run ollama list regularly to clean up extras, and give custom models meaningful names.
Model download stuck at 99%, how to solve?
Cancel with Ctrl+C and run ollama pull again. Progress is preserved, so the second attempt usually succeeds.
Can a 70B model run on a home computer?
Generally no. It needs 32GB+ RAM and 24GB+ VRAM, which usually means multiple high-end GPUs.
How to batch update all local models?
ollama list | tail -n +2 | awk '{print $1}' | while read model; do
ollama pull "$model"
done
Ollama downloads only the changed layers, so updates are fast. Run this script regularly to keep the model library current.
7 min read · Published on: Apr 2, 2026 · Modified on: Apr 5, 2026
Related Posts
Ollama Modelfile Parameters Explained: A Complete Guide to Creating Custom Models
Ollama + Open WebUI: Build Your Own Local ChatGPT Interface (Complete Guide)
Ollama API Calls: From curl to OpenAI SDK Compatible Interface