
Complete Guide to Ollama Model Management: Download, Switch, Delete & Version Control

Intro

Honestly, when I first started using Ollama, I didn't realize one thing: model files are huge.

One day my SSD space warning went off. I opened the ~/.ollama directory: forty-plus gigabytes. I thought: I don't even use these models! How do I delete them? How do I switch versions? How do I avoid this trap next time?

This article is what I learned after climbing out of that pit. We'll go through Ollama's core model-management commands, from downloading and switching to deleting, then finish with version-control best practices. By the end, you should be able to manage your local LLM library calmly, with no more disk-space scares.


Core Command Reference Table

First, a table for quick reference. We’ll break down each one later.

| Command | Purpose | Example |
| --- | --- | --- |
| ollama pull | Download model (specific version) | ollama pull llama3.2:latest |
| ollama run | Run model | ollama run llama3.2 |
| ollama list | List all local models | ollama list |
| ollama ps | Show running models | ollama ps |
| ollama stop | Stop running model | ollama stop llama3.2 |
| ollama rm | Delete model | ollama rm llama3.2 |
| ollama show | Show model details | ollama show llama3.2 |
| ollama create | Create custom model | ollama create mymodel -f Modelfile |
| ollama serve | Start API server | ollama serve |

This table covers 90% of daily management scenarios. Remember these commands; the rest is just combining them.


Model Download: Version Selection Matters

Download Specific Version

Most people know to use ollama pull to download models. But one detail is easy to miss: version tags.

# Download latest version
ollama pull llama3.2:latest

# Download specific parameter size
ollama pull llama3.1:70b

# Download specific quantization version
ollama pull mistral:7b-q4_K_M

You might ask: what's the difference between these tags?

It's actually quite simple:

  • latest: the default tag, pointing at the latest stable version
  • 7b, 70b: parameter count; bigger means a stronger model but heavier resource use
  • q4_K_M, q3_K_L: quantization level; a lower number means a smaller file at slightly lower precision

In one sentence: choose the version that matches your hardware.

Hardware Requirements Reference Table

Don’t blindly download. Check this table first, match your config:

| Model Size | Parameters | File Size | Min RAM | Recommended RAM | VRAM Needed |
| --- | --- | --- | --- | --- | --- |
| Small | 1-3B | 1-2GB | 4GB | 8GB | 2-4GB |
| Medium | 7-8B | 4-5GB | 8GB | 16GB | 6-8GB |
| Large | 13-14B | 7-8GB | 16GB | 32GB | 10-12GB |
| Extra Large | 70B | 40GB+ | 32GB | 64GB+ | 24GB+ |

My experience: an 8GB machine can barely run a 7B model, but it's slow; 16GB feels much better. As for 70B models, honestly, home machines basically can't run them unless you have something like dual RTX 4090s.
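Before pulling, you can sanity-check your machine against the table. Here's a minimal sketch (Linux only, since it reads /proc/meminfo; the ram_ok helper and the 8GB default are my own illustration, not part of Ollama):

```shell
#!/bin/bash
# check_ram.sh -- warn before pulling a model your RAM likely can't hold.

# Pure helper so the decision is easy to test:
# succeeds when available RAM meets the model's minimum.
ram_ok() {
  local required_gb=$1 avail_gb=$2
  [ "$avail_gb" -ge "$required_gb" ]
}

required_gb=${1:-8}   # e.g. 8GB minimum for a 7B model, per the table above

if [ -r /proc/meminfo ]; then
  avail_kb=$(awk '/MemAvailable/ {print $2}' /proc/meminfo)
  avail_gb=$(( avail_kb / 1024 / 1024 ))
  if ram_ok "$required_gb" "$avail_gb"; then
    echo "OK: ${avail_gb}GB available for a model needing ${required_gb}GB"
  else
    echo "Warning: only ${avail_gb}GB available; model wants ${required_gb}GB"
  fi
fi
```

Run it as ./check_ram.sh 16 before grabbing a 13B model, for example.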

Batch Download Script

If you're downloading several models, typing the commands one by one gets tedious. Write a script:

#!/bin/bash
# download_models.sh

MODELS=(
  "llama3.2:latest"
  "mistral:7b"
  "qwen2:7b"
  "deepseek-r1:7b"
)

for model in "${MODELS[@]}"; do
  echo "Downloading: $model"
  ollama pull "$model"
  echo "---"
done

echo "All models downloaded!"

Save as download_models.sh, then execute:

chmod +x download_models.sh
./download_models.sh

This way you can pull all the models you need in one go.


Model Switching: Don’t Let GPU Idle

List Local Models

After downloading, first see what you have:

ollama list

Output looks like this:

NAME                ID              SIZE    MODIFIED
llama3.2:latest     a80c4f17acd5    2.0GB   2 days ago
mistral:7b          f974a74358d6    4.1GB   5 days ago
qwen2:7b            d53d04290064    2.3GB   10 days ago

Field meanings:

  • NAME: model name and tag
  • ID: unique identifier (sometimes needed when deleting)
  • SIZE: file size, so you can see at a glance which models are hogging space
  • MODIFIED: last modified time, useful for spotting stale models

One look at this list and you can tell which models you actually use and which can go.
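If the list is long, a small helper can sort it by size so the space hogs float to the top. A sketch (sort_by_size is my own function, parsing the columns shown above):

```shell
#!/bin/bash
# biggest_models.sh -- list local models sorted by size, largest first.

sort_by_size() {
  # stdin: `ollama list` output; stdout: "NAME SIZE" lines, largest first.
  # Strip the trailing "B" so `sort -h` can compare the G/M suffixes.
  tail -n +2 | awk 'NF {sub(/B$/, "", $3); print $1, $3}' | sort -k2,2 -hr
}

# Only run against a real installation.
if command -v ollama >/dev/null; then
  ollama list | sort_by_size
fi
```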

Run Specific Model

Switching models is as simple as ollama run:

# Basic run
ollama run llama3.2

# Verbose mode (shows load time and generation stats)
ollama run llama3.2 --verbose

(Ollama uses your GPU automatically when one is available; there's no separate flag for it.)

One nice detail: if the model isn't downloaded yet, ollama run pulls it automatically, so you can run directly without a separate pull.

But note one trap: a model stays loaded in memory after you use it (by default for about five minutes, unless you stop it manually). If memory is tight, stop the previous model before switching.

Stop Running Model

First see which models are running:

ollama ps

Output similar to:

NAME        ID              SIZE    PROCESSOR   UNTIL
llama3.2    a80c4f17acd5    2.0GB   100% GPU    4 minutes from now

If model is running, stop it:

ollama stop llama3.2

This frees the GPU so you can switch to the next model.

Honestly, I used to forget to stop models, and my machine got slower and slower. Build the habit: ps first, stop, then switch.
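To make that habit cheap, you can script it. A sketch (running_models and stop_all are my own helpers) that stops everything ollama ps reports:

```shell
#!/bin/bash
# stop_all.sh -- stop every running model before switching.

# Parse model names out of `ollama ps` output (read from stdin).
running_models() {
  tail -n +2 | awk 'NF {print $1}'
}

stop_all() {
  ollama ps | running_models | while read -r model; do
    echo "Stopping: $model"
    ollama stop "$model"
  done
}

if command -v ollama >/dev/null; then
  stop_all
fi
```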


Delete Models: Key to Freeing Disk Space

Delete Single Model

Finally, the key part: deleting models.

ollama rm llama3.2

Output:

deleted 'llama3.2'

It's that simple. But one caution: confirm the model name before deleting so you don't remove the wrong one. Run ollama list first to check.

Also, disk space may not free up the moment a model is deleted; there's a cleanup step. If the space doesn't come back right away, wait a few minutes and check again.
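To guard against typos, you can verify the name against ollama list before deleting. A sketch (has_model and safe_rm are my own helpers, not built-in commands):

```shell
#!/bin/bash
# safe_rm.sh -- delete a model only if its exact name exists locally.

# usage: ollama list | has_model NAME
has_model() {
  tail -n +2 | awk '{print $1}' | grep -Fqx "$1"
}

safe_rm() {
  if ollama list | has_model "$1"; then
    ollama rm "$1"
  else
    echo "No local model named '$1'; run ollama list to check"
  fi
}

# Example (uncomment to use):
# safe_rm llama3.2:latest
```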

Batch Deletion Script (Keep Whitelist)

If your model library is a mess and you want to clean it up without deleting your frequently used models, write a script:

#!/bin/bash
# cleanup_models.sh

# Whitelist: these models won't be deleted
KEEP=(
  "llama3.2:latest"
  "mistral:7b"
)

# Get all models
ALL_MODELS=$(ollama list | tail -n +2 | awk '{print $1}')

for model in $ALL_MODELS; do
  # Check if in whitelist
  keep=false
  for keeper in "${KEEP[@]}"; do
    if [ "$model" == "$keeper" ]; then
      keep=true
      break
    fi
  done

  # Delete anything not in the whitelist
  if [ "$keep" = false ]; then
    echo "Deleting: $model"
    ollama rm "$model"
  else
    echo "Keeping: $model"
  fi
done

echo "Cleanup done!"

The script's logic: list all models, then check each one against the whitelist; whitelisted models are kept, everything else is deleted.

Modify the KEEP array to match the models you actually use.

Model Storage Path Management

Deleted some models but still short on space? The storage path may be the issue.

By default, models are stored at:

  • Windows: C:\Users\<username>\.ollama\models
  • Linux/macOS: ~/.ollama/models

If your system drive is running short on space, you can migrate models to another drive by setting the OLLAMA_MODELS environment variable:

# Linux/macOS
export OLLAMA_MODELS=/path/to/your/models

# Windows (PowerShell)
$env:OLLAMA_MODELS="D:\ollama\models"

Once set, newly downloaded models are saved to the new path. Old models stay at the original location and must be moved manually.

My suggestion: set the path from the start rather than waiting for the disk to fill up. Migration is fiddly and easy to get wrong.
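One wrinkle on Linux: if Ollama runs as a systemd service, exporting the variable in your shell isn't enough, because the service doesn't see it. A sketch of a drop-in override (the /data/ollama/models path is just an example):

```shell
# Persist OLLAMA_MODELS for a systemd-managed Ollama (Linux).
sudo mkdir -p /etc/systemd/system/ollama.service.d
sudo tee /etc/systemd/system/ollama.service.d/override.conf <<'EOF'
[Service]
Environment="OLLAMA_MODELS=/data/ollama/models"
EOF
sudo systemctl daemon-reload
sudo systemctl restart ollama
```

Make sure the new directory is readable and writable by the user the service runs as.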


Version Control: Strategies to Avoid Chaos

Typical Version Chaos Scenario

You might have encountered this situation:

llama3.1:latest
llama3.1:8b
llama3.2:latest
llama3.2:1b
llama3.2:3b

With a stack of versions piled together, you no longer know which is your daily driver, which was for testing, and which can be deleted.

Three Suggestions to Avoid Chaos

  1. Use tags to distinguish purpose

    • latest: your daily-use main version
    • test, dev: experimental versions
    • Don't download every parameter size; pick the one that suits your hardware
  2. Clean up regularly

    • Run ollama list once a week and look for strays
    • Use the cleanup script above to keep the library lean
  3. Use meaningful names

    • When creating custom models, name them clearly
    • myproject-llama3.2 beats mymodel1 and mymodel2

Update Models (Incremental Download)

Has a model been updated upstream and you want the new version?

Just pull it again:

ollama pull llama3.2:latest

Good news: Ollama downloads only the changed layers rather than re-pulling the whole file, so updates are fast and easy on bandwidth.

Custom Model Variants

If you want to adjust a model's behavior (say, change the temperature or add a system prompt), use a Modelfile:

# Modelfile
FROM llama3.2
PARAMETER temperature 0.7
SYSTEM """You are a professional tech assistant; answer questions concisely."""

Then build and run the custom model:

# Create the custom model
ollama create my-tech-assistant -f Modelfile

# Run it
ollama run my-tech-assistant

Now you have a customized variant. When you no longer need it, the same ollama rm deletes it.
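If you maintain several variants, generating the Modelfile from a script keeps them consistent. A sketch (make_modelfile is my own helper; the model names are examples):

```shell
#!/bin/bash
# make_variant.sh -- generate a Modelfile for a custom variant, then build it.

# Emit a Modelfile on stdout for a given base model, temperature, and system prompt.
make_modelfile() {
  local base=$1 temp=$2 system=$3
  printf 'FROM %s\nPARAMETER temperature %s\nSYSTEM """%s"""\n' \
    "$base" "$temp" "$system"
}

make_modelfile llama3.2 0.7 "You are a professional tech assistant; be concise." > Modelfile

if command -v ollama >/dev/null; then
  ollama create my-tech-assistant -f Modelfile
fi
```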


Common Problems and Solutions

Download Stuck at 99%

I've hit this pit too: a model download reaches the final 1% and just stops.

The cause is usually a network interruption. The fix:

# Press Ctrl+C to cancel the download
# Then pull again
ollama pull llama3.2

Good news: progress is preserved, so you don't download from scratch. The second attempt usually succeeds.
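You can automate that cancel-and-retry loop. A sketch (pull_with_retry is my own wrapper, leaning on the resume behavior described above):

```shell
#!/bin/bash
# retry_pull.sh -- retry an interrupted pull; progress resumes between attempts.

pull_with_retry() {
  local model=$1 tries=${2:-3} i
  for ((i = 1; i <= tries; i++)); do
    ollama pull "$model" && return 0
    echo "Attempt $i failed, retrying..."
    sleep 2
  done
  echo "Giving up on $model after $tries attempts"
  return 1
}

if command -v ollama >/dev/null; then
  pull_with_retry llama3.2
fi
```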

Space Not Freed After Deletion

You deleted a model, but df -h still shows the disk full.

Possible reasons:

  1. Cleanup process not finished, wait a few minutes
  2. Residual files exist, manually check path

Manual check:

# View model directory size
du -sh ~/.ollama/models

# If still large, go in and check
ls -lh ~/.ollama/models/blobs/

If you find residual files, delete them manually.
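To see what's actually left behind, a helper that lists the biggest blob files can save some digging. A sketch (largest_blobs is my own function; the default path matches the one shown above):

```shell
#!/bin/bash
# largest_blobs.sh -- show the biggest files in the model store.

largest_blobs() {
  local dir=${1:-$HOME/.ollama/models/blobs}
  if [ -d "$dir" ]; then
    du -h "$dir"/* 2>/dev/null | sort -hr | head -n 5
  else
    echo "No blob directory at $dir"
  fi
}

largest_blobs
```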

Model Switch Failed

If ollama run errors out, the likely causes are:

  1. Insufficient resources: not enough RAM or VRAM
  2. Configuration error: a problem in the Modelfile

Solutions:

  • Run ollama ps first to confirm no other model is occupying resources
  • Use --verbose to see detailed error output
  • For a custom model, double-check the Modelfile syntax

Final Words

After all this, the core is really just three actions: download, switch, delete. Master those commands, add a little planning (don't download versions on a whim), and your local LLM library stays clean.

By the way, if you're an OpenClaw user, model management matters even more: OpenClaw depends on Ollama, so the model version directly affects the application experience. Check your model library regularly, delete what you don't use, and keep it lean.

Whenever a question comes up, check the reference table at the top or flip back to the relevant section. I hope this article helps you dodge a few pits.


FAQ

How do I view all currently downloaded models?
Use the ollama list command; it lists every local model's name, ID, size, and modification time. For details on a specific model, use ollama show model-name.

Why didn't disk space free up immediately after deleting a model?
Deletion is followed by a cleanup step, so wait a few minutes and check again. If the space still isn't freed:

• Use du -sh ~/.ollama/models to check the directory size
• Look in the blobs directory for residual files
• Delete any residual model files manually

How do I avoid chaos from downloading too many versions?
Use the latest tag as your daily main version, and tag test builds test or dev. Don't download many different parameter sizes; pick the one that fits your hardware. Check regularly with ollama list and clean out unused old versions.

What if a model download gets stuck at 99%?
It's usually a network interruption. Press Ctrl+C to cancel, then re-run the ollama pull command. Progress is preserved, so you don't start from scratch; the second attempt usually succeeds.

Can a 70B model run on a home computer?
Basically no. A 70B model needs 32GB+ RAM and at least 24GB of VRAM; an ordinary home computer (even with a single RTX 4090) can't handle it. Use 7B or 14B models instead, or rent a cloud server for the large ones.
How do I batch-update all local models?
Write a simple script:

ollama list | tail -n +2 | awk '{print $1}' | while read -r model; do
  ollama pull "$model"
done

Ollama downloads only the changed layers, so updates are fast. Run this script periodically to keep your model library current.

7 min read · Published on: Apr 2, 2026 · Modified on: Apr 5, 2026
