Docker Image Optimization in Action: Slimming Down from 1GB to 100MB
Monday morning, 9 AM. The CI/CD pipeline’s red light is flashing. I’m staring at a build that’s been running for 8 minutes. This deployment only changed a few lines of code, but the image upload alone takes 5 minutes—because the image is a whopping 1.2GB.
My boss messages in the group chat: “Why isn’t it live yet?”
I screenshot the build status and send it: Image uploading, 23% progress.
In that moment, I made a decision: This has to stop.
Later, I applied a series of optimizations to this Node.js application’s Docker image. The result? 98MB. Build time dropped from 8 minutes to 2 minutes. That’s a 12x smaller image and a 4x faster build.
In this article, I’ll share these hands-on experiences with you. No fluff—every step has data comparisons, every code snippet is ready to use.
Why Is Your Image So Large?
Honestly, when I first ran docker history on that 1.2GB image, I was a bit confused.
docker history my-app:latest
The output looked something like this:
IMAGE          CREATED       SIZE
abc123def456   2 hours ago   850MB   # npm install artifacts
def456abc123   2 hours ago   180MB   # base image node:18
...
850MB just in the npm install layer. I paused for a moment—how did this get so huge?
Four Culprits Behind Image Bloat
Problem #1: Wrong base image choice.
My Dockerfile started with:
FROM node:18
The node:18 image is based on Debian and comes with a lot of things I don’t need: package managers, system utilities, development libraries. The base image alone is 900MB.
Check out this comparison of official images:
| Image | Size |
|---|---|
| node:18 | ~900MB |
| node:18-slim | ~230MB |
| node:18-alpine | ~170MB |
| alpine:3.18 | ~5.5MB |
| distroless/static | ~2MB |
See that? Just switching to node:18-alpine saves 730MB.
Problem #2: Build tools left behind.
I installed gcc, make, and python in the image because some npm packages need to be compiled. But here’s the thing—after compilation, I just left them there, eating up space in the image.
Problem #3: Layer stacking effect.
Each RUN instruction creates a new layer. My Dockerfile had a dozen RUN commands, and each layer carries all the files from previous layers. Deleted files are still there—they’re just “covered up.”
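You can see this effect with a two-line experiment (the file size and path are illustrative):

```dockerfile
# The 100MB file is written in one layer and "deleted" in the next,
# but the first layer still carries it: the final image stays
# roughly 100MB larger than alpine:3.18 itself.
FROM alpine:3.18
RUN dd if=/dev/zero of=/tmp/big.bin bs=1M count=100
RUN rm /tmp/big.bin
```

Run `docker history` on the result and the 100MB layer is still there, sitting under the `rm`.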
Problem #4: Poor cache optimization.
I put COPY . . at the very beginning, which means—every time I change one line of code, the entire image has to be rebuilt. npm install runs every time, downloading all dependencies again.
See It Clearly with dive
docker history alone isn’t intuitive enough. I recommend a tool called dive that can “peel apart” each layer of your image.
Installation is simple:
# macOS
brew install dive
# Linux
wget https://github.com/wagoodman/dive/releases/download/v0.12.0/dive_0.12.0_linux_amd64.deb
sudo dpkg -i dive_0.12.0_linux_amd64.deb
Then analyze your image:
dive my-app:latest
You’ll see an interactive interface with layers on the left and file changes on the right. Use arrow keys to navigate layers and clearly see which files were added, deleted, or modified in each layer.
The first time I used it, I discovered: node_modules appeared twice—once in /app/node_modules and once in a build stage temporary directory. How much space was that wasting?
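Beyond the interactive view, dive also has a CI mode that fails the build when an image is too wasteful. A sketch of a `.dive-ci` config, assuming thresholds like these suit your project:

```yaml
rules:
  # fail if less than 90% of the image's bytes are used efficiently
  lowestEfficiency: 0.9
  # fail if more than 20MB is wasted on duplicated or deleted files
  highestWastedBytes: 20MB
  highestUserWastedPercent: 0.1
```

Then run `dive --ci my-app:latest` in your pipeline instead of the interactive mode.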
5-Step Optimization Framework
Alright, problems identified. Now for solutions.
I’ve organized these methods into a 5-step framework, each with specific results data.
Step 1: Choose a Lightweight Base Image
This is the easiest step—change one line of code and see results.
# Before
FROM node:18
# After
FROM node:18-alpine
The effect? 900MB → 170MB. That’s 730MB saved, just by switching the base image.
How do you choose among the three lightweight options?
| Image Type | Use Case | Pros & Cons |
|---|---|---|
| Alpine | General purpose | Small, rich package manager, but musl may have compatibility issues |
| Distroless | Security-first | Minimal, no shell, but difficult to debug |
| Scratch | Statically compiled languages (Go, Rust) | Smallest (0MB), requires static compilation |
In most cases, Alpine is a solid choice. The official Node.js -alpine images ship a Node binary built against musl, so most libc compatibility issues are already handled for you.
Step 2: Use Multi-Stage Builds
This is a game-changer. The principle is simple: separate the build environment from the runtime environment, and the final image only keeps what’s needed to run.
# Build stage
# Use the same alpine base as the runtime stage, so any native modules compiled here work with musl
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
# Runtime stage
FROM node:18-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
CMD ["node", "dist/main.js"]
The key is the COPY --from=builder line—it only “steals” the files you need from the builder stage.
My 1.2GB image, after adding multi-stage builds, dropped to 200MB. The build stage’s compilers, caches, and intermediate files were all discarded; only dist and node_modules make it into the final image.
Step 3: Optimize Layer Caching
Docker’s layer caching works like this: if a layer hasn’t changed, use the cache.
The problem was, my Dockerfile had the wrong order:
# Wrong approach
COPY . . # Copy all files first
RUN npm install # Then install dependencies
Written this way: change one line of code → COPY . . changes → cache invalidates → npm install runs again.
The correct approach:
# Correct approach
COPY package*.json ./ # Copy dependency files first
RUN npm install # Install dependencies (cache reused if deps unchanged)
COPY . . # Copy source code last (source changes often, put it last)
After this change, as long as package.json doesn’t change, npm install uses the cache directly. Build time dropped from 3 minutes to 30 seconds.
Another tip: merge RUN instructions.
# Before (4 layers)
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get clean
RUN rm -rf /var/lib/apt/lists/*
# After (1 layer)
RUN apt-get update && \
apt-get install -y --no-install-recommends curl && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
Why? Because each RUN is a layer. Files you “delete” are still in previous layers. Merging into one instruction prevents intermediate files from entering the image.
Step 4: Configure .dockerignore
This step is often overlooked. When you run docker build, Docker packages the entire directory and sends it to the daemon. If you have 500MB of node_modules locally, it all gets packaged up.
Create .dockerignore in your project root:
.git
node_modules
*.log
.env
docker-compose.yml
README.md
.vscode
tests
coverage
The effect? Build context drops from 500MB to 50MB. The docker build command runs much faster.
Step 5: Clean Up Unnecessary Files
If you must install packages with apt, remember to clean up:
RUN apt-get update && \
apt-get install -y --no-install-recommends \
curl \
ca-certificates && \
apt-get clean && \
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
Key points:
- `--no-install-recommends` skips “recommended” packages, saving significant space
- `apt-get clean` clears the apt cache
- `rm -rf /var/lib/apt/lists/*` deletes the package lists
After this step, you save another 50-100MB.
Real-World Cases: Complete Optimization for 3 Languages
Talk is cheap. I’ve prepared complete Dockerfiles for three languages—copy and run them directly.
Node.js Application
Starting point: node:18 base image, 900MB
# Optimized complete Dockerfile
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Drop devDependencies now that the build is done
RUN npm prune --production

FROM node:18-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
USER node
CMD ["node", "dist/main.js"]
Optimization path:
- Switch to alpine: 900MB → 170MB
- Multi-stage build: 170MB → 120MB
- Ship only production dependencies in the final image: 120MB → 98MB
Go Application
Go is a statically compiled language, naturally suited for Docker optimization.
# Build stage
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.* ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -ldflags="-s -w" -o main .
# Runtime stage (using scratch empty image)
FROM scratch
COPY --from=builder /app/main /main
ENTRYPOINT ["/main"]
Result: 800MB → under 10MB.
You read that right. scratch is an empty image containing only your compiled binary. Go’s static compilation doesn’t depend on any dynamic libraries—it just runs.
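One caveat: scratch contains no CA certificates or timezone data, so HTTPS calls from your binary fail out of the box. A common fix, sketched here on top of the builder stage above, is to copy the certificate bundle across:

```dockerfile
FROM golang:1.22-alpine AS builder
RUN apk add --no-cache ca-certificates
# ... same build steps as above ...

FROM scratch
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /app/main /main
ENTRYPOINT ["/main"]
```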
Python Application
Python is a bit more complex because Alpine uses musl instead of glibc, which can cause issues with some packages.
FROM python:3.11-alpine AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --user -r requirements.txt
COPY . .
FROM python:3.11-alpine
WORKDIR /app
COPY --from=builder /root/.local /root/.local
COPY --from=builder /app .
ENV PATH=/root/.local/bin:$PATH
CMD ["python", "main.py"]
Important notes:
- If `pip install` fails, you may need to install `musl-dev` and `gcc`
- Some scientific computing packages (numpy, pandas) may have performance issues on Alpine
- For stability, consider `python:3.11-slim` (Debian-based)
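If Alpine gives you trouble, here is the same multi-stage pattern on python:3.11-slim, a sketch assuming your entrypoint is `main.py`. It’s larger than Alpine, but prebuilt glibc wheels for numpy and friends install without a compiler:

```dockerfile
FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --user -r requirements.txt

FROM python:3.11-slim
WORKDIR /app
COPY --from=builder /root/.local /root/.local
COPY . .
ENV PATH=/root/.local/bin:$PATH
CMD ["python", "main.py"]
```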
Common Pitfalls and Solutions
I’ve stepped into plenty of traps during optimization. Let me warn you ahead of time.
Pitfall 1: Alpine’s musl Compatibility Issues
Symptom: An npm package fails to install, reporting missing glibc.
Cause: Alpine uses musl libc, not standard glibc. Some packages depend on glibc.
Solutions:
- For Node.js: use the official `node:18-alpine` images; most issues are already handled
- For Python: if installation fails, switch to the Debian-based `python:3.11-slim`
- If you must use Alpine: try the `gcompat` compatibility layer (`apk add gcompat`)
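When you do need a compiler on Alpine, a common pattern is to install the toolchain as a named virtual group and remove it in the same RUN, so it never becomes a layer of its own (the package names here are illustrative, for a typical Python build):

```dockerfile
# Install build tools, compile dependencies, then delete the whole
# group in one layer so the toolchain never lands in the image
RUN apk add --no-cache --virtual .build-deps gcc musl-dev libffi-dev && \
    pip install -r requirements.txt && \
    apk del .build-deps
```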
Pitfall 2: Scratch Images Can’t Be Debugged
Symptom: docker exec -it container sh fails because scratch has no shell.
Solutions:
- Use scratch for production, alpine for debugging stages
- Or build a separate debug image:
FROM alpine
COPY --from=production /app /app
CMD ["sh"]
Pitfall 3: CI/CD Cache Loss
Symptom: Local builds are fast, but CI starts from scratch every time.
Cause: CI environments don’t preserve Docker cache.
Solution: Use BuildKit’s cache mount:
# syntax=docker/dockerfile:1
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN --mount=type=cache,target=/root/.npm \
npm install
This way npm cache persists, even in CI.
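If your CI can’t keep any local state at all, BuildKit can also push the build cache to a registry so the next run pulls it back down. A sketch with buildx (the registry ref is a placeholder):

```shell
docker buildx build \
  --cache-to   type=registry,ref=registry.example.com/my-app:buildcache,mode=max \
  --cache-from type=registry,ref=registry.example.com/my-app:buildcache \
  -t my-app:latest .
```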
Pitfall 4: Over-aggressive .dockerignore Exclusion
Symptom: Build fails, complaining about missing files.
Cause: .dockerignore excluded files you actually need.
Solutions:
- Progressive exclusion—start with obvious ones (node_modules, .git)
- Use `docker build --no-cache` to verify clean builds
- For exceptions, use `!` negation (order matters—the re-include must come after the exclude):

tests
!tests/fixtures
Verification and Continuous Improvement
Optimization done—how do you verify the results?
Verification Methods
1. Check image size
docker images my-app
2. View layer details
docker history my-app:latest --no-trunc
3. Visual analysis
dive my-app:latest
Performance Trade-offs
Smaller images—but will there be runtime issues?
Alpine’s musl vs glibc:
- musl is lighter, but slightly slower in some scenarios
- If your application makes heavy system calls, benchmark and compare
Scratch’s security:
- Minimal attack surface, most secure
- But when problems arise, you can’t enter the container to troubleshoot
My recommendation: From a security perspective, use scratch whenever possible. Switch to alpine for debugging. In CI, build both images—the one with -debug suffix is the debug version.
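This prod-plus-debug setup fits in a single Dockerfile using build targets. A minimal sketch (the stage names and the curl package are my choices, not a fixed convention):

```dockerfile
FROM node:18-alpine AS prod
WORKDIR /app
COPY . .
CMD ["node", "dist/main.js"]

# debug image: same contents as prod, plus a shell and curl
FROM prod AS debug
RUN apk add --no-cache curl
CMD ["sh"]
```

Build each with `docker build --target prod -t my-app:latest .` and `docker build --target debug -t my-app:latest-debug .`.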
Continuous Improvement Suggestions
- Regularly update base images: Check once a month—security fixes are frequent
- Monitor image size: Add a check in CI, alert if over 100MB
- Use hadolint: Dockerfile static analysis tool, catches issues early
docker run --rm -i hadolint/hadolint < Dockerfile
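The size alert from the list above can be sketched as a small shell gate (the 100MB budget and image name are placeholders, and `size_ok` is a helper I made up for the check):

```shell
#!/bin/sh
# Fail the CI job when the image exceeds the size budget.
LIMIT_MB=100

# In CI, compute the size in MB with something like:
#   SIZE_MB=$(docker image inspect my-app:latest --format '{{.Size}}' \
#             | awk '{printf "%d", $1 / 1024 / 1024}')
size_ok() {
  # $1: image size in MB
  [ "$1" -le "$LIMIT_MB" ]
}

SIZE_MB=98  # stand-in value; in CI this comes from docker as shown above
if size_ok "$SIZE_MB"; then
  echo "image size OK"
else
  echo "image over ${LIMIT_MB}MB budget" >&2
  exit 1
fi
```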
Recommended Tools
| Tool | Purpose | Installation |
|---|---|---|
| dive | Image analysis | brew install dive |
| hadolint | Dockerfile linting | brew install hadolint |
| docker-slim | Auto-slimming | brew install docker-slim |
docker-slim is interesting—it analyzes which files your image actually uses at runtime, then deletes everything else. But try it in a test environment first—don’t break production.
Summary
After all that, the core is just 5 steps:
- Choose the right base image: Use alpine when possible, use scratch for static compilation
- Multi-stage builds: Separate build from runtime, keep only what’s needed
- Optimize layer caching: Dependency files first, source code last
- Configure .dockerignore: Exclude unnecessary files
- Clean up residual files: Delete apt caches, temporary files
Go check your Docker images now. Use docker history to see which layer takes the most space, then try the methods in this article.
See you in the comments—tell me your optimization results: from how many MB down to how many MB?
9 min read · Published on: Mar 20, 2026 · Modified on: Mar 20, 2026