Docker Compose Service Dependencies: Solving Database Startup Sequence with Healthchecks

Friday night, 10 PM. I’m staring at error logs scrolling through my terminal, mentally calculating whether I’ll make it home on time.
```
web_1 | Error: connect ECONNREFUSED 127.0.0.1:5432
web_1 |     at TCPConnectWrap.afterConnect
db_1  | PostgreSQL init process complete; ready for start up.
web_1 | Exited with code 1
web_1 | Restarting...
```

The application container keeps restarting. The database has started, but it's always a beat behind. I check docker-compose.yml: depends_on is configured. Why isn't it working?
This issue has plagued countless developers. You’ve probably experienced it: run docker-compose up in your local development environment, and the first two attempts always fail. You wait about ten seconds for a few restarts before things run normally. When new teammates ask “is this normal?” you can only awkwardly say “try a few more times.”
The root cause is simple: Docker’s depends_on only manages container startup order, not whether services are actually ready.
In this article, I’ll walk you through:
- The three condition configurations for depends_on (90% of people only know the default)
- Correct healthcheck configurations for PostgreSQL and MySQL (with complete examples)
- Modern alternatives to wait-for-it scripts
- A troubleshooting checklist to make your container startups rock solid
Why depends_on Isn’t Enough: Started ≠ Ready
There’s a particularly important quote in the Docker official documentation that’s easy to overlook:
Compose does not wait until a container is “ready”, only until it’s running.
In other words: Compose only waits for the container to be running; it doesn't check whether the service inside is actually usable.
The Time Gap Between Container Startup and Service Readiness
Imagine the PostgreSQL container startup process:
- 0 seconds: Docker starts the container, postgres process launches ← depends_on releases here
- 2 seconds: Initialize data directory
- 5 seconds: Load configuration files
- 8 seconds: Execute init scripts (if any)
- 12 seconds: Finally ready to accept connections
There’s a 12-second gap. If your web application tries to connect to the database at second 1, the result will inevitably be “Connection refused.”
Real cases are more extreme. I once maintained a legacy project where the database initialization script had to import 500MB of test data—that process alone took 40 seconds. With the default depends_on configuration, the application container had to crash and restart at least 5 times before connecting.
Three Conditions of depends_on
Many people don’t know that depends_on actually supports three conditions:
```yaml
services:
  web:
    depends_on:
      db:
        condition: service_started  # Default: container started is enough
        # condition: service_healthy  # Wait for the healthcheck to pass
        # condition: service_completed_successfully  # Wait for a successful exit (init containers)
```

service_started (default): Continue as long as the container is in the running state. This is why depends_on alone still causes problems.
service_healthy: Must wait for healthcheck to pass and container status to become “healthy” before continuing. This is what we really need.
service_completed_successfully: Wait for container to exit successfully (exit code 0). Suitable for one-time tasks like data migration.
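As a sketch of how the third condition is used: a one-shot migration container can gate the app behind both a healthy database and a completed migration. The image name and npm script here are hypothetical:

```yaml
services:
  migrate:
    image: myapp:latest        # hypothetical image
    command: npm run migrate   # hypothetical one-shot migration task
    depends_on:
      db:
        condition: service_healthy
  web:
    image: myapp:latest
    depends_on:
      migrate:
        condition: service_completed_successfully  # wait for exit code 0
```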
Why Isn’t service_healthy the Default?
You might ask, if service_healthy is so useful, why isn’t it the default?
Two reasons:
- Not all services need healthchecks (like pure stateless workers)
- Healthchecks require your configuration—Docker doesn’t know how your service defines “ready”
This brings us to the next topic: how to configure healthchecks.
Complete Healthcheck Configuration Guide
The healthcheck principle is straightforward: Docker periodically runs a command. If it returns 0, the service is healthy; if it returns any non-zero code, it's unhealthy.
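You can see the mechanics in a plain shell. The function below is just a stand-in for a real check like pg_isready; the point is that Docker only ever looks at the exit code:

```shell
#!/bin/sh
# Stand-in "health command": succeeds (exit 0) only once a flag file exists.
# Docker treats exit 0 as healthy and any non-zero exit code as unhealthy.
READY_FILE=$(mktemp -u)   # a path that does not exist yet

health_cmd() {
  [ -f "$READY_FILE" ]
}

if health_cmd; then echo healthy; else echo unhealthy; fi   # → unhealthy

touch "$READY_FILE"       # the "service" comes up
if health_cmd; then echo healthy; else echo unhealthy; fi   # → healthy
rm -f "$READY_FILE"
```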
Complete Healthcheck Configuration
```yaml
services:
  db:
    image: postgres:16
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]  # Check command
      interval: 10s      # Check every 10 seconds
      timeout: 5s        # Single check times out after 5 seconds
      retries: 3         # Mark unhealthy after 3 consecutive failures
      start_period: 30s  # Failures within 30 seconds of startup don't count toward retries
```

All five parameters are important. Let's go through them one by one.
test: Check Command
Two formats:
```yaml
# Method 1: Use a shell (recommended)
test: ["CMD-SHELL", "pg_isready -U postgres"]

# Method 2: Execute the command directly (no shell)
test: ["CMD", "pg_isready", "-U", "postgres"]
```

CMD-SHELL works in most cases because you can use shell features like pipes and redirects.
Common pitfall: Tools in the check command must exist in the image. For example, using curl to check HTTP interfaces when curl isn’t installed in the image means healthcheck will always fail. I’ve hit this snag before—spent half an hour before realizing I needed to add RUN apk add curl to the Dockerfile.
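For the record, the fix for that image was a one-line addition. A minimal sketch (the base image, package, and health endpoint here are illustrative):

```dockerfile
FROM alpine:3.19
RUN apk add --no-cache curl
# The healthcheck can also be baked into the image itself:
HEALTHCHECK CMD curl -f http://localhost/health || exit 1
```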
interval: Check Interval
How often to run the check. Too frequent wastes resources; too infrequent delays detection.
- 10 seconds is a good default for most scenarios
- Critical services like databases can be set to 5 seconds
- Lightweight services can use 15-30 seconds
timeout: Single Check Timeout
Timeout for a single check. If the command hangs, Docker will wait this long before giving up.
Too short causes false positives, too long affects fault detection speed. 5-10 seconds is a safe range.
retries: Failure Retry Count
How many consecutive failures before marking as unhealthy.
This is a debounce mechanism. Occasional network jitter or brief database overload can cause single check failures—retries make the system more robust.
3-5 times is reasonable. retries=1 is too sensitive, retries=10 too sluggish.
start_period: Startup Grace Period (Most Easily Overlooked)
This is the most easily overlooked and most error-prone parameter.
Failures within start_period don’t count toward retries. In other words, it gives the service a “startup buffer.”
Why is it important? Databases need time to start. PostgreSQL initializes data directories, MySQL loads table indexes. Without start_period, healthchecks start failing at second 2, and the service might be marked unhealthy before retries run out.
Recommended values:
- PostgreSQL/MySQL: 30-60 seconds
- Lightweight services (Redis): 15-30 seconds
- With extensive initialization scripts: Can set to 120 seconds
I usually set 60 seconds—better to wait a bit longer than get false positives.
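Taken together, these parameters bound the worst-case time to detection: failures are free during start_period, and after that roughly interval × retries seconds of consecutive failures must pass before the container is flagged unhealthy. A quick sanity check with the values above:

```shell
#!/bin/sh
# Rough worst case before "unhealthy" (seconds): start_period + interval * retries
START_PERIOD=60
INTERVAL=10
RETRIES=3
echo $((START_PERIOD + INTERVAL * RETRIES))   # → 90
```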
Common Errors and Pitfall Guide
Error 1: Incorrect Environment Variable Reference
```yaml
# ❌ Wrong: Compose interpolates before startup; the container gets the host's variable value
test: ["CMD", "mysqladmin", "ping", "-p$MYSQL_ROOT_PASSWORD"]

# ✅ Correct: escape with $$ so the container's shell does the expansion
test: ["CMD-SHELL", "mysqladmin ping -p$$MYSQL_ROOT_PASSWORD"]
```

Error 2: start_period Too Short
```yaml
# ❌ Database still initializing, but failures already count; quickly marked unhealthy
healthcheck:
  test: ["CMD", "pg_isready"]
  interval: 5s
  retries: 3
  start_period: 10s  # Too short!

# ✅ Give it enough startup time
healthcheck:
  start_period: 60s  # Much better
```

Error 3: Check Tool Doesn't Exist
This error is particularly insidious because Docker just fails silently.
```yaml
# ❌ If the image doesn't have curl, the healthcheck always fails
test: ["CMD", "curl", "-f", "http://localhost/health"]

# ✅ Make sure the tool exists, or use one built into the image
test: ["CMD", "wget", "--spider", "http://localhost/health"]  # Alpine images ship wget
```

With healthcheck configured, let's see how to configure specific databases.
PostgreSQL Healthcheck Practical Configuration
PostgreSQL’s official image comes with a gem: pg_isready.
This tool is specifically designed to check if PostgreSQL is ready—much more reliable than writing SQL queries yourself.
Basic Configuration (Recommended)
```yaml
version: '3.8'
services:
  web:
    image: node:20-alpine
    depends_on:
      db:
        condition: service_healthy  # Key: wait for the healthcheck to pass
        restart: true               # Restart the app when the database restarts
    environment:
      DATABASE_URL: postgresql://postgres:password@db:5432/myapp
    command: npm start

  db:
    image: postgres:16
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: password
      POSTGRES_DB: myapp
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres -d myapp"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 60s
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:
```

pg_isready Command Explained
```shell
pg_isready -U postgres -d myapp
```

- -U: Specify the username; it must be an actually existing user
- -d: Specify the database name (optional but recommended)
Why add -U? Without it, pg_isready tries to connect using the current system user, filling logs with warnings. Doesn’t affect functionality, but it’s annoying.
Advanced Configuration: Add Actual Query
pg_isready only checks if the port is reachable, not whether the database can actually execute queries. If you need stricter checking:
```yaml
healthcheck:
  test: ["CMD-SHELL", "pg_isready -U postgres && psql -U postgres -d myapp -c 'SELECT 1'"]
  interval: 10s
  timeout: 10s  # Note: timeout needs to increase because of the extra query
  retries: 3
  start_period: 60s
```

SELECT 1 is the simplest possible query. If it executes successfully, the database is not only started but can process SQL normally.
However, for most scenarios, basic pg_isready is sufficient.
Actual Running Effect
After configuration, let’s start it:
```
$ docker-compose up
Creating network "myapp_default" ... done
Creating myapp_db_1 ... done
Waiting for myapp_db_1 to be healthy...   ← Notice this line
Creating myapp_web_1 ... done
db_1  | PostgreSQL init process complete; ready for start up.
db_1  | database system is ready to accept connections
web_1 | Server listening on port 3000     ← App starts only after the database is ready
```

You'll see a noticeable pause: Docker is waiting for the db container to become healthy. This can take 30-60 seconds, but the result is zero failed startups.
Troubleshooting: Container Always unhealthy
If the database container keeps showing unhealthy, check the healthcheck logs:
# View container health status
$ docker inspect --format='{{json .State.Health}}' myapp_db_1 | jq
{
"Status": "unhealthy",
"FailingStreak": 5,
"Log": [
{
"Start": "2024-12-17T03:15:30Z",
"End": "2024-12-17T03:15:30Z",
"ExitCode": 1,
"Output": "pg_isready: could not connect to server: Connection refused"
}
]
}Common causes:
- start_period too short: Database still initializing when failure counting starts
- Wrong username or database name: pg_isready can’t connect
- PostgreSQL startup failed: Check the container logs with docker logs myapp_db_1
MySQL Healthcheck Practical Configuration
MySQL healthcheck uses mysqladmin ping, MySQL’s built-in management tool.
Basic Configuration (Recommended)
```yaml
version: '3.8'
services:
  web:
    image: node:20-alpine
    depends_on:
      db:
        condition: service_healthy
        restart: true
    environment:
      DATABASE_URL: mysql://root:password@db:3306/myapp
    command: npm start

  db:
    image: mysql:8.0
    environment:
      MYSQL_ROOT_PASSWORD: password
      MYSQL_DATABASE: myapp
    healthcheck:
      test: ["CMD", "mysqladmin", "ping", "-h", "localhost", "-u", "root", "-ppassword"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 60s
    volumes:
      - mysql_data:/var/lib/mysql

volumes:
  mysql_data:
```

mysqladmin ping Command Explained
```shell
mysqladmin ping -h localhost -u root -ppassword
```

- -h: Host address (use localhost inside the container)
- -u: Username
- -p: Password (note there is no space between -p and the password)

If MySQL is running normally, this command returns:

```
mysqld is alive
```

The exit code is 0, so the healthcheck passes.
Correct Password Handling
Method 1: Write Password Directly (Suitable for Development)
```yaml
healthcheck:
  test: ["CMD", "mysqladmin", "ping", "-h", "localhost", "-u", "root", "-ppassword"]
```

Simple and direct, but the password is hardcoded in the configuration.
Method 2: Use Environment Variable (Recommended)
```yaml
db:
  environment:
    MYSQL_ROOT_PASSWORD: password
  healthcheck:
    test: ["CMD-SHELL", "mysqladmin ping -h localhost -u root -p$$MYSQL_ROOT_PASSWORD"]
    # Note: use $$, not $
```

There's a pitfall here: you must use $$, not $.
Why? Because Docker Compose parses environment variables before startup. If you use $MYSQL_ROOT_PASSWORD, Compose looks for this variable on your host machine, not inside the container. Using $$ tells Compose “leave it alone, let the container shell parse it.”
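The two-level expansion is easy to reproduce in a plain shell, with the outer shell standing in for Compose and an inner sh for the container shell (an analogy, not Compose itself):

```shell
#!/bin/sh
# Outer shell ≈ Compose interpolating on the host; inner sh ≈ the container shell.
MYSQL_ROOT_PASSWORD=host_value   # whatever happens to be set on the "host"

# Double quotes: expanded immediately, so the host value gets baked in
unescaped="mysqladmin ping -p$MYSQL_ROOT_PASSWORD"
# Single quotes: expansion deferred, like $$ in Compose
escaped='mysqladmin ping -p$MYSQL_ROOT_PASSWORD'

echo "$unescaped"                # → mysqladmin ping -phost_value
MYSQL_ROOT_PASSWORD=container_value sh -c "echo \"$escaped\""
                                 # → mysqladmin ping -pcontainer_value
```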
Method 3: Passwordless Check (Simplest, but Controversial)
```yaml
healthcheck:
  test: ["CMD", "mysqladmin", "ping", "-h", "localhost"]
```

Some MySQL configurations allow local passwordless connections, in which case this is the simplest option. Not recommended for production, though.
MySQL 8.0 Special Considerations
MySQL 8.0 defaults to the caching_sha2_password authentication plugin, which can prevent some older clients from connecting. If your application reports authentication errors, you can force the old authentication method:
```yaml
db:
  image: mysql:8.0
  command: --default-authentication-plugin=mysql_native_password
  environment:
    MYSQL_ROOT_PASSWORD: password
    MYSQL_DATABASE: myapp
  healthcheck:
    test: ["CMD", "mysqladmin", "ping", "-h", "localhost", "-u", "root", "-ppassword"]
    interval: 10s
    timeout: 5s
    retries: 3
    start_period: 60s
```

Common Issues
Issue 1: Access denied for user ‘root’@‘localhost’
Password is wrong, or environment variable didn’t take effect. Check:
- Is MYSQL_ROOT_PASSWORD spelled correctly?
- Does the password in the healthcheck match?
- Did you use the $$ escape?
Issue 2: Container Starts Slowly, Stays in starting State
MySQL needs time to initialize data directory, especially on first startup. Ensure start_period is set large enough—60 seconds usually works. If you have init scripts importing lots of data, might need 120 seconds or more.
Issue 3: Healthcheck Passes but App Can’t Connect to Database
Could be network issue or application configuration problem. Check:
- Is the application connection string correct? (Use db as the hostname, not localhost)
- Is the Docker network configuration normal?
- Use docker network inspect to check whether the containers are on the same network
Healthchecks for Other Databases and Services
Having mastered PostgreSQL and MySQL, other services follow naturally. Here’s a quick reference.
Redis
```yaml
redis:
  image: redis:7-alpine
  healthcheck:
    test: ["CMD", "redis-cli", "ping"]
    interval: 10s
    timeout: 3s
    retries: 3
    start_period: 15s
```

redis-cli ping returns PONG with exit code 0. Redis starts quickly, so start_period can be shorter.
MongoDB
```yaml
mongo:
  image: mongo:7
  healthcheck:
    test: ["CMD", "mongosh", "--eval", "db.adminCommand('ping')"]
    interval: 10s
    timeout: 5s
    retries: 3
    start_period: 30s
```

Note: MongoDB 6.0+ replaced the old mongo shell with mongosh. For older versions:

```yaml
test: ["CMD", "mongo", "--eval", "db.adminCommand('ping')"]
```

RabbitMQ
```yaml
rabbitmq:
  image: rabbitmq:3-management-alpine
  healthcheck:
    test: ["CMD", "rabbitmq-diagnostics", "ping"]
    interval: 10s
    timeout: 5s
    retries: 3
    start_period: 40s
```

RabbitMQ starts relatively slowly; a start_period of 40+ seconds is recommended.
Generic HTTP Services
If a service provides HTTP healthcheck endpoints (like /health or /ping), use wget or curl:
```yaml
api:
  image: myapp:latest
  healthcheck:
    test: ["CMD", "wget", "--spider", "--quiet", "http://localhost:8080/health"]
    # Or use curl (if available in the image):
    # test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
    interval: 10s
    timeout: 3s
    retries: 3
    start_period: 20s
```

--spider makes wget check the URL without downloading it; --quiet suppresses log output.
Note: Ensure the image actually has wget or curl. Alpine images include a BusyBox wget by default, but slim Debian/Ubuntu base images often ship neither, so you may need to install one yourself.
Services Without Dedicated Tools
If a service lacks healthcheck tools, use netcat (nc) to check ports:
```yaml
service:
  image: some-service:latest
  healthcheck:
    test: ["CMD-SHELL", "nc -z localhost 9000 || exit 1"]
    interval: 10s
    timeout: 3s
    retries: 3
    start_period: 30s
```

nc -z connects to the port and immediately closes without sending data. This confirms the port is listening, but not that the service is truly ready, so it's cruder than the methods above.
wait-for-it Scripts: Still Needed?
If you search for Docker startup order, you’ll probably see many articles recommending wait-for-it or wait-for scripts.
These scripts work by adding wait logic to the application container’s entrypoint, checking if dependent service TCP ports are reachable.
Traditional wait-for-it Approach
```yaml
web:
  image: node:20-alpine
  depends_on:
    - db  # Plain depends_on; doesn't check health status
  volumes:
    - ./wait-for-it.sh:/wait-for-it.sh  # Mount the script
  command: ["/wait-for-it.sh", "db:5432", "--", "npm", "start"]
```

wait-for-it.sh polls the db:5432 port in a loop until it's connectable, then executes npm start.
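Under the hood, such scripts are just a polling loop. A simplified sketch of the idea (not the real wait-for-it.sh; the probe argument is a placeholder for a real reachability check such as "nc -z db 5432"):

```shell
#!/bin/sh
# Simplified wait-for-it: run a probe until it succeeds or a timeout expires,
# then hand off to the real command.
wait_for() {
  probe=$1
  timeout=$2
  shift 2
  elapsed=0
  until sh -c "$probe" 2>/dev/null; do
    elapsed=$((elapsed + 1))
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "timed out waiting for dependency" >&2
      return 1
    fi
    sleep 1
  done
  "$@"   # e.g. npm start
}

# Demo with a probe that succeeds immediately:
wait_for true 5 echo "dependency ready, starting app"
# → dependency ready, starting app
```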
Why Not Recommended Anymore?
Best practice in 2024: Use native healthcheck instead of scripts whenever possible.
Reasons:
- Clearer configuration: Health check logic is in the database service, dependency relationships are immediately apparent
- No extra files: No need to maintain scripts or mount volumes
- More powerful: Healthcheck can check actual service readiness, TCP port checking is too crude
- Better reusability: Configure healthcheck once, all services depending on it automatically benefit
There’s a popular article in the community literally titled “Forget wait-for-it, use docker-compose healthcheck and depends_on instead.”
When Do You Still Need wait-for-it?
Two special cases:
Case 1: Can’t Modify Image or Compose Configuration
For example, using third-party images without healthcheck and no permission to add it. Adding wait-for-it on the application side is the only choice.
Case 2: Need to Wait for Multiple Services
```shell
./wait-for-it.sh db:5432 redis:6379 rabbitmq:5672 -- npm start
```

While depends_on can also list multiple services, wait-for-it is more concise here. This scenario is rare, though.
Other Alternative Tools
Besides wait-for-it, there are several similar tools:
- dockerize: Written in Go, more feature-rich, supports environment variable templates
- wait-for: Simplified version of wait-for-it, pure shell implementation
- docker-compose-wait: Written in Python, supports HTTP checking
But honestly, in 2024, just use healthcheck and don’t bother with these.
Troubleshooting Checklist and Best Practices
Still not working after configuration? Follow this checklist—solves 90% of problems.
Quick Diagnostic Commands
```
# 1. View all container statuses
$ docker-compose ps
NAME        COMMAND     SERVICE  STATUS   PORTS
myapp_db    postgres    db       healthy  5432/tcp
myapp_web   npm start   web      running  0.0.0.0:3000->3000/tcp

# 2. View healthcheck details
$ docker inspect --format='{{json .State.Health}}' myapp_db_1 | jq

# 3. View container logs
$ docker-compose logs db
$ docker-compose logs web

# 4. Follow logs in real time
$ docker-compose logs -f --tail=100
```

Common Issues Decision Tree
Issue: Container Always Shows starting, Never Becomes healthy
- Is start_period too short? → Try changing it to 60s
- Is the healthcheck command correct? → Use docker inspect to see the actual command
- Enter the container and run the healthcheck command manually → docker exec -it myapp_db_1 pg_isready -U postgres
Issue: Container Becomes unhealthy Then Recovers healthy, Flip-Flopping
- interval too short, insufficient resources → Change to 10s or 15s
- retries too few, occasional failures trigger → Change to 5
- Database truly has performance issues → Check database logs
Issue: Healthcheck Passes but App Still Can’t Connect to Database
- Is the application connection string correct? → Use the service name (like db) as the hostname, not localhost
- Is the port mapping correct? → Inter-container communication uses the internal port (5432), not the mapped port
- Is the network configuration correct? → Confirm all services are on the same network
Production Environment Best Practices
1. Recommended Parameter Values (Conservative Configuration)
```yaml
healthcheck:
  interval: 10s      # Balance response speed and resource consumption
  timeout: 5s        # Give the command enough time to run
  retries: 5         # Tolerate occasional failures
  start_period: 60s  # Give the database sufficient startup time
```

This set of parameters is stable in most scenarios. If your database has extensive initialization scripts, start_period can be raised to 120s.
2. Use Restart Strategy
```yaml
web:
  depends_on:
    db:
      condition: service_healthy
      restart: true          # Restart the app when the database restarts
  restart: unless-stopped    # Auto-restart after the container exits
```

restart: true inside depends_on (available in newer Compose releases) ensures that when the database is upgraded or restarted, dependent services also restart and reconnect.
3. Resource Limits
Healthchecks consume resources, though very little. If system resources are tight:
```yaml
healthcheck:
  interval: 30s  # Lengthen the check interval
  timeout: 3s    # Shorten the timeout
```

Honestly though, healthcheck overhead is usually negligible unless you're running hundreds of containers.
4. Monitor Health Status
For production, use monitoring tools to track healthcheck status. Docker healthcheck events can be collected by Prometheus, Grafana, and other tools.
```yaml
# docker-compose.yml
services:
  db:
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 60s
    labels:
      - "prometheus.io/scrape=true"  # Let Prometheus collect health status
```

5. Multi-Environment Configuration
Development and production configurations can differ:
```yaml
# docker-compose.yml (development)
db:
  healthcheck:
    start_period: 30s  # Less data in development; starts faster
```

```yaml
# docker-compose.prod.yml (production)
db:
  healthcheck:
    start_period: 120s  # More data in production; starts slower
    interval: 5s        # More frequent checks
```

Specify both configuration files when starting:

```shell
docker-compose -f docker-compose.yml -f docker-compose.prod.yml up
```

Debugging Tips
Tip 1: Manually Test Healthcheck Command
Enter container and manually run healthcheck command to see what’s wrong:
```
$ docker exec -it myapp_db_1 sh
/ # pg_isready -U postgres -d myapp
/var/run/postgresql:5432 - accepting connections
/ # echo $?
0    ← 0 means success
```

Tip 2: Temporarily Disable Healthcheck
When debugging, comment out healthcheck and use regular depends_on to rule out healthcheck itself:
```yaml
web:
  depends_on:
    - db  # Temporarily use the simple mode
  # db:
  #   condition: service_healthy
```

After confirming the app can connect to the database normally, add the healthcheck back.
Tip 3: View Docker Event Logs
Docker records all container events, including healthcheck status changes:
```
$ docker events --filter 'event=health_status'
2024-12-17T03:15:30.123456789Z container health_status: healthy (name=myapp_db_1)
2024-12-17T03:16:45.987654321Z container health_status: unhealthy (name=myapp_db_1)
```

This helps identify when containers became unhealthy; combine it with log timestamps to pinpoint issues.
Conclusion
Back to the article’s opening question: Why doesn’t depends_on work?
The answer, in two words: not enough.
depends_on by default only manages container startup, not service readiness. Database container running doesn’t mean it can accept connections—this gap is the root cause of our troubles.
The solution is also straightforward:
- Configure healthcheck for database, use pg_isready or mysqladmin ping to check actual readiness
- Use service_healthy condition, make application wait for healthcheck to pass
- Set reasonable start_period (60 seconds minimum), give database enough initialization time
These three steps essentially eliminate container startup failures.
Finally, some quick action suggestions:
- Do right now: Copy PostgreSQL or MySQL configuration from this article to your project, modify environment variables and you’re set
- Tonight: Upgrade all depends_on in team projects to service_healthy, once and for all
- Next week’s team meeting: Share with colleagues, standardize team configuration standards
If you encounter issues not covered in this article, feel free to leave comments. Docker Compose has many pitfalls—let’s fill them together.
Finally, if this article helped solve your long-standing startup issues, give it a like so I know. That “finally got it working” moment is what motivates me to write.
11 min read · Published on: Dec 17, 2025 · Modified on: Dec 26, 2025