
Nginx Reverse Proxy Complete Guide: Upstream, Buffering, and Timeout

3 AM. Phone vibrating like crazy—production alert.

Checking the logs: all 502 Bad Gateway errors. Backend service didn’t crash, but Nginx timeout was set too short. Traffic spike hit, and requests got cut off before they finished processing. I stared at that proxy_read_timeout 60s line and thought: I literally pulled that number out of thin air.

That incident sent me down a week-long rabbit hole understanding three core Nginx reverse proxy modules: upstream load balancing, proxy buffer configuration, and timeout settings. Honestly, when these three are configured right, your reverse proxy can handle 10x traffic. When they’re wrong, you get 3 AM alerts like I did.

This article captures all the pitfalls I hit, debugging lessons learned, and the principles I finally grokked. If you’re doing backend, DevOps, or just want to understand what those Nginx config parameters actually mean, this should save you some time.


Upstream Load Balancing: More Than “Distributing Requests”

Let’s Start with Basic Syntax

The upstream block is Nginx load balancing’s core. You’ve probably seen this:

upstream backend {
    server 192.168.1.10:8080;
    server 192.168.1.11:8080;
    server 192.168.1.12:8080;
}

server {
    location / {
        proxy_pass http://backend;
    }
}

Looks simple—define backend servers, proxy_pass to them. But honestly, that’s not enough for production. Real environments need more: what if a server crashes? Can beefier machines get more traffic? Should we keep connections alive?

Four Load Balancing Algorithms, Each for Different Scenarios

Nginx defaults to round-robin—distribute in order. Fair, but not smart.

If your backend handles long connections—WebSocket, database connection pools—round-robin might suddenly overload certain servers. Least connections (least_conn) is better here:

upstream backend {
    least_conn;
    server 192.168.1.10:8080;
    server 192.168.1.11:8080;
}

It tracks active connections per server, sending new requests to the least busy one. I had a WebSocket project pushing real-time messages—round-robin caused one server’s memory to explode. Switching to least_conn balanced the load properly.

Another scenario you might know: user logs in, subsequent requests must hit the same server (session stored locally). IP Hash handles this:

upstream backend {
    ip_hash;
    server 192.168.1.10:8080;
    server 192.168.1.11:8080;
}

Same client IP gets hashed to a fixed backend. But honestly, this has flaws—if that server dies, the session is gone. Better approach is Redis for sessions, using ip_hash as a temporary fix.

Fourth is consistent hashing (hash), common for distributed caching:

upstream backend {
    hash $request_uri consistent;
    server 192.168.1.10:8080;
    server 192.168.1.11:8080;
}

Nginx creates 160 virtual nodes per weight unit, hashing request URIs to specific servers. Benefit: high cache hit rate—same URI always hits the same machine.

Weight Configuration: When Machines Have Different Specs

Backend servers with different specs is common. Some have 32GB RAM, 8 CPU cores; others 16GB, 4 cores. Fair round-robin? That wastes the beefier machines.

upstream backend {
    server 192.168.1.10:8080 weight=3;
    server 192.168.1.11:8080 weight=2;
    server 192.168.1.12:8080 weight=1;
}

weight=3 gets triple the requests. Better machines do more work, weaker ones do less—that makes sense.

There’s also backup, standby server:

upstream backend {
    server 192.168.1.10:8080;
    server 192.168.1.11:8080;
    server 192.168.1.12:8080 backup;
}

backup doesn’t participate normally, only activates when the primaries fail. Like a bench player—only plays when starters are out.

Keepalive Connection Pool: The Secret to Doubling Performance

This one gets overlooked. Nginx default behavior: create a new TCP connection to backend per request, close after response. Sounds fine? It’s not.

TCP connection needs three-way handshake to establish, four-way handshake to close. High concurrency, this overhead is brutal. Keepalive connection pool reuses connections, eliminating this cost.

Example config:

upstream backend {
    server 192.168.1.10:8080;
    keepalive 32;  # Each worker keeps 32 idle connections
}

server {
    location / {
        proxy_pass http://backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}

Two things to note:

  1. keepalive 32 sets max idle connections per worker process
  2. Must set proxy_http_version 1.1 and clear the Connection header—Nginx defaults to HTTP/1.0 toward upstreams, which closes the connection after every request

I tested an API service—without keepalive, QPS around 2000. With it, 4000+. Doubling isn’t hype, it’s real.

2x
QPS Performance Boost
Source: measured data, after enabling the keepalive connection pool

But don’t set keepalive too high. I once set it to 100 in test environment with only 1 ECS container backend—it got crushed by connection count. Production formula:

keepalive ≈ Total QPS × Avg response time (in seconds) ÷ Worker process count

Say you expect QPS 10000, avg response 50ms, 4 workers:

10000 × 0.05 ÷ 4 = 125

keepalive around 125 makes sense.
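The arithmetic is easy to script when capacity planning. A quick shell sketch with the numbers above (done in milliseconds to stay in integer math; the values are just the example figures, not recommendations):

```shell
# keepalive ≈ QPS × avg response time ÷ workers
qps=10000
avg_ms=50
workers=4
echo $(( qps * avg_ms / 1000 / workers ))   # prints 125
```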

Health Checks: Auto-Remove Dead Servers

Nginx open source only has passive health checks—marks unhealthy after request fails:

upstream backend {
    server 192.168.1.10:8080;
    server 192.168.1.11:8080;
}

server {
    location / {
        proxy_pass http://backend;
        proxy_next_upstream error timeout http_502 http_503 http_504;
        proxy_next_upstream_tries 3;
    }
}

proxy_next_upstream defines when to retry next server: connection error, timeout, or 502/503/504. proxy_next_upstream_tries 3 means max 3 attempts.
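The passive marking itself is tuned per server with max_fails and fail_timeout (the defaults are 1 and 10s). A sketch with the thresholds made explicit; the exact numbers are illustrative:

```nginx
upstream backend {
    # after 3 failures within 30s, skip this server for the next 30s
    server 192.168.1.10:8080 max_fails=3 fail_timeout=30s;
    server 192.168.1.11:8080 max_fails=3 fail_timeout=30s;
}
```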

But passive checks have delay—you only discover a dead server after a request fails. If availability matters, NGINX Plus active health checks are better:

upstream backend {
    server 192.168.1.10:8080;
    server 192.168.1.11:8080;
}

server {
    location / {
        proxy_pass http://backend;
        health_check interval=5s fails=3 passes=2;
    }
}

Every 5 seconds, actively send health check requests. 3 consecutive failures marks unhealthy, 2 consecutive successes restores it.


Proxy Buffer: Helper or Troublemaker

What Buffering Actually Does

Concept: Nginx doesn’t send backend response directly to client—it buffers first.

Why? Client network speed is unpredictable. Backend might output data fast, but if client is slow, Nginx gets stuck waiting. With buffer, Nginx stores response once, then slowly sends to client—backend doesn’t wait, can handle next request sooner.

But buffering costs: memory consumption. Large responses, high concurrency—memory usage gets real.

Three Core Parameters, Understanding Their Relationships

proxy_buffer_size 4k;
proxy_buffers 8 32k;
proxy_busy_buffers_size 64k;

These three confused me at first—similar names, tangled meanings. Had to draw a diagram:

  • proxy_buffer_size: buffer for the response headers, one per request
  • proxy_buffers: buffer array for the response body, format is count size_each
  • proxy_busy_buffers_size: cap on buffers busy sending to the client while the rest of the response is still being read from the backend; it must be at least one buffer in size, and smaller than the total buffer space minus one buffer

Example: proxy_buffers 8 32k gives 8 × 32k = 256k total. proxy_busy_buffers_size 64k is a quarter of that, comfortably inside the limit.

When to adjust these?

If backend response headers are huge (lots of cookies), you might see “upstream sent too big header”. Fix: increase proxy_buffer_size:

proxy_buffer_size 16k;

If response bodies are often large (big JSON payloads), increase buffers:

proxy_buffers 16 64k;
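Worth knowing when sizing these: a response that doesn't fit in the memory buffers gets spilled to a temp file on disk. Two directives control that spillover; the values below are illustrative starting points, not tuned numbers:

```nginx
# cap how much of one response may be written to a temp file
# (0 disables disk buffering; Nginx then reads the backend at client speed)
proxy_max_temp_file_size 512m;

# how much is written to the temp file per write operation
proxy_temp_file_write_size 64k;
```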

Special Cases: Disable Buffering

Sometimes buffering causes problems.

Server-Sent Events (SSE): backend continuously pushes event stream. If Nginx buffers, client gets delayed messages. Config needs buffering off:

location /events {
    proxy_pass http://backend;
    proxy_buffering off;
    proxy_cache off;
    proxy_read_timeout 86400s;
}

proxy_read_timeout 86400s (a day) because SSE is long-lived, can’t timeout.

WebSocket: Similar, bidirectional real-time:

location /ws {
    proxy_pass http://backend;
    proxy_buffering off;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_read_timeout 86400s;
}

Large file upload: Client uploads 1GB. By default Nginx buffers the entire request body (in memory first, spilling to temp files on disk) before forwarding, which burns disk I/O and delays the backend. Disable request buffering:

location /upload {
    proxy_pass http://backend;
    proxy_request_buffering off;
    client_max_body_size 1G;
}

proxy_request_buffering off makes Nginx stream directly—receive and forward simultaneously.
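One caveat worth adding: with request buffering off, Nginx forwards the body as it arrives, and a chunked upload then requires HTTP/1.1 toward the backend. A sketch combining the two:

```nginx
location /upload {
    proxy_pass http://backend;
    proxy_request_buffering off;
    proxy_http_version 1.1;   # chunked transfer encoding needs HTTP/1.1 upstream
    client_max_body_size 1G;
}
```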


Timeout: The Logic Behind Config Values

Three Timeout Parameters, Each with Its Own Job

proxy_connect_timeout 10s;
proxy_read_timeout 60s;
proxy_send_timeout 60s;

Names look similar? They have distinct roles:

  • proxy_connect_timeout: Time Nginx waits to establish TCP connection. If backend is slow (network congestion, firewall), exceeding this aborts.
  • proxy_read_timeout: After connection, time Nginx waits for backend data. Interval between two read operations exceeding this is timeout.
  • proxy_send_timeout: Time limit for Nginx sending request body to backend.

Common confusion: proxy_read_timeout isn’t total timeout—it’s interval between reads. If backend takes 5 minutes but sends heartbeats during processing, proxy_read_timeout 60s works. If backend is silent for 5 minutes, need proxy_read_timeout 300s.

Timeout vs 502/504 Relationship

The 3 AM alert taught me one crucial lesson:

  • 502 Bad Gateway: Nginx couldn’t connect to backend—service down, port unreachable, firewall blocking
  • 504 Gateway Timeout: Nginx connected, but backend took too long to respond

Example: proxy_connect_timeout 10s, if backend takes 15s to accept connection, Nginx returns 502. But if connection establishes fast, backend processes for 2 minutes before responding, with proxy_read_timeout 60s, you get 504.

Timeout Strategies for Different Scenarios

API services: 30-60 seconds usually enough. APIs should respond fast—short timeout catches slow requests:

proxy_connect_timeout 5s;
proxy_read_timeout 30s;
proxy_send_timeout 30s;

File processing: Export reports, generate PDFs might take minutes. Relax timeout:

proxy_connect_timeout 10s;
proxy_read_timeout 300s;
proxy_send_timeout 300s;

Streaming services: Video, WebSocket, SSE—long connections, a day is normal:

proxy_read_timeout 86400s;

502/504 Troubleshooting in Practice

Root Cause Analysis

Cases I’ve encountered:

  1. Backend actually crashed: process died, port occupied, OOM
  2. Connection exhaustion: backend connection pool full, Nginx can’t connect
  3. Timeout too short: like my 3 AM incident—proxy_read_timeout 60s, backend needed 2 minutes
  4. Firewall/network issues: security group rules missing, iptables blocking

Log Diagnosis Method

First step: always check error_log:

error_log /var/log/nginx/error.log warn;

Common errors:

upstream timed out (110: Connection timed out) while reading response header from upstream

That’s 504, read timeout.

connect() failed (111: Connection refused) while connecting to upstream

That’s 502, connection refused—backend not listening.

Advanced technique: custom log format showing upstream status:

log_format upstream_status '$status $upstream_status $upstream_response_time';

access_log /var/log/nginx/access.log upstream_status;

You’ll see entries like 200 502, 502, 200 1.2, 0.8, 3.0: the client got a 200, but the first two backend attempts returned 502 before the third succeeded, and the response times line up one-to-one. Comma-separated values in $upstream_status mean Nginx retried across servers.
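To also see which backend served each attempt, $upstream_addr can be added; it is comma-separated the same way. A sketch (the format name upstream_debug is my own):

```nginx
log_format upstream_debug '$remote_addr "$request" $status '
                          'upstream=$upstream_addr '
                          'us=$upstream_status ut=$upstream_response_time';

access_log /var/log/nginx/access.log upstream_debug;
```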

Typical Solutions

Scenario 1: Backend slow, frequent 504

Fix: increase proxy_read_timeout, verify backend can actually finish. Don’t just tune Nginx—backend timeout needs to match.

Scenario 2: Connection refused, 502

Fix: check if backend process runs, port listens, firewall allows.

netstat -tlnp | grep 8080
ps aux | grep your_app

Scenario 3: High concurrency, connection exhaustion

Fix: increase backend connection pool limit, or enable Nginx upstream keepalive to reduce connection creation overhead.


Performance Optimization Best Practices

Worker Configuration

Nginx is multi-process. worker_processes sets count, typically equals CPU cores:

worker_processes auto;

auto detects CPU cores. 8-core machine = 8 worker processes.

worker_connections is max connections per worker:

events {
    worker_connections 4096;
}

Theoretical max concurrent connections = worker_processes × worker_connections. 8 cores × 4096 = 32768. But actual value depends on system file descriptor limits.

TCP Optimization Trio

sendfile on;
tcp_nopush on;
tcp_nodelay on;

These three combined significantly boost performance:

  • sendfile on: kernel-level file transfer, bypass user-space buffers
  • tcp_nopush on: with sendfile, batch send packets instead of one-by-one
  • tcp_nodelay on: small packets sent immediately, don’t wait for buffer fill

I tested static file serving—enabling these three boosted throughput 30%+.

30%+
Throughput Boost
Source: measured data, after enabling sendfile + tcp_nopush + tcp_nodelay

Other Optimizations

gzip compression: compress text responses, saves bandwidth:

gzip on;
gzip_types text/plain text/css application/json application/javascript;
gzip_min_length 1024;
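Two more gzip directives worth considering alongside these; the compression level here is a middle-of-the-road guess, not a benchmarked value:

```nginx
gzip_comp_level 5;  # 1-9: higher compresses smaller but costs more CPU
gzip_vary on;       # adds "Vary: Accept-Encoding" so caches keep both variants
```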

File descriptor limits: high concurrency might exhaust. Check system limit:

ulimit -n

If only 1024, increase it. Edit /etc/security/limits.conf:

* soft nofile 65535
* hard nofile 65535
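Nginx can also raise its own per-worker descriptor limit directly, without relying on the shell's ulimit, via worker_rlimit_nofile in the main context:

```nginx
# top-level (main) context, alongside worker_processes
worker_rlimit_nofile 65535;
```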

Complete Configuration Example

Production-ready config template:

# Basic config
worker_processes auto;

events {
    worker_connections 4096;
    multi_accept on;
}

http {
    # TCP optimization
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;

    # Keepalive
    keepalive_timeout 30;
    keepalive_requests 100;

    # Buffer config
    proxy_buffering on;
    proxy_buffer_size 4k;
    proxy_buffers 8 32k;
    proxy_busy_buffers_size 64k;

    # Timeout config
    proxy_connect_timeout 10s;
    proxy_read_timeout 60s;
    proxy_send_timeout 60s;

    # gzip
    gzip on;
    gzip_types text/plain text/css application/json;

    upstream backend {
        least_conn;
        server 192.168.1.10:8080 weight=3;
        server 192.168.1.11:8080 weight=2;
        server 192.168.1.12:8080 backup;
        keepalive 32;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://backend;
            proxy_http_version 1.1;
            proxy_set_header Connection "";
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;

            proxy_next_upstream error timeout http_502 http_503 http_504;
            proxy_next_upstream_tries 3;
        }

        # SSE dedicated config
        location /events {
            proxy_pass http://backend;
            proxy_buffering off;
            proxy_read_timeout 86400s;
        }
    }
}

Summary

After all this, three key points:

  1. Upstream config: pick right load balancing algorithm, enable keepalive pool, configure health checks
  2. Buffer config: understand three parameters’ relationships, disable for special cases
  3. Timeout config: understand what each parameter controls, use different strategies per scenario

That 3 AM incident taught me: Nginx config isn’t just filling in parameters. Each has design logic behind it—understanding principles prevents pitfalls.

If you’re new to Nginx, start with defaults, adjust when issues arise—don’t randomly write proxy_read_timeout 60s for production like I did. If you’ve already hit pitfalls, this article should help organize scattered experience into a system.

Finishing this article, I checked my current production config—keepalive 32, proxy_read_timeout 120s, least_conn load balancing. No more 3 AM alerts.


FAQ

Is proxy_read_timeout total timeout or interval between reads?
Interval between two read operations. If backend continuously sends data during processing (like heartbeats), even 5-minute total time works with proxy_read_timeout 60s. But if backend is silent for 5 minutes, must set 300s.
When should proxy_buffering be disabled?
Three scenarios must disable:

• Server-Sent Events (SSE): real-time push, buffering causes message delay
• WebSocket: bidirectional real-time, needs streaming
• Large file upload: avoid memory explosion, receive and forward simultaneously
What value should keepalive be set to?
Formula: keepalive ≈ Total QPS × Avg response time ÷ Worker count. Example: QPS 10000, response 50ms, 4 workers—keepalive around 125. Don't set too high—I once set 100 and crushed the backend.
What's the difference between 502 and 504?
502 Bad Gateway means Nginx can't connect to backend (service down, port blocked, firewall). 504 Gateway Timeout means connected but response timed out (backend slow). Diagnosis differs: 502 check process and port, 504 check timeout config and backend processing time.
Which load balancing algorithm to choose?
By scenario:

• Round-robin (default): stateless services, fair distribution
• least_conn: long connection scenarios (WebSocket, DB pools)
• ip_hash: session persistence needed (temporary fix, Redis is better)
• hash: distributed caching, improves hit rate
How to fix upstream sent too big header?
Increase proxy_buffer_size. Backend response headers too large (lots of cookies) exceed default 4k buffer. Changing to proxy_buffer_size 16k usually resolves it.
Why do sendfile + tcp_nopush + tcp_nodelay boost performance?
sendfile bypasses user-space for direct kernel transfer, tcp_nopush batches packets to reduce count, tcp_nodelay sends small data immediately without waiting. Combined, static file throughput measured 30%+ boost.

9 min read · Published on: Mar 30, 2026 · Modified on: Mar 30, 2026
