HTTP 429 Too Many Requests

You have sent too many requests in a given time period and the server is rate limiting your client. The server may include a Retry-After header telling you how many seconds to wait before trying again.

What Is HTTP 429 Too Many Requests?

HTTP 429 Too Many Requests is a client error status code defined in RFC 6585 that indicates the user has sent too many requests in a given amount of time. It is the standard signal for rate limiting, a mechanism that protects servers from being overwhelmed by excessive traffic, whether from legitimate clients making too many API calls or from malicious actors attempting denial-of-service attacks. The response should include a Retry-After header indicating how long the client should wait before making another request.

Rate limiting is a fundamental component of modern API infrastructure. Nearly every public API implements rate limits to ensure fair resource distribution among all clients. Common rate limit strategies include fixed-window counters (100 requests per minute), sliding-window counters (more precise but computationally expensive), token bucket algorithms (allow bursts up to a limit), and leaky bucket algorithms (enforce a steady request rate). Understanding these strategies helps developers design clients that work efficiently within rate limits.
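To make one of these strategies concrete, here is a minimal token bucket sketch in Python (the class and parameter names are illustrative, not taken from any particular library):

```python
import time

class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""
    def __init__(self, rate, capacity):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A client wrapped around `TokenBucket(rate=5, capacity=10)` could burst ten requests immediately, then settle to five per second, which is exactly the "allow bursts up to a limit" behavior described above.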

When building rate-limited APIs, it is best practice to include rate limit metadata in response headers. The X-RateLimit-Limit header shows the maximum number of requests allowed, X-RateLimit-Remaining shows how many requests are left in the current window, and X-RateLimit-Reset indicates when the window resets. These headers are a widely used convention rather than part of any RFC, so their exact semantics vary between providers, but they allow clients to proactively throttle their requests before hitting the limit, avoiding 429 errors entirely.
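A client can use these headers to throttle itself before the server ever returns 429. A small sketch, assuming X-RateLimit-Reset carries an epoch timestamp (some APIs send a delta in seconds instead, so check your provider's documentation):

```python
def throttle_delay(headers, now):
    """Return seconds to pause before the next request, based on
    X-RateLimit-* response headers (0 means proceed immediately)."""
    remaining = int(headers.get('X-RateLimit-Remaining', 1))
    reset = float(headers.get('X-RateLimit-Reset', now))  # assumed epoch seconds
    if remaining > 0:
        return 0.0
    # Quota exhausted: wait until the window resets
    return max(0.0, reset - now)
```

Calling this after every response and sleeping for the returned duration keeps the client inside its quota without ever triggering a 429.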

Common Causes

API Rate Limit Exceeded

Your application has exceeded the API provider's rate limit, which caps how many requests you can make per minute, hour, or day. Common limits range from 60 to 10,000 requests per minute depending on the plan.

Aggressive Polling

Your client is polling an endpoint too frequently, such as checking for updates every second when the API expects polling no more than once per minute.

Burst Traffic from Loops

A loop or batch operation in your code sends many requests simultaneously without any delay between them. The server detects the burst and starts rejecting requests.

Shared Rate Limit

Multiple applications or users share the same API key or IP address, and their combined traffic exceeds the rate limit, even though each individual client is within its own budget.

How to Fix

Respect Retry-After Header

Check the Retry-After response header for the number of seconds or a date when you can retry. Wait at least that long before sending another request to avoid being rate limited again.
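Because Retry-After may be either a delay in seconds or an HTTP-date, a robust client handles both forms. A sketch using only the standard library (the function name is illustrative):

```python
from email.utils import parsedate_to_datetime

def retry_after_seconds(value, now):
    """Convert a Retry-After header value into seconds to wait.
    `value` is either an integer number of seconds or an HTTP-date;
    `now` is a timezone-aware datetime used for the date form."""
    try:
        return max(0, int(value))
    except ValueError:
        # HTTP-date form, e.g. 'Mon, 01 Jan 2024 00:01:00 GMT'
        return max(0.0, (parsedate_to_datetime(value) - now).total_seconds())
```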

Implement Exponential Backoff

When retrying after a 429, use exponential backoff with jitter. Wait 1 second, then 2, then 4, then 8 seconds, adding random jitter to prevent all clients from retrying at the same time.
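The schedule above, using "full jitter" (a random delay drawn between zero and the exponential cap), can be sketched as:

```python
import random

def backoff_delay(attempt, base=1.0, cap=60.0):
    """Delay in seconds before retry number `attempt` (0-based).

    The upper bound doubles each attempt (1s, 2s, 4s, 8s, ...) up to
    `cap`; full jitter picks a random point below that bound so that
    many clients retrying at once do not stampede the server together.
    """
    return random.uniform(0, min(cap, base * (2 ** attempt)))
```

A retry loop would sleep for `backoff_delay(attempt)` after each 429, taking the larger of this value and any Retry-After the server provided.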

Add Request Throttling

Limit the rate at which your client sends requests. Use a token bucket or leaky bucket algorithm to spread requests evenly over time rather than sending them in bursts.
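A minimal client-side pacer in this spirit: it returns how long to sleep so that requests leave at a steady interval instead of in bursts (the class name is illustrative):

```python
class Pacer:
    """Space outgoing requests at least `interval` seconds apart."""
    def __init__(self, interval):
        self.interval = interval
        self.next_slot = 0.0  # earliest time the next request may be sent

    def wait_time(self, now):
        """Seconds to sleep before sending; also books the next slot."""
        delay = max(0.0, self.next_slot - now)
        self.next_slot = max(now, self.next_slot) + self.interval
        return delay
```

With `Pacer(interval=0.6)` a client never exceeds 100 requests per minute, no matter how quickly the surrounding loop produces work.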

Cache API Responses

Cache responses that do not change frequently to avoid making unnecessary API calls. Use the ETag and Cache-Control headers to determine when cached data is still valid.
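As a sketch of the ETag flow: cache each body with its ETag, send If-None-Match on the next request, and reuse the cached body when the server answers 304 Not Modified (the class name is illustrative, and a production client would also honor Cache-Control lifetimes):

```python
class ETagCache:
    """Tiny in-memory cache keyed by URL, storing the last ETag and body."""
    def __init__(self):
        self.entries = {}  # url -> (etag, body)

    def conditional_headers(self, url):
        """Headers to send so the server can answer 304 Not Modified."""
        entry = self.entries.get(url)
        return {'If-None-Match': entry[0]} if entry else {}

    def handle_response(self, url, status, etag, body):
        """Store fresh bodies; on 304, reuse the cached body."""
        if status == 304:
            return self.entries[url][1]
        if etag:
            self.entries[url] = (etag, body)
        return body
```

Every 304 answered from this cache is a response the server did not have to regenerate, and a request that does not count against a payload-heavy quota.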

Code Examples

Express.js

// Express.js — rate limiting middleware
const express = require('express');
const app = express();

// Simple in-memory rate limiter
const rateLimits = new Map();

function rateLimit({ windowMs = 60000, max = 100 } = {}) {
  return (req, res, next) => {
    const key = req.ip;
    const now = Date.now();
    const windowStart = now - windowMs;

    if (!rateLimits.has(key)) {
      rateLimits.set(key, []);
    }

    const timestamps = rateLimits.get(key).filter(t => t > windowStart);
    rateLimits.set(key, timestamps);

    if (timestamps.length >= max) {
      const retryAfter = Math.ceil((timestamps[0] + windowMs - now) / 1000);
      res.set('Retry-After', String(retryAfter));
      res.set('X-RateLimit-Limit', String(max));
      res.set('X-RateLimit-Remaining', '0');
      return res.status(429).json({
        error: 'Too Many Requests',
        message: `Rate limit exceeded. Try again in ${retryAfter} seconds`,
        retryAfter
      });
    }

    timestamps.push(now);
    res.set('X-RateLimit-Limit', String(max));
    res.set('X-RateLimit-Remaining', String(max - timestamps.length));
    next();
  };
}

// Apply rate limiting: 100 requests per minute
app.use('/api/', rateLimit({ windowMs: 60000, max: 100 }));

Flask (Python)

# Flask — rate limiting with flask-limiter
from flask import Flask, jsonify
from flask_limiter import Limiter
from flask_limiter.util import get_remote_address

app = Flask(__name__)
limiter = Limiter(
    get_remote_address,
    app=app,
    default_limits=['100 per minute'],
    storage_uri='memory://'
)

@app.route('/api/data', methods=['GET'])
@limiter.limit('30 per minute')
def get_data():
    return jsonify(data='your response here')

@app.errorhandler(429)
def rate_limit_exceeded(e):
    # e.description holds the limit that was hit (e.g. '30 per minute'),
    # not a wait time; enable headers_enabled=True on the Limiter if you
    # want flask-limiter to emit the Retry-After header for you
    return jsonify(
        error='Too Many Requests',
        message='Rate limit exceeded. Please slow down.',
        limit=str(e.description)
    ), 429

# Manual rate limiting without flask-limiter
import time
from collections import defaultdict

request_counts = defaultdict(list)

def check_rate_limit(ip, max_requests=100, window=60):
    now = time.time()
    request_counts[ip] = [t for t in request_counts[ip] if t > now - window]
    if len(request_counts[ip]) >= max_requests:
        retry_after = int(request_counts[ip][0] + window - now)
        return False, retry_after
    request_counts[ip].append(now)
    return True, 0

Frequently Asked Questions

What causes a 429 Too Many Requests error?

Your client is sending requests faster than the API allows. Each API has a rate limit, typically measured in requests per minute or requests per hour. When you exceed this limit, the server responds with 429 until the rate limit window resets.

How should I handle a 429 response?

Check the Retry-After header for the wait time, then implement exponential backoff with jitter for retries. Do not retry immediately, as this will likely result in another 429 and may lead to your API access being suspended.

What is exponential backoff?

Exponential backoff is a retry strategy where you double the wait time after each failed attempt: 1s, 2s, 4s, 8s, and so on. Adding random jitter (a small random delay) prevents multiple clients from retrying simultaneously.

Can rate limiting protect against DDoS attacks?

Rate limiting is one layer of DDoS protection that helps prevent individual sources from overwhelming the server. However, distributed attacks from many sources may require additional defenses like CDN protection, IP blocking, and traffic analysis.

How do I know my rate limit before hitting 429?

Check the response headers for X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset. These headers are included in successful responses and tell you your current rate limit status so you can throttle proactively.
