Vol. 1 · Ed. 2026
CyberGlossary
Entry № 904

Rate Limiting

What is Rate Limiting?

Rate limiting caps the number of requests an identifier (IP, user, API key, or token) may make over a time window, protecting APIs and apps from abuse, scraping, and brute-force attacks.


Rate limiting is a traffic-shaping control that enforces a maximum request count per identifier per interval. Once the budget is exhausted, further requests are either rejected with HTTP 429 Too Many Requests or queued. Common algorithms include token bucket (tolerates bursts while holding a stable average rate), leaky bucket (smooths bursts into a constant output rate), fixed window (simple, but allows spikes at window boundaries), and sliding window (more accurate, slightly costlier). Modern API gateways, CDNs, and WAFs implement multi-key rate limits (per IP, per user, per token, per endpoint) and combine them with bot management and credential-stuffing defenses. Rate limiting is often the cheapest first line of defense against scraping, enumeration, brute-force login, and accidental client retry storms.
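As an illustration of the token bucket algorithm named above, here is a minimal in-memory sketch in Python. The class name `TokenBucket`, its parameters (`capacity`, `refill_per_sec`, `clock`), and the `allow` method are hypothetical names chosen for this example, not any particular gateway's API. A production limiter would also need one bucket per identifier and shared state across servers.

```python
import time


class TokenBucket:
    """Illustrative token bucket: bursts up to `capacity`, sustained `refill_per_sec`."""

    def __init__(self, capacity, refill_per_sec, clock=time.monotonic):
        self.capacity = float(capacity)
        self.refill_per_sec = float(refill_per_sec)
        self.clock = clock              # injectable so the behavior can be tested
        self.tokens = self.capacity     # start full, so an initial burst is allowed
        self.last = self.clock()

    def allow(self, cost=1.0):
        """Spend `cost` tokens if available; return False when the budget is exhausted."""
        now = self.clock()
        # Refill in proportion to elapsed time, never exceeding capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(capacity=3, refill_per_sec=1.0)
    # Four back-to-back requests: the bucket holds only three tokens.
    print([bucket.allow() for _ in range(4)])
```

The capacity sets how large a burst is tolerated, while the refill rate sets the long-run average, which is why the definition describes this algorithm as both burst-tolerant and rate-stable.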

Examples

  1. Limiting a public login API to 10 requests per minute per IP to slow credential stuffing.

  2. Token-bucket throttling on a search endpoint to absorb bursts but stop a scraper.
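The login example can be sketched with a sliding-window log keyed by client IP. This is one possible approach under stated assumptions, not a reference implementation: the class `SlidingWindowLimiter`, its `check` method, and the returned `(status, retry_after)` pair are hypothetical names for illustration, and state is kept in process memory only. The IP addresses come from ranges reserved for documentation.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Illustrative per-key limiter: at most `limit` accepted requests per `window_sec`."""

    def __init__(self, limit=10, window_sec=60.0, clock=time.monotonic):
        self.limit = limit
        self.window_sec = window_sec
        self.clock = clock                  # injectable for testing
        self.hits = defaultdict(deque)      # key -> timestamps of accepted requests

    def check(self, key):
        """Return (http_status, retry_after_seconds) for one request from `key`."""
        now = self.clock()
        log = self.hits[key]
        # Drop timestamps that have aged out of the window.
        while log and now - log[0] >= self.window_sec:
            log.popleft()
        if len(log) < self.limit:
            log.append(now)
            return 200, 0.0
        # Over budget: the caller could send 429 with a Retry-After header.
        return 429, self.window_sec - (now - log[0])


if __name__ == "__main__":
    limiter = SlidingWindowLimiter(limit=10, window_sec=60.0)
    statuses = [limiter.check("203.0.113.7")[0] for _ in range(11)]
    print(statuses[-1])                     # the 11th attempt within a minute is rejected
    print(limiter.check("198.51.100.2")[0]) # a different IP has its own budget
```

In this sketch only accepted requests are logged, so a client that keeps retrying while blocked does not extend its own lockout. Recording rejected attempts as well would be a stricter design choice.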

Frequently asked questions

What is Rate Limiting?

Rate limiting caps the number of requests an identifier (IP, user, API key, or token) may make over a time window, protecting APIs and apps from abuse, scraping, and brute-force attacks. It belongs to the Network Security category of cybersecurity.

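How is Rate Limiting used as a defense?

Rate limiting is itself a defensive control rather than a threat, so it is deployed, not defended against. As detailed in the full definition above, it is typically enforced at API gateways, CDNs, and WAFs with per-IP, per-user, per-token, and per-endpoint limits, and combined with bot management and credential-stuffing defenses.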

What are other names for Rate Limiting?

Common alternative names include throttling and API rate limit.
