# Rate limit

## Introduction

* Rate limiting is a technique for controlling the rate or frequency of incoming requests or API calls to a system or service. It is used to prevent abuse, protect system resources, ensure fair usage, and maintain the overall stability and performance of the system.

## Technique

### Spike arrest

<figure><img src="/files/oHtCTX7nZTKtrtts2NaW" alt=""><figcaption></figcaption></figure>

* It limits sudden increases in the number of requests at any point in time. For instance, with a spike arrest rate of 10 per minute, the limit is converted into a minimum interval between requests:

  ```
  10 per minute = 10 per 60 seconds = 1 per 6 seconds 
  ```

* It will not allow more than 1 request every 6 seconds. In this way, we can ensure that all 10 requests are not made within the initial seconds of a minute.
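
The interval calculation above can be sketched as a small limiter. This is a simplified illustration, not how any particular gateway implements spike arrest; the class name `SpikeArrest` and the explicit `now` argument (a timestamp in seconds, passed in so the behavior is easy to follow) are assumptions made for this example.

```python
from typing import Optional


class SpikeArrest:
    """Hypothetical sketch: allow at most one request per (period / rate) seconds."""

    def __init__(self, rate: int, period_seconds: float) -> None:
        # 10 per 60 seconds -> one request every 6 seconds
        self.min_interval = period_seconds / rate
        self.last_allowed: Optional[float] = None

    def allow(self, now: float) -> bool:
        # Accept the request only if enough time has passed since the last accepted one.
        if self.last_allowed is None or now - self.last_allowed >= self.min_interval:
            self.last_allowed = now
            return True
        return False


limiter = SpikeArrest(rate=10, period_seconds=60)
results = [limiter.allow(t) for t in (0.0, 1.0, 6.0, 11.9, 12.0)]
```

Here the requests at 0, 6, and 12 seconds are accepted, while those at 1 and 11.9 seconds arrive less than 6 seconds after the previous accepted request and are rejected.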

### Quota

<figure><img src="/files/2K99Jc2aYihahYzbsDjx" alt=""><figcaption></figcaption></figure>

* It limits the total number of requests per time interval, without constraining how they are spread within that interval. For instance, with a quota of 10 per minute, a client can send all 10 requests in the first few seconds of the minute.

## Strategy

### Static Time Window

* In a static time window rate limit, a fixed time interval is defined, and the rate limit is applied within that interval. For example, with a rate limit of 100 requests per minute, up to 100 requests are allowed within every 1-minute interval. If a client exceeds this limit within that minute, further requests are rejected until the next minute starts.
* A static time window only needs to keep track of the number of requests made within the current fixed window.
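
A minimal sketch of the counter described above, assuming a single client and an in-memory counter. The class name `FixedWindowLimiter` and the explicit `now` timestamp argument (in seconds) are hypothetical choices for this example.

```python
from typing import Optional


class FixedWindowLimiter:
    """Hypothetical sketch: count requests per fixed window of `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.current_window: Optional[int] = None
        self.count = 0

    def allow(self, now: float) -> bool:
        # Identify the window this timestamp falls into (0, 1, 2, ...).
        window = int(now // self.window_seconds)
        if window != self.current_window:
            # A new window has started, so the counter resets.
            self.current_window = window
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


limiter = FixedWindowLimiter(limit=3, window_seconds=60)
first_window = [limiter.allow(t) for t in (0, 1, 2, 3, 59.9)]
next_window = limiter.allow(60.0)
```

With a limit of 3 per minute, the first three requests are accepted, the requests at 3 and 59.9 seconds are rejected, and the request at 60 seconds is accepted because a new window has begun. Only one integer counter and the current window index are stored.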

### Sliding Time Window

<figure><img src="/files/FjGlmL91ahf8O7xE0SDf" alt=""><figcaption></figcaption></figure>

<figure><img src="/files/mc3JhzvOQX0khTMz1ZAy" alt=""><figcaption></figcaption></figure>

* In a sliding time window rate limit, the rate limit is applied over a rolling time interval. Instead of fixed intervals, the limit is enforced over a continuous time window that moves with each request. For instance, with a sliding time window rate limit of 100 requests per minute, the system keeps track of the requests made within the last minute.
* It helps prevent bursts within any period of the window's length, including bursts that straddle the boundary between two static windows.
* Each request is tracked individually, with its timestamp stored in a list or queue.
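
The queue of tracked requests can be sketched as follows. This is an in-memory illustration for a single client; the class name `SlidingWindowLimiter` and the explicit `now` timestamp argument (in seconds) are assumptions made for this example.

```python
from collections import deque
from typing import Deque


class SlidingWindowLimiter:
    """Hypothetical sketch: keep the timestamps of requests accepted in the last window."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.timestamps: Deque[float] = deque()

    def allow(self, now: float) -> bool:
        # Discard timestamps that have fallen out of the rolling window.
        while self.timestamps and now - self.timestamps[0] >= self.window_seconds:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False


limiter = SlidingWindowLimiter(limit=2, window_seconds=60)
results = [limiter.allow(t) for t in (0, 50, 61, 70, 110)]
```

With a limit of 2 per minute, the request at 70 seconds is rejected because the requests at 50 and 61 seconds are both still inside the last 60 seconds. A static window with the same limit would have accepted it, since 61 and 70 fall into a fresh window. The trade-off is memory: one timestamp is stored per accepted request instead of a single counter.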

### Token bucket

* In the token bucket algorithm, a bucket is conceptualized as a container that holds a certain number of tokens. Tokens represent the units of capacity or permission to perform an action or make a request. The bucket is initially filled with a maximum number of tokens.
* Tokens are added to the bucket at a constant rate, known as the refill rate, up to the bucket's maximum capacity.
* When a request or action is made, a certain number of tokens are required to perform it. If there are enough tokens in the bucket, the action is allowed and the required tokens are consumed. If there are not enough tokens, the action is rate-limited or delayed until enough tokens become available. Over time, the refill rate bounds the sustained request rate, while the bucket capacity bounds the size of a burst.
* Different APIs can consume different numbers of tokens per request, so that more expensive operations use up more of the capacity and resources are allocated more efficiently.
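
A minimal sketch of the steps above, assuming tokens are refilled lazily whenever a request arrives rather than by a background timer. The class name `TokenBucket`, the `cost` parameter, and the explicit `now` timestamp argument (in seconds) are hypothetical names for this example.

```python
class TokenBucket:
    """Hypothetical sketch: a bucket that starts full and refills at a constant rate."""

    def __init__(self, capacity: float, refill_rate: float, now: float = 0.0) -> None:
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = capacity          # the bucket is initially full
        self.last_refill = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Add the tokens earned since the last call, capped at the capacity.
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        # Consume `cost` tokens if the bucket holds enough of them.
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


bucket = TokenBucket(capacity=5, refill_rate=1.0)
results = [
    bucket.allow(0.0, cost=3),    # expensive call: 5 tokens -> 2
    bucket.allow(0.0, cost=3),    # only 2 tokens left, rejected
    bucket.allow(1.0, cost=3),    # one token refilled: 3 tokens -> 0
    bucket.allow(1.0),            # bucket is empty, rejected
    bucket.allow(100.0, cost=5),  # refill is capped at the capacity of 5
]
```

The `cost` argument illustrates per-API token consumption: a cheap endpoint might use the default of 1, while an expensive one charges more per call. This sketch rejects requests when tokens run out; delaying them until enough tokens are available is the other option described above.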

## Reference

{% embed url="<https://docs.sensedia.com/en/faqs/Latest/interceptors/spike_vs_rate-limit.html>" %}

{% embed url="<https://docs.apigee.com/api-platform/develop/comparing-quota-spike-arrest-and-concurrent-rate-limit-policies?hl=zh-tw>" %}

{% embed url="<https://yuanchieh.page/posts/2020/2020-10-18-%E4%BD%BF%E7%94%A8-redis-%E7%95%B6%E4%BD%9C-api-rate-limit-%E7%9A%84%E4%B8%89%E7%A8%AE%E6%96%B9%E6%B3%95/>" %}

{% embed url="<https://medium.com/@m-elbably/rate-limiting-the-sliding-window-algorithm-daa1d91e6196>" %}

