docs(response_headers.md): add response headers to docs
krrishdholakia committed Sep 29, 2024
1 parent bfa9553 commit 7630680
Showing 3 changed files with 26 additions and 1 deletion.
2 changes: 1 addition & 1 deletion docs/my-website/docs/proxy/reliability.md
@@ -2,7 +2,7 @@ import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

# 🔥 Load Balancing, Fallbacks, Retries, Timeouts
# Fallbacks, Load Balancing, Retries

- Quick Start [load balancing](#test---load-balancing)
- Quick Start [client side fallbacks](#test---client-side-fallbacks)
24 changes: 24 additions & 0 deletions docs/my-website/docs/proxy/response_headers.md
@@ -0,0 +1,24 @@
# Rate Limit Headers

When you make a request to the proxy, it returns the following [OpenAI-compatible headers](https://platform.openai.com/docs/guides/rate-limits/rate-limits-in-headers):

- `x-ratelimit-remaining-requests` - Optional[int]: The remaining number of requests that are permitted before exhausting the rate limit.
- `x-ratelimit-remaining-tokens` - Optional[int]: The remaining number of tokens that are permitted before exhausting the rate limit.
- `x-ratelimit-limit-requests` - Optional[int]: The maximum number of requests that are permitted before exhausting the rate limit.
- `x-ratelimit-limit-tokens` - Optional[int]: The maximum number of tokens that are permitted before exhausting the rate limit.
- `x-ratelimit-reset-requests` - Optional[int]: The time at which the request rate limit will reset.
- `x-ratelimit-reset-tokens` - Optional[int]: The time at which the token rate limit will reset.

These headers are useful for clients to understand the current rate limit status and adjust their request rate accordingly.
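
As a minimal sketch of how a client might act on these headers, the snippet below parses them from a plain header mapping and decides whether to slow down. The helper names (`read_rate_limit_headers`, `should_slow_down`) and the thresholds are hypothetical and not part of the proxy; the sketch also assumes a missing or non-integer value (such as the literal string `None`) should be treated as "unknown".

```python
from typing import Dict, Mapping, Optional

# Header names as listed above.
RATE_LIMIT_HEADERS = (
    "x-ratelimit-remaining-requests",
    "x-ratelimit-remaining-tokens",
    "x-ratelimit-limit-requests",
    "x-ratelimit-limit-tokens",
    "x-ratelimit-reset-requests",
    "x-ratelimit-reset-tokens",
)


def parse_optional_int(value: Optional[str]) -> Optional[int]:
    """Return the header value as an int, or None if absent or not an integer."""
    if value is None:
        return None
    try:
        return int(value.strip())
    except ValueError:
        return None


def read_rate_limit_headers(headers: Mapping[str, str]) -> Dict[str, Optional[int]]:
    """Collect the rate limit headers from a response, matching names case-insensitively."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {name: parse_optional_int(lowered.get(name)) for name in RATE_LIMIT_HEADERS}


def should_slow_down(
    limits: Mapping[str, Optional[int]],
    min_requests: int = 1,      # hypothetical threshold
    min_tokens: int = 1000,     # hypothetical threshold
) -> bool:
    """Return True when a known remaining budget has dropped below its threshold."""
    remaining_requests = limits["x-ratelimit-remaining-requests"]
    remaining_tokens = limits["x-ratelimit-remaining-tokens"]
    if remaining_requests is not None and remaining_requests < min_requests:
        return True
    if remaining_tokens is not None and remaining_tokens < min_tokens:
        return True
    return False
```

A client would pass the headers of each proxy response to `read_rate_limit_headers` and pause or reduce concurrency when `should_slow_down` returns `True`. Unknown values never trigger a slowdown in this sketch, which is a design choice rather than proxy behavior.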

## How are these headers calculated?

**If the key has rate limits set**

The proxy will return the [remaining rate limits for that key](https://github.com/BerriAI/litellm/blob/bfa95538190575f7f317db2d9598fc9a82275492/litellm/proxy/hooks/parallel_request_limiter.py#L778).

**If the key does not have rate limits set**

The proxy returns the remaining requests/tokens returned by the backend provider.

If the backend provider does not return these headers, the value will be `None`.
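
The precedence described above can be sketched as a small function. This is an illustration only, not the proxy's implementation: `resolve_remaining` is a hypothetical name, and "key has rate limits set" is modeled here as a non-`None` remaining value for the key.

```python
from typing import Mapping, Optional


def resolve_remaining(
    key_remaining: Optional[int],
    provider_headers: Mapping[str, str],
    header_name: str,
) -> Optional[int]:
    """Pick the value to report for one rate limit header.

    1. If the key has a rate limit set, report the key's remaining value.
    2. Otherwise, report the value returned by the backend provider.
    3. If the provider did not return the header, report None.
    """
    if key_remaining is not None:
        return key_remaining
    raw = provider_headers.get(header_name)
    if raw is None:
        return None
    try:
        return int(raw)
    except ValueError:
        # Assumption: an unparsable provider value is treated as missing.
        return None
```

Note that a key-level value of `0` still takes precedence over the provider's value, since the key's own limit is what the client has exhausted.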
1 change: 1 addition & 0 deletions docs/my-website/sidebars.js
@@ -46,6 +46,7 @@ const sidebars = {
"proxy/enterprise",
"proxy/user_keys",
"proxy/configs",
"proxy/response_headers",
"proxy/reliability",
{
type: "category",