Package org.mockserver.llm
Class LlmRateLimitHeaders
java.lang.Object
org.mockserver.llm.LlmRateLimitHeaders
Pure, deterministic helper that produces the provider-specific
rate-limit HTTP headers real LLM providers send. Client SDKs (e.g. the OpenAI
Python SDK, Anthropic SDK) read these headers to drive retry/backoff logic, so
emitting them faithfully allows MockServer to exercise that logic against a mock.
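To illustrate the client side of this, the following is a minimal sketch of how a client might turn the two reset-header styles into a wait time in seconds. The class and method names (`RateLimitResetSketch`, `secondsFromDuration`, `secondsUntil`) are hypothetical and are not taken from any SDK or from MockServer. The sketch assumes whole-second durations such as "6s" and UTC timestamps ending in `Z`; real SDKs may accept more forms.

```java
import java.time.Instant;

final class RateLimitResetSketch {

    // Hypothetical helper: parses a whole-second duration such as "6s".
    // Other duration shapes (minutes, milliseconds, compound values) are out of scope for this sketch.
    static long secondsFromDuration(String value) {
        String trimmed = value.trim();
        if (!trimmed.endsWith("s")) {
            throw new IllegalArgumentException("expected a value like \"6s\": " + value);
        }
        return Long.parseLong(trimmed.substring(0, trimmed.length() - 1));
    }

    // Hypothetical helper: seconds between "now" and a UTC timestamp, never negative.
    // The current time is passed in so the result is reproducible in tests.
    static long secondsUntil(String timestamp, long nowEpochSecond) {
        long resetEpochSecond = Instant.parse(timestamp).getEpochSecond();
        return Math.max(0L, resetEpochSecond - nowEpochSecond);
    }

    public static void main(String[] args) {
        long now = 1_700_000_000L; // 2023-11-14T22:13:20Z
        System.out.println(secondsFromDuration("6s"));
        System.out.println(secondsUntil("2023-11-14T22:13:26Z", now));
    }
}
```

Passing the current epoch second as an argument, rather than reading a clock, keeps the computation deterministic in the same way this class does.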
The standard Retry-After header is intentionally not produced
here — it is a generic HTTP header (not provider-specific) and is owned solely by
HttpLlmResponseActionHandler.applyRateLimitHeaders(...), which emits it
for every provider (including those with no provider-specific headers, such as
Gemini, Bedrock, and Ollama). Keeping Retry-After in one place avoids a
duplicate header on the wire.
Provider header reference
- OPENAI / OPENAI_RESPONSES / AZURE_OPENAI (source: OpenAI docs "Rate limits" page): x-ratelimit-limit-requests, x-ratelimit-remaining-requests, x-ratelimit-reset-requests (duration, e.g. "6s").
- ANTHROPIC (source: Anthropic docs "Rate limits" page): anthropic-ratelimit-requests-limit, anthropic-ratelimit-requests-remaining, anthropic-ratelimit-requests-reset (RFC 3339 timestamp).
- GEMINI / BEDROCK: no provider-specific rate-limit headers; on a 429 only the standard Retry-After header (added by the handler) is exposed.
- OLLAMA: none. Ollama is a local inference engine with no rate-limit concept.
All methods are static, deterministic, and pure (no clocks, no randomness
inside — the caller passes resetSeconds and the current epoch second
for RFC 3339 timestamps).
Method Details
headersFor
public static Map<String,String> headersFor(Provider provider, Integer requestLimit, Integer requestRemaining, Long resetSeconds, long nowEpochSecond, boolean limited)
Produce provider-specific rate-limit headers (excluding Retry-After, which the caller emits).
Parameters:
provider - the LLM provider
requestLimit - quota limit (requests per window); may be null
requestRemaining - requests remaining in the window; may be null
resetSeconds - seconds until the window resets; may be null
nowEpochSecond - current epoch second (for Anthropic RFC 3339 reset timestamp)
limited - true when this is a rate-limit error (429); false for a successful response with quota info
Returns:
an insertion-ordered map of header-name to header-value; empty if the provider has no provider-specific rate-limit headers (Gemini, Bedrock, Ollama)
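The contract described above can be sketched with a self-contained stand-in. This is not the MockServer implementation: `HeadersForSketch`, `sketchHeaders`, and the local `Provider` enum are hypothetical names used only for illustration. The sketch assumes that a `null` argument simply omits the corresponding header, and it leaves out the `limited` flag because its effect on the header values is not specified here.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

final class HeadersForSketch {

    // Local stand-in for the real Provider type, which is not reproduced here.
    enum Provider { OPENAI, OPENAI_RESPONSES, AZURE_OPENAI, ANTHROPIC, GEMINI, BEDROCK, OLLAMA }

    // Builds an insertion-ordered map; empty for providers without provider-specific headers.
    static Map<String, String> sketchHeaders(Provider provider, Integer requestLimit,
                                             Integer requestRemaining, Long resetSeconds,
                                             long nowEpochSecond) {
        Map<String, String> headers = new LinkedHashMap<>();
        switch (provider) {
            case OPENAI:
            case OPENAI_RESPONSES:
            case AZURE_OPENAI:
                putIfPresent(headers, "x-ratelimit-limit-requests", requestLimit);
                putIfPresent(headers, "x-ratelimit-remaining-requests", requestRemaining);
                if (resetSeconds != null) {
                    // Duration form, e.g. "6s".
                    headers.put("x-ratelimit-reset-requests", resetSeconds + "s");
                }
                break;
            case ANTHROPIC:
                putIfPresent(headers, "anthropic-ratelimit-requests-limit", requestLimit);
                putIfPresent(headers, "anthropic-ratelimit-requests-remaining", requestRemaining);
                if (resetSeconds != null) {
                    // Timestamp form, derived only from the caller-supplied epoch second.
                    headers.put("anthropic-ratelimit-requests-reset",
                            Instant.ofEpochSecond(nowEpochSecond + resetSeconds).toString());
                }
                break;
            default:
                // GEMINI, BEDROCK, OLLAMA: no provider-specific headers.
                break;
        }
        return headers;
    }

    // Assumption of this sketch: a null value means the header is left out.
    private static void putIfPresent(Map<String, String> headers, String name, Integer value) {
        if (value != null) {
            headers.put(name, String.valueOf(value));
        }
    }

    public static void main(String[] args) {
        long now = 1_700_000_000L; // 2023-11-14T22:13:20Z
        System.out.println(sketchHeaders(Provider.OPENAI, 100, 0, 6L, now));
        System.out.println(sketchHeaders(Provider.ANTHROPIC, 100, 0, 6L, now));
        System.out.println(sketchHeaders(Provider.OLLAMA, 100, 0, 6L, now));
    }
}
```

Because the clock value is an argument, calling the sketch twice with the same inputs yields identical maps, which is the property the class relies on for deterministic tests.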