org.mockserver.llm.LlmRateLimitHeaders

public final class LlmRateLimitHeaders extends Object

Pure, deterministic helper that produces the provider-specific rate-limit HTTP headers real LLM providers send. Client SDKs (e.g. the OpenAI Python SDK, Anthropic SDK) read these headers to drive retry/backoff logic, so emitting them faithfully allows MockServer to exercise that logic against a mock.

The standard Retry-After header is intentionally not produced here — it is a generic HTTP header (not provider-specific) and is owned solely by HttpLlmResponseActionHandler.applyRateLimitHeaders(...), which emits it for every provider (including those with no provider-specific headers, such as Gemini, Bedrock, and Ollama). Keeping Retry-After in one place avoids a duplicate header on the wire.

Provider header reference

OPENAI / OPENAI_RESPONSES / AZURE_OPENAI (source: OpenAI docs "Rate limits" page) — x-ratelimit-limit-requests, x-ratelimit-remaining-requests, x-ratelimit-reset-requests (duration, e.g. "6s").
ANTHROPIC (source: Anthropic docs "Rate limits" page) — anthropic-ratelimit-requests-limit, anthropic-ratelimit-requests-remaining, anthropic-ratelimit-requests-reset (RFC 3339 timestamp).
GEMINI / BEDROCK — no provider-specific rate-limit headers; on a 429 only the standard Retry-After header (added by the handler) is exposed.
OLLAMA — none. Ollama is a local inference engine with no rate-limit concept.

All methods are static, deterministic, and pure (no clocks, no randomness inside — the caller passes resetSeconds and the current epoch second for RFC 3339 timestamps).

Method Summary

Modifier and Type

Method

Description

static Map<String,String>

headersFor(Provider provider, Integer requestLimit, Integer requestRemaining, Long resetSeconds, long nowEpochSecond, boolean limited)

Produce provider-specific rate-limit headers (excluding Retry-After, which the caller emits).

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Method Details
- headersFor
  
  public static Map<String,String> headersFor(Provider provider, Integer requestLimit, Integer requestRemaining, Long resetSeconds, long nowEpochSecond, boolean limited)
  
  Produce provider-specific rate-limit headers (excluding Retry-After, which the caller emits).
  
  Parameters:
  
  provider - the LLM provider
  
  requestLimit - quota limit (requests per window); may be null
  
  requestRemaining - requests remaining in the window; may be null
  
  resetSeconds - seconds until the window resets; may be null
  
  nowEpochSecond - current epoch second (for Anthropic RFC 3339 reset timestamp)
  
  limited - true when this is a rate-limit error (429); false for a successful response with quota info
  
  Returns:
  
  an insertion-ordered map of header-name to header-value; empty if the provider has no provider-specific rate-limit headers (Gemini, Bedrock, Ollama)

Class LlmRateLimitHeaders

Provider header reference

Method Summary

Methods inherited from class java.lang.Object

Method Details

headersFor