Class LlmQuotaRegistry

java.lang.Object
org.mockserver.llm.LlmQuotaRegistry

public class LlmQuotaRegistry extends Object
Process-wide, stateful request quota for LLM responses, implemented as a fixed-window rate limiter. Unlike the probabilistic 429 in LlmChaosProfile, this quota is deterministic: it counts how many requests have hit a named quota within the current time window and reports when the limit is exceeded, so a test can drive an agent into a hard rate limit (e.g. "the 4th call in 60s gets 429").

Quotas are keyed by name, so several expectations that share a quotaName share one counter (modelling an upstream account limit), while distinct names are independent. State is held in a ConcurrentHashMap and each acquire is an atomic per-key update, so the registry is safe under concurrent requests.

The time source is injectable so window behaviour is unit-testable without sleeping; production uses System.currentTimeMillis().
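
The design described above can be illustrated with a minimal sketch. The class below is a hypothetical stand-in written for this page, not the MockServer source: the name FixedWindowQuotaSketch, the Window holder, and the choice to treat a request arriving exactly windowMillis after the window start as opening a new window are all assumptions. It only shows how a ConcurrentHashMap keyed by quota name, an atomic compute() per key, and an injected LongSupplier clock could fit together.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical stand-in for illustration only; not the MockServer implementation. */
final class FixedWindowQuotaSketch {

    /** Immutable snapshot of one named window: when it started and how many requests it has seen. */
    private static final class Window {
        final long startMillis;
        final int count;

        Window(long startMillis, int count) {
            this.startMillis = startMillis;
            this.count = count;
        }
    }

    private final ConcurrentHashMap<String, Window> windows = new ConcurrentHashMap<>();
    private final LongSupplier clock;

    FixedWindowQuotaSketch(LongSupplier clock) {
        this.clock = clock;
    }

    boolean tryAcquire(String name, int limit, long windowMillis) {
        long now = clock.getAsLong();
        // compute() is atomic per key, so concurrent callers using the same name cannot lose an increment.
        Window updated = windows.compute(name, (key, current) ->
            (current == null || now - current.startMillis >= windowMillis) // boundary choice is an assumption
                ? new Window(now, 1)                                       // this request starts a fresh window
                : new Window(current.startMillis, current.count + 1));     // counted against the open window
        return updated.count <= limit;
    }

    void reset() {
        windows.clear();
    }

    public static void main(String[] args) {
        long[] now = {0L}; // hand-driven clock, so the window can be advanced without sleeping
        FixedWindowQuotaSketch quota = new FixedWindowQuotaSketch(() -> now[0]);
        System.out.println(quota.tryAcquire("account", 1, 60_000L)); // true
        System.out.println(quota.tryAcquire("account", 1, 60_000L)); // false: same name, shared counter
        System.out.println(quota.tryAcquire("other", 1, 60_000L));   // true: distinct name, independent
        now[0] = 60_000L;
        System.out.println(quota.tryAcquire("account", 1, 60_000L)); // true: the old window has expired
    }
}
```

Storing an immutable snapshot per name keeps each update a single atomic replacement, which is one simple way to get the per-key atomicity the description mentions.
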

  • Constructor Details

    • LlmQuotaRegistry

      public LlmQuotaRegistry(LongSupplier clock)
  • Method Details

    • getInstance

      public static LlmQuotaRegistry getInstance()
    • tryAcquire

      public boolean tryAcquire(String name, int limit, long windowMillis)
      Record one request against the named quota and report whether it is allowed.

      Fixed-window semantics: the first request in a window starts it; the window expires windowMillis after it started, after which the next request starts a fresh window. A request is allowed when the in-window count (including itself) is at or below limit.

      Returns:
      true if the request is within the quota, false if it exceeds the limit for the current window.
    • reset

      public void reset()
      Clear all quota state. Called on server reset and for test isolation.
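
The fixed-window semantics documented for tryAcquire can be traced by hand for a single quota name. The helper below is a hypothetical illustration (FixedWindowTrace and replay are not part of the API); it assumes that rejected requests still count toward the window and that a request arriving exactly windowMillis after the window start opens a new one.

```java
import java.util.Arrays;

/** Hypothetical helper that replays arrival times against one fixed-window quota. */
final class FixedWindowTrace {

    /**
     * Replays arrival times (milliseconds, ascending) and returns, per request,
     * whether it would fall within the limit.
     */
    static boolean[] replay(long[] arrivalsMillis, int limit, long windowMillis) {
        boolean[] allowed = new boolean[arrivalsMillis.length];
        long windowStart = 0L;
        int count = 0; // 0 means no window has been started yet
        for (int i = 0; i < arrivalsMillis.length; i++) {
            long now = arrivalsMillis[i];
            if (count == 0 || now - windowStart >= windowMillis) {
                windowStart = now; // this request opens a fresh window
                count = 0;
            }
            count++; // assumption: every request is recorded, allowed or not
            allowed[i] = count <= limit;
        }
        return allowed;
    }

    public static void main(String[] args) {
        // Limit 3 per 60 s: calls at 0 s, 10 s and 20 s pass, the 4th at 30 s is rejected,
        // and the call at 60 s starts a new window.
        long[] arrivals = {0L, 10_000L, 20_000L, 30_000L, 60_000L};
        System.out.println(Arrays.toString(replay(arrivals, 3, 60_000L)));
        // prints [true, true, true, false, true]
    }
}
```

The rejected entry corresponds to the "4th call in 60s gets 429" case: a caller would typically answer with 429 whenever the result is false.
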