Class ServiceChaosRegistry
chaos
block on every forwarding expectation — the ergonomic "break service X" use case.
Resolution happens only on the matched-forward path (HttpActionHandler):
when a matched forward expectation carries no chaos of its own, the
registry is consulted with the request's Host header. A profile attached
to the expectation always takes precedence; the anonymous / unmatched proxy
fall-through path is intentionally left untouched.
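
The precedence rule above can be sketched as follows. This is only an illustration under stated assumptions: ChaosResolutionSketch, resolve and the registryLookup function are hypothetical names, and the real logic lives in HttpActionHandler, whose code is not shown on this page.

```java
import java.util.function.Function;

// Hypothetical sketch of the precedence rule; not the actual HttpActionHandler code.
final class ChaosResolutionSketch {
    private ChaosResolutionSketch() {}

    /**
     * Returns the chaos profile to apply to a matched forward expectation.
     * A profile attached to the expectation wins; otherwise the registry is
     * consulted with the request's Host header.
     */
    static <P> P resolve(P expectationChaos, String hostHeader, Function<String, P> registryLookup) {
        if (expectationChaos != null) {
            return expectationChaos;   // the expectation's own profile always takes precedence
        }
        if (hostHeader == null) {
            return null;               // nothing to look up without a Host header
        }
        return registryLookup.apply(hostHeader); // may be null: no service-scoped chaos
    }
}
```

In this sketch the lookup is passed in as a function, so the precedence can be exercised without the real registry or profile types.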
Because a service-scoped profile has no single owning expectation, the
per-expectation count window (succeedFirst/failRequestCount),
outage window and degradation ramp (which need a first-match anchor) are not
meaningfully gated here — service-scoped profiles are intended for the
steady-state faults (error probability, connection drop, latency, body
corruption, slow response, and the host-independent quota). The probabilistic,
body, slow and quota faults all work as usual.
Time-to-live (auto-revert): a registration may carry an optional
ttlMillis. When set, the profile auto-expires that many milliseconds
after registration and is removed lazily on the next lookup — a "dead-man's
switch" so chaos started by an external orchestrator self-heals even if the
orchestrator never sends the matching clear (e.g. it crashed mid-experiment).
Expiry is measured with the controllable clock (TimeService), so it
tracks real wall-clock time by default but is deterministic under clock
freeze/advance for tests.
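
A minimal sketch of lazy TTL expiry against an injectable clock, assuming a LongSupplier stands in for TimeService. The class TtlRegistrySketch, its Entry type and the choice to expire exactly at the deadline are assumptions made for illustration, not the registry's actual implementation.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Hypothetical sketch; LongSupplier stands in for the controllable clock (TimeService).
final class TtlRegistrySketch<V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis; // 0 means "no expiry"

        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final ConcurrentHashMap<String, Entry<V>> entries = new ConcurrentHashMap<>();
    private final LongSupplier clock;

    TtlRegistrySketch(LongSupplier clock) {
        this.clock = clock;
    }

    void put(String host, V value, long ttlMillis) {
        if (host == null || value == null) {
            return; // mirror the documented no-op on null arguments
        }
        long expiresAt = ttlMillis > 0 ? clock.getAsLong() + ttlMillis : 0L;
        entries.put(host, new Entry<>(value, expiresAt));
    }

    V get(String host) {
        if (host == null) {
            return null;
        }
        Entry<V> entry = entries.get(host);
        if (entry == null) {
            return null;
        }
        if (entry.expiresAtMillis > 0 && clock.getAsLong() >= entry.expiresAtMillis) {
            // Lazy removal: delete only if the mapping is still this exact entry,
            // so a concurrent re-registration is not discarded.
            entries.remove(host, entry);
            return null;
        }
        return entry.value;
    }
}
```

Because the clock is injected, a test can freeze or advance time and observe expiry without sleeping.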
Hosts are matched case-insensitively and ignoring any :port suffix.
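
One way to derive such a lookup key is sketched below. HostKeys and normalize are hypothetical names, and the handling of bracketed IPv6 literals is an assumption; the registry's own normalisation code is not shown on this page.

```java
import java.util.Locale;

// Hypothetical helper illustrating case-insensitive, port-agnostic host keys.
final class HostKeys {
    private HostKeys() {}

    static String normalize(String hostHeader) {
        if (hostHeader == null) {
            return null;
        }
        String host = hostHeader.trim();
        int colon = host.lastIndexOf(':');
        // Treat the text after the last colon as a port only when it is all digits,
        // so a bracketed IPv6 literal such as "[::1]" is left intact.
        if (colon > 0 && colon < host.length() - 1
                && host.substring(colon + 1).chars().allMatch(Character::isDigit)) {
            host = host.substring(0, colon);
        }
        // Locale.ROOT avoids locale-specific case mappings.
        return host.toLowerCase(Locale.ROOT);
    }
}
```

With this sketch, "API.Example.com:8080" and "api.example.com" map to the same key.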
State is held in a ConcurrentHashMap and cleared on server reset
(see HttpState.reset()).
Fleet-awareness (G11): when a clustered StateBackend is
wired via setStateBackend(StateBackend), mutations (put/remove/reset/patch)
write-through to the backend's crudEntities("chaos-service") store,
and an InvalidationListener rebuilds the node-local map from the
backend on remote writes. The get(String) path remains purely
node-local for zero-overhead chaos lookups during request handling. When
no backend is set or the backend is not clustered, behaviour is identical
to the pre-G11 node-local-only registry.
-
Field Summary
Fields
FAULT_TYPES: The HTTP chaos fault types reported by activeCountByFaultType(), matching the fault_type label values of the mock_server_http_chaos_injected counter.
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method, Description
activeCountByFaultType: For each fault type, the number of currently-active (non-expired) service-scoped registrations whose profile includes that fault.
entries(): Returns a snapshot copy of the current, non-expired host → profile mappings.
get: Returns the chaos profile registered for the given host, or null if none (or if it has expired; an expired entry is removed lazily here).
static ServiceChaosRegistry getInstance
patch(String host, HttpChaosProfile partial): Applies JSON Merge Patch semantics to the chaos profile for the given host.
void put(String host, HttpChaosProfile profile): Register (or replace) the chaos profile for the given host with no expiry.
void put(String host, HttpChaosProfile profile, long ttlMillis): Register (or replace) the chaos profile for the given host, optionally with a time-to-live after which it auto-expires.
void reconcileFromBackend(): Rebuilds the node-local map from the backend store.
void remove: Removes the chaos profile for the given host (no-op if absent).
void reset(): Clear all service-scoped chaos.
void setStateBackend(StateBackend backend): Wires the clustered state backend for fleet-wide chaos replication.
ttlRemainingMillis: Returns, for each currently-active registration that carries a TTL, the remaining milliseconds until it auto-reverts.
-
Field Details
-
FAULT_TYPES
The HTTP chaos fault types reported by activeCountByFaultType(), matching the fault_type label values of the mock_server_http_chaos_injected counter.
-
-
Constructor Details
-
ServiceChaosRegistry
-
-
Method Details
-
getInstance
-
setStateBackend
Wires the clustered state backend for fleet-wide chaos replication. When the backend isClustered(), mutations are replicated via the backend's CRUD entity store, and an InvalidationListener is registered to rebuild the node-local map on remote writes. When the backend is not clustered, this method is a no-op and the registry stays purely node-local.
put
Register (or replace) the chaos profile for the given host with no expiry. No-op if either argument is null.
put
Register (or replace) the chaos profile for the given host, optionally with a time-to-live after which it auto-expires.
Parameters:
ttlMillis - milliseconds until the profile auto-expires; <= 0 means no expiry
-
get
Returns the chaos profile registered for the given host, or null if none (or if it has expired; an expired entry is removed lazily here).
patch
Applies JSON Merge Patch semantics to the chaos profile for the given host. Only non-null fields from partial are applied to the existing profile; unset fields in the partial are left unchanged. If no profile exists for the host, the partial is registered as a new profile (with no TTL). No-op if either argument is null.
Returns:
the updated profile, or null if host/partial is null
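
The field-wise merge described above can be sketched as follows. ProfileSketch and its two fields (errorProbability, latencyMillis) are hypothetical stand-ins for HttpChaosProfile, whose real fields are not listed on this page. Note one difference from RFC 7386 JSON Merge Patch proper: there an explicit null removes a member, whereas here, as documented above, a null field in the partial simply leaves the existing value unchanged.

```java
// Hypothetical stand-in for HttpChaosProfile with two nullable fields.
final class ProfileSketch {
    final Double errorProbability; // null means "not set"
    final Long latencyMillis;      // null means "not set"

    ProfileSketch(Double errorProbability, Long latencyMillis) {
        this.errorProbability = errorProbability;
        this.latencyMillis = latencyMillis;
    }

    /**
     * Overlays the non-null fields of partial onto existing.
     * If there is no existing profile, the partial becomes the profile.
     */
    static ProfileSketch merge(ProfileSketch existing, ProfileSketch partial) {
        if (partial == null) {
            return existing;
        }
        if (existing == null) {
            return partial;
        }
        return new ProfileSketch(
                partial.errorProbability != null ? partial.errorProbability : existing.errorProbability,
                partial.latencyMillis != null ? partial.latencyMillis : existing.latencyMillis);
    }
}
```

For example, patching a profile that has an error probability and a latency with a partial that sets only the latency keeps the original error probability.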
-
remove
Removes the chaos profile for the given host (no-op if absent).
entries
Returns a snapshot copy of the current, non-expired host → profile mappings.
ttlRemainingMillis
Returns, for each currently-active registration that carries a TTL, the remaining milliseconds until it auto-reverts. Entries with no TTL (or that have already expired) are omitted. Used to surface a remaining-TTL countdown on GET /mockserver/serviceChaos.
activeCountByFaultType
For each fault type, the number of currently-active (non-expired) service-scoped registrations whose profile includes that fault. A profile carrying several faults (e.g. error + latency) is counted under each, so the per-type counts may sum to more than the number of registered hosts. Every fault type in FAULT_TYPES is always present in the returned map (0 when none), giving a stable, complete set of series for the mock_server_active_service_chaos gauge, so an operator can see, and alert on, which kinds of chaos are live, with each count dropping to 0 as profiles are cleared or their TTLs lapse.
Iterates the ConcurrentHashMap weakly-consistently, so under concurrent registration/removal the counts may transiently reflect a mix of pre- and post-mutation state, which is acceptable for a gauge metric.
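
The counting rule can be illustrated with a small sketch. Everything here is an assumption made for illustration: FaultCountSketch, the three label values in its FAULT_TYPES list, and the representation of a profile as a set of fault labels do not come from the real class.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; the real FAULT_TYPES values and profile type are not shown here.
final class FaultCountSketch {
    static final List<String> FAULT_TYPES = List.of("error", "latency", "drop");

    private FaultCountSketch() {}

    /** Each element of activeProfiles is the set of fault labels one active profile carries. */
    static Map<String, Integer> countByFaultType(Collection<Set<String>> activeProfiles) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String type : FAULT_TYPES) {
            counts.put(type, 0); // seed every type so the gauge always has a complete set of series
        }
        for (Set<String> faults : activeProfiles) {
            for (String fault : faults) {
                // A profile with several faults increments each of them; unknown labels are ignored.
                counts.computeIfPresent(fault, (key, value) -> value + 1);
            }
        }
        return counts;
    }
}
```

With two active profiles, one carrying error and latency and one carrying latency only, the counts are error 1, latency 2 and drop 0, which sum to more than the number of hosts.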
reset
public void reset()
Clear all service-scoped chaos. Called on server reset and for test isolation.
reconcileFromBackend
public void reconcileFromBackend()
Rebuilds the node-local map from the backend store. Called by the InvalidationListener when a remote write is detected. Thread-safe: replaces the local map contents atomically relative to concurrent gets (ConcurrentHashMap iteration is weakly-consistent).
-