Class ChaosAutoHaltMonitor
ServiceChaosRegistry.reset().
Only destructive fault types contribute to the window:
"error" (synthetic 5xx), "drop" (connection kill), and
"quota" (429/503). Benign fault types such as "latency",
"slow", "truncate", "malformed", and
"graphql" do not count — a latency-only experiment will never
auto-halt, which matches the circuit-breaker's purpose.
The auto-halt prevents a chaos experiment from driving a cascading outage; it is the "steady-state guardrail" SREs expect.
The monitor is evaluated per chaos-fault injection (called from
Metrics.incrementHttpChaosInjected(String)).
It does not block the event loop — the sliding window is maintained in a
lock-free ConcurrentLinkedDeque of timestamps.
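The sliding-window idea can be illustrated with a minimal sketch. This is not the actual ChaosAutoHaltMonitor source: the class name SlidingWindowSketch, the record(long) method, the "count >= threshold" trip condition, and the clearing of the window after a trip are all assumptions made for illustration. What the real monitor does when it trips is not shown here.

```java
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of a lock-free sliding-window error counter. */
final class SlidingWindowSketch {
    private final ConcurrentLinkedDeque<Long> timestamps = new ConcurrentLinkedDeque<>();
    private final AtomicLong haltCount = new AtomicLong();
    private final int threshold;      // assumed meaning: errors needed to trip
    private final long windowMillis;  // assumed meaning: window length in ms

    SlidingWindowSketch(int threshold, long windowMillis) {
        this.threshold = threshold;
        this.windowMillis = windowMillis;
    }

    /** Records one error at nowMillis; returns true if this call tripped the breaker. */
    boolean record(long nowMillis) {
        timestamps.addLast(nowMillis);
        evictExpired(nowMillis);
        // size() walks the deque (O(n)); acceptable for a small threshold.
        if (timestamps.size() >= threshold) {
            timestamps.clear();           // assumption: start a fresh window after a trip
            haltCount.incrementAndGet();
            return true;
        }
        return false;
    }

    /** Drops timestamps that are windowMillis or more older than nowMillis. */
    private void evictExpired(long nowMillis) {
        long cutoff = nowMillis - windowMillis;
        Long head;
        while ((head = timestamps.peekFirst()) != null && head <= cutoff) {
            // Remove by value: any entry equal to an expired head is itself expired,
            // so this stays correct even if another thread already removed the head.
            timestamps.removeFirstOccurrence(head);
        }
    }

    int size() { return timestamps.size(); }

    long haltCount() { return haltCount.get(); }

    public static void main(String[] args) {
        SlidingWindowSketch window = new SlidingWindowSketch(3, 1_000L);
        long now = System.currentTimeMillis();
        System.out.println(window.record(now));
        System.out.println(window.record(now + 1));
        System.out.println(window.record(now + 2));
        System.out.println("halts: " + window.haltCount());
    }
}
```

No thread ever waits on a lock here, which is why a structure like this suits a caller that must not block. The cost is that the size check and the clear are not atomic as a pair, so under heavy contention two threads could both observe the threshold being reached.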
Configuration (all read dynamically from ConfigurationProperties):
chaosAutoHaltEnabled: master switch (default false, which leaves the monitor inert)
chaosAutoHaltErrorThreshold: error count that triggers a halt (default 50)
chaosAutoHaltWindowMillis: sliding window length (default 60 000 ms)
The singleton instance is shared process-wide, consistent with
ServiceChaosRegistry's singleton pattern.
Method Summary
Modifier and Type            Method                         Description
int                          currentWindowSize()            Returns the number of error timestamps currently in the sliding window.
long                         getHaltCount()                 Returns the total number of times the auto-halt circuit-breaker has triggered since the process started (or since the last reset()).
static ChaosAutoHaltMonitor  getInstance()
void                         recordError(String faultType)  Record a chaos-injected fault and evaluate the circuit-breaker.
void                         reset()                        Reset the monitor state.
Method Details
getInstance
recordError
Record a chaos-injected fault and evaluate the circuit-breaker. Called after each chaos fault injection (from Metrics.incrementHttpChaosInjected).
Only destructive fault types ("error", "drop", "quota") contribute to the sliding window. Benign faults ("latency", "slow", "truncate", "malformed", "graphql") are ignored, so a latency-only experiment will never auto-halt.
When the feature is disabled (chaosAutoHaltEnabled is false), this method is a no-op: no timestamps are recorded and no evaluation occurs.
Parameters:
faultType - the fault type string (e.g. "error", "drop", "latency")
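The filtering rule described for recordError can be sketched as a small predicate. The class FaultTypeFilterSketch and the method countsTowardWindow are hypothetical names introduced for illustration; only the fault type strings and the "disabled means no-op" behavior come from the description above. Treating a null fault type as benign is an assumption.

```java
import java.util.Set;

/** Hypothetical sketch of the destructive-fault filter; not the actual implementation. */
final class FaultTypeFilterSketch {
    // Fault types the documentation lists as destructive.
    private static final Set<String> DESTRUCTIVE = Set.of("error", "drop", "quota");

    /**
     * Returns true only when the feature is enabled and the fault type is destructive.
     * A null fault type is treated as benign (assumption); the explicit null check is
     * needed because Set.of(...).contains(null) throws NullPointerException.
     */
    static boolean countsTowardWindow(boolean enabled, String faultType) {
        return enabled && faultType != null && DESTRUCTIVE.contains(faultType);
    }

    public static void main(String[] args) {
        System.out.println(countsTowardWindow(true, "error"));
        System.out.println(countsTowardWindow(true, "latency"));
        System.out.println(countsTowardWindow(false, "drop"));
    }
}
```

Checking the enabled flag first keeps the disabled path as cheap as possible, which fits a method that is called on every fault injection.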
getHaltCount
public long getHaltCount()
Returns the total number of times the auto-halt circuit-breaker has triggered since the process started (or since the last reset()).
currentWindowSize
public int currentWindowSize()
Returns the number of error timestamps currently in the sliding window.
reset
public void reset()
Reset the monitor state. Called on server reset and for test isolation.