Interface SystemStoreHealthChecker
All Superinterfaces: AutoCloseable
All Known Implementing Classes: HeartbeatBasedSystemStoreHealthChecker
Pluggable interface for checking the health of Venice system stores.
The default implementation (HeartbeatBasedSystemStoreHealthChecker) uses the existing heartbeat write+read
cycle. Alternative implementations (e.g., metrics-based) can be plugged in via the
controller.parent.system.store.health.check.override.class.name config.
Custom implementations must provide a public constructor with the following signature:
public MyHealthChecker(VeniceControllerMultiClusterConfig config)
so they can be instantiated reflectively by the controller.
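The constructor requirement above can be illustrated with a minimal sketch of reflective loading. This is not the controller's actual loading code. StubConfig, Checker, and the load helper are hypothetical stand-ins introduced here so the example compiles on its own. In a real deployment the constructor parameter would be VeniceControllerMultiClusterConfig, the interface would be SystemStoreHealthChecker, and the class name would come from the override config. The stand-in enum lists only the two values this page mentions.

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class ReflectiveLoadSketch {
  // Hypothetical stand-in for VeniceControllerMultiClusterConfig.
  public static class StubConfig {
  }

  // Stand-in for SystemStoreHealthChecker.HealthCheckResult (only the values named on this page).
  public enum HealthCheckResult {
    HEALTHY, UNHEALTHY
  }

  // Stand-in for the SystemStoreHealthChecker interface.
  public interface Checker extends AutoCloseable {
    Map<String, HealthCheckResult> checkHealth(String clusterName, Set<String> systemStoreNames);

    @Override
    default void close() {
    }
  }

  // A custom checker: public class, public single-argument constructor taking the config.
  public static class MyHealthChecker implements Checker {
    public MyHealthChecker(StubConfig config) {
      // A real implementation would read its settings from the config here.
    }

    @Override
    public Map<String, HealthCheckResult> checkHealth(String clusterName, Set<String> systemStoreNames) {
      Map<String, HealthCheckResult> results = new HashMap<>();
      for (String storeName : systemStoreNames) {
        results.put(storeName, HealthCheckResult.HEALTHY); // placeholder decision
      }
      return results;
    }
  }

  // Look up the public constructor by its parameter type and invoke it.
  // Fails with NoSuchMethodException if the constructor is missing or not public.
  public static Checker load(String className, StubConfig config) throws ReflectiveOperationException {
    Class<?> clazz = Class.forName(className);
    Constructor<?> constructor = clazz.getConstructor(StubConfig.class);
    return (Checker) constructor.newInstance(config);
  }

  public static void main(String[] args) throws Exception {
    try (Checker checker = load(MyHealthChecker.class.getName(), new StubConfig())) {
      System.out.println(checker.checkHealth("cluster0", Set.of("store_a")));
    }
  }
}
```

A constructor with a different parameter list, or one that is not public, would make the getConstructor lookup fail in this sketch, which is why the signature is part of the contract.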
Nested Class Summary
Nested Classes
Modifier and Type: static enum
Interface: SystemStoreHealthChecker.HealthCheckResult
Description: Result of a system store health check.
Method Summary
Modifier and Type: Map<String,SystemStoreHealthChecker.HealthCheckResult>
Method: checkHealth(String clusterName, Set<String> systemStoreNames)
Description: Check the health of the given system stores in the specified cluster.

Modifier and Type: default void
Method: close()
Method Details
checkHealth
Map<String,SystemStoreHealthChecker.HealthCheckResult> checkHealth(String clusterName, Set<String> systemStoreNames)
Check the health of the given system stores in the specified cluster.
Parameters:
clusterName - the Venice cluster name
systemStoreNames - the set of system store names to check
Returns:
a map from system store name to its health check result. Implementations should return an entry for
every store they were able to check. Missing entries (e.g., when the checker aborts early due to
leadership change or shutdown) are treated by the caller as "deferred to next round" — they are
neither marked HEALTHY nor UNHEALTHY for this round, so a partial result will not inflate unhealthy
counts. Implementations should therefore omit a store from the result map only when no decision was
reached for it; an explicit UNHEALTHY entry should be returned for stores that were checked and found
to be unhealthy.
This method is invoked on the repair service's single-threaded scheduler, so a call that blocks indefinitely will stall every subsequent repair round. Implementations must bound their own execution time and honor thread interruption (the service calls shutdownNow() on shutdown) rather than relying on the caller to time them out.
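One way to satisfy the time-bound and interruption requirements is sketched below, under stated assumptions. The budgetMillis parameter, the probe predicate, and the class name are hypothetical and not part of the Venice API, and the enum is a stand-in for SystemStoreHealthChecker.HealthCheckResult. The loop checks a deadline and the interrupt flag before each store and stops early when either trips, leaving the remaining stores out of the map so the caller can defer them to the next round. The sketch only bounds the time between probes, so each individual probe would still need its own timeout.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

public class BoundedCheckSketch {
  // Stand-in for SystemStoreHealthChecker.HealthCheckResult (only the values named on this page).
  public enum HealthCheckResult {
    HEALTHY, UNHEALTHY
  }

  /**
   * Checks stores until done, out of time, or interrupted.
   * probe is a hypothetical per-store check that returns true when the store is healthy.
   */
  public static Map<String, HealthCheckResult> checkHealth(
      Set<String> systemStoreNames, Predicate<String> probe, long budgetMillis) {
    long deadlineNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(budgetMillis);
    Map<String, HealthCheckResult> results = new LinkedHashMap<>();
    for (String storeName : systemStoreNames) {
      // isInterrupted() does not clear the flag, so the scheduler still sees the interrupt.
      boolean interrupted = Thread.currentThread().isInterrupted();
      boolean outOfTime = System.nanoTime() - deadlineNanos >= 0;
      if (interrupted || outOfTime) {
        break; // undecided stores are omitted, not reported as UNHEALTHY
      }
      // A store that was checked and found unhealthy gets an explicit entry.
      results.put(storeName, probe.test(storeName) ? HealthCheckResult.HEALTHY : HealthCheckResult.UNHEALTHY);
    }
    return results;
  }

  public static void main(String[] args) {
    Set<String> stores = Set.of("store_a", "store_b");
    Map<String, HealthCheckResult> results = checkHealth(stores, name -> !name.equals("store_b"), 1_000);
    for (String store : stores) {
      HealthCheckResult result = results.get(store);
      // A missing entry means no decision was reached for that store this round.
      System.out.println(store + " -> " + (result == null ? "deferred to next round" : result));
    }
  }
}
```

Breaking out of the loop, rather than filling in UNHEALTHY for the unchecked stores, matches the return contract above: a partial result should not inflate unhealthy counts.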
close
default void close()
Specified by:
close in interface AutoCloseable