Interface SystemStoreHealthChecker

All Superinterfaces:
AutoCloseable
All Known Implementing Classes:
HeartbeatBasedSystemStoreHealthChecker

public interface SystemStoreHealthChecker extends AutoCloseable
Pluggable interface for checking the health of Venice system stores. The default implementation (HeartbeatBasedSystemStoreHealthChecker) uses the existing heartbeat write+read cycle. Alternative implementations (e.g., metrics-based) can be plugged in via the controller.parent.system.store.health.check.override.class.name config.

Custom implementations must provide a public constructor with the following signature:


 public MyHealthChecker(VeniceControllerMultiClusterConfig config)
 
so they can be instantiated reflectively by the controller.
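The reflective instantiation described above can be sketched roughly as follows. This is a minimal illustration, not the controller's actual loading code: StandInConfig and StandInChecker are hypothetical stand-ins for VeniceControllerMultiClusterConfig and SystemStoreHealthChecker so the example compiles on its own, and ReflectiveLoader is an invented helper name. Only the standard java.lang.reflect calls are real.

```java
import java.lang.reflect.Constructor;

// Hypothetical stand-in for VeniceControllerMultiClusterConfig.
class StandInConfig {
  final String label;

  StandInConfig(String label) {
    this.label = label;
  }
}

// Hypothetical stand-in for SystemStoreHealthChecker (checkHealth omitted for brevity).
interface StandInChecker extends AutoCloseable {
  @Override
  default void close() {
  }
}

// A custom checker exposing the required public single-argument constructor.
class MyHealthChecker implements StandInChecker {
  final StandInConfig config;

  public MyHealthChecker(StandInConfig config) {
    this.config = config;
  }
}

// Invented helper showing how a class name from config could be turned into an instance.
class ReflectiveLoader {
  static StandInChecker load(String className, StandInConfig config) {
    try {
      Class<?> raw = Class.forName(className);
      // getConstructor only finds public constructors, hence the "public" requirement.
      Constructor<? extends StandInChecker> ctor =
          raw.asSubclass(StandInChecker.class).getConstructor(StandInConfig.class);
      return ctor.newInstance(config);
    } catch (ReflectiveOperationException e) {
      // Missing class, missing constructor, or a constructor that threw.
      throw new IllegalStateException("Cannot instantiate health checker " + className, e);
    }
  }
}
```

A loader of this shape fails if the constructor is missing or not public, which is why the signature above is mandatory for custom implementations.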
  • Method Details

    • checkHealth

      Map<String,SystemStoreHealthChecker.HealthCheckResult> checkHealth(String clusterName, Set<String> systemStoreNames)
      Check the health of the given system stores in the specified cluster.
      Parameters:
      clusterName - the Venice cluster name
      systemStoreNames - the set of system store names to check
      Returns:
      a map from system store name to its health check result. Implementations should return an entry for every store they were able to check, including an explicit UNHEALTHY entry for stores that were checked and found to be unhealthy. A store should be omitted from the map only when no decision was reached for it (e.g., when the checker aborts early due to leadership change or shutdown). The caller treats missing entries as "deferred to next round": they are marked neither HEALTHY nor UNHEALTHY for this round, so a partial result will not inflate unhealthy counts.
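The handling of missing entries can be illustrated with a small caller-side sketch. It assumes HealthCheckResult is an enum with at least HEALTHY and UNHEALTHY values (the real nested type may carry more), and RoundTally is a hypothetical helper, not part of the Venice API.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical caller-side helper that classifies one round of results.
class RoundTally {
  // Assumed shape of SystemStoreHealthChecker.HealthCheckResult; the real type may differ.
  enum HealthCheckResult { HEALTHY, UNHEALTHY }

  final Set<String> healthy = new LinkedHashSet<>();
  final Set<String> unhealthy = new LinkedHashSet<>();
  final Set<String> deferred = new LinkedHashSet<>();

  static RoundTally of(Set<String> requested, Map<String, HealthCheckResult> results) {
    RoundTally tally = new RoundTally();
    for (String store : requested) {
      HealthCheckResult result = results.get(store);
      if (result == null) {
        // No entry: no decision was reached, so the store waits for the next round.
        tally.deferred.add(store);
      } else if (result == HealthCheckResult.UNHEALTHY) {
        tally.unhealthy.add(store);
      } else {
        tally.healthy.add(store);
      }
    }
    return tally;
  }
}
```

With requested stores a, b, and c and a result map containing only a=HEALTHY and b=UNHEALTHY, store c lands in the deferred set and does not count toward the unhealthy total.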

      This method is invoked on the repair service's single-threaded scheduler, so a call that blocks indefinitely will stall every subsequent repair round. Implementations must bound their own execution time and honor thread interruption (the service calls shutdownNow() on shutdown) rather than relying on the caller to time them out.
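One way an implementation might bound its execution time and honor interruption is sketched below. The class name BoundedCheckSketch, the probe predicate, and the nanosecond budget parameter are all hypothetical, and HealthCheckResult is again assumed to be an enum with HEALTHY and UNHEALTHY values. The sketch only checks the deadline and the interrupt flag between stores, so each individual probe would still need its own timeout.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical skeleton of a time-bounded, interrupt-aware checkHealth loop.
class BoundedCheckSketch {
  // Assumed shape of SystemStoreHealthChecker.HealthCheckResult; the real type may differ.
  enum HealthCheckResult { HEALTHY, UNHEALTHY }

  static Map<String, HealthCheckResult> checkAll(
      Set<String> stores, Predicate<String> probe, long budgetNanos) {
    long deadline = System.nanoTime() + budgetNanos;
    Map<String, HealthCheckResult> results = new HashMap<>();
    for (String store : stores) {
      // isInterrupted() leaves the flag set, so the calling scheduler still sees the interrupt.
      boolean interrupted = Thread.currentThread().isInterrupted();
      boolean outOfTime = System.nanoTime() - deadline >= 0;
      if (interrupted || outOfTime) {
        // Remaining stores are omitted from the map, i.e. deferred to the next round.
        break;
      }
      results.put(store, probe.test(store) ? HealthCheckResult.HEALTHY : HealthCheckResult.UNHEALTHY);
    }
    return results;
  }
}
```

Returning the partial map instead of throwing keeps the decisions already reached for this round, while the omitted stores fall under the "deferred to next round" handling described for the return value.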

    • close

      default void close()
      Specified by:
      close in interface AutoCloseable