Package com.linkedin.davinci.storage
Class DiskHealthCheckService
java.lang.Object
com.linkedin.venice.service.AbstractVeniceService
com.linkedin.davinci.storage.DiskHealthCheckService
All Implemented Interfaces:
Closeable, AutoCloseable
DiskHealthCheckService wakes up every 10 seconds by default and runs a disk health check:
it writes 64KB of random data, reads it back, and verifies the content. If any error
occurs during this process, the in-memory state variable "diskHealthy" is set to false;
otherwise, "diskHealthy" remains true.
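The write, read back, and verify cycle described above can be sketched as follows. This is a minimal illustration, not the actual Venice implementation: the class name DiskProbeSketch, the probe method, and the temp-file naming are all hypothetical, and the real service may use different I/O calls and error handling.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch; names and structure are assumptions, not the Venice code.
class DiskProbeSketch {
    // 64KB of probe data, matching the size mentioned in the class description.
    static final int PROBE_SIZE_BYTES = 64 * 1024;

    /** Returns true if random data written under dir reads back unchanged. */
    static boolean probe(Path dir) {
        Path tmp = null;
        try {
            byte[] expected = new byte[PROBE_SIZE_BYTES];
            ThreadLocalRandom.current().nextBytes(expected);

            // Write the random bytes to a temporary file in the target directory.
            tmp = Files.createTempFile(dir, "disk-probe", ".tmp");
            Files.write(tmp, expected);

            // Read the bytes back and compare them with what was written.
            byte[] actual = Files.readAllBytes(tmp);
            return Arrays.equals(expected, actual);
        } catch (IOException | RuntimeException e) {
            // Any error during the cycle counts as an unhealthy result.
            return false;
        } finally {
            if (tmp != null) {
                try {
                    Files.deleteIfExists(tmp);
                } catch (IOException ignored) {
                    // Best-effort cleanup only.
                }
            }
        }
    }
}
```

A caller would run probe on a schedule and store the result in a flag such as "diskHealthy". Note that a read issued right after a write may be served from the operating system's page cache, so a sketch like this does not by itself guarantee that the data reached the physical device.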
If an SSD fails, a disk operation could hang forever. To report this kind of disk
failure, the health status polling API has a timeout mechanism. A total timeout is
decided up front:
totalTimeout = Math.max(30 seconds, health check interval + disk operation timeout)
The service tracks the last update time of the in-memory health status variable. If the
status has not been updated for longer than totalTimeout, the disk operation is assumed
to have hung due to disk failure, and the server starts reporting itself as unhealthy.
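The staleness rule above can be illustrated with a small sketch. It is written under stated assumptions: the class HealthStatusSketch and its methods are hypothetical names, and the clock is passed in explicitly as a millisecond value so the behavior is easy to follow; the real service may track time differently.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the polling-side timeout; not the Venice implementation.
class HealthStatusSketch {
    private final long totalTimeoutMs;
    private volatile boolean diskHealthy = true;
    private volatile long lastUpdateMs;

    HealthStatusSketch(long healthCheckIntervalMs, long diskOperationTimeoutMs, long nowMs) {
        this.totalTimeoutMs = totalTimeoutMs(healthCheckIntervalMs, diskOperationTimeoutMs);
        this.lastUpdateMs = nowMs;
    }

    /** totalTimeout = max(30 seconds, health check interval + disk operation timeout). */
    static long totalTimeoutMs(long healthCheckIntervalMs, long diskOperationTimeoutMs) {
        return Math.max(TimeUnit.SECONDS.toMillis(30),
                healthCheckIntervalMs + diskOperationTimeoutMs);
    }

    /** Called by the periodic check each time it completes, healthy or not. */
    void recordResult(boolean healthy, long nowMs) {
        diskHealthy = healthy;
        lastUpdateMs = nowMs;
    }

    /** Polling API: a status that has gone stale is treated as a hung disk. */
    boolean isHealthy(long nowMs) {
        if (nowMs - lastUpdateMs > totalTimeoutMs) {
            return false;
        }
        return diskHealthy;
    }
}
```

With a 10-second interval and a 10-second operation timeout, the sum is 20 seconds, so the 30-second floor applies. A poll made more than 30 seconds after the last recorded result returns unhealthy even if the last result was healthy.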
Nested Class Summary
Nested classes/interfaces inherited from class com.linkedin.venice.service.AbstractVeniceService
AbstractVeniceService.ServiceState
Field Summary
Fields inherited from class com.linkedin.venice.service.AbstractVeniceService
logger, serviceState
Constructor Summary
Constructors
DiskHealthCheckService(boolean serviceEnabled, long healthCheckIntervalMs, long diskOperationTimeoutMs, String databasePath, long diskFailServerShutdownTimeMs)
Method Summary
Modifier and Type    Method
boolean              isDiskHealthy()
boolean              startInner()
void                 stopInner()
Constructor Details
DiskHealthCheckService
public DiskHealthCheckService(boolean serviceEnabled, long healthCheckIntervalMs, long diskOperationTimeoutMs, String databasePath, long diskFailServerShutdownTimeMs)
Method Details
startInner
public boolean startInner()
Specified by:
startInner in class AbstractVeniceService
Returns:
true if the service is completely started,
false if it is still starting asynchronously (in this case, it is the implementer's
responsibility to set AbstractVeniceService.serviceState to
AbstractVeniceService.ServiceState.STARTED upon completion of the async work).
isDiskHealthy
public boolean isDiskHealthy()
getErrorMessage
stopInner
public void stopInner()
Specified by:
stopInner in class AbstractVeniceService