Class InstanceHealthMonitor
java.lang.Object
com.linkedin.venice.fastclient.meta.InstanceHealthMonitor
All Implemented Interfaces: Closeable, AutoCloseable
This class measures the health of the cluster that the store belongs to.
It is currently scoped per store, for the following reasons:
1. It simplifies the logic in this class, since there is no need to maintain per-store status such as quota-related responses.
2. It isolates the health decisions of different stores from one another, which reduces the impact of false signals.
This approach also has drawbacks: for example, misbehaving instances take longer to be discovered in each store.
This class uses the pending requests and the response status of each route to decide health:
1. For a good response, the pending request counter is reset when the response is received.
2. For an error response, the reset of the pending request counter is delayed, which effectively downgrades the instance.
3. When the pending request counter exceeds the predefined threshold, the instance is blocked completely.
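The three rules above can be sketched as follows. This is a minimal illustration, not the Venice implementation: the class name `PendingRequestHealthSketch`, the `blockThreshold` parameter, and the `delayer` hook (which stands in for whatever timer delays the reset) are all assumptions made for the example, and "reset" is modelled here as releasing one pending slot.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

/** Illustrative sketch only; names and structure are hypothetical. */
public class PendingRequestHealthSketch {
  private final Map<String, AtomicInteger> pending = new ConcurrentHashMap<>();
  private final int blockThreshold;         // hypothetical predefined threshold
  private final Consumer<Runnable> delayer; // hypothetical hook that runs a task later

  public PendingRequestHealthSketch(int blockThreshold, Consumer<Runnable> delayer) {
    this.blockThreshold = blockThreshold;
    this.delayer = delayer;
  }

  private AtomicInteger counter(String instance) {
    return pending.computeIfAbsent(instance, k -> new AtomicInteger());
  }

  /** Call before sending a request to the instance. */
  public void onRequestSent(String instance) {
    counter(instance).incrementAndGet();
  }

  /** Rule 1: release immediately on success. Rule 2: hand the release to the delayer on error. */
  public void onResponse(String instance, boolean success) {
    Runnable release = () -> counter(instance).updateAndGet(v -> Math.max(0, v - 1));
    if (success) {
      release.run();
    } else {
      delayer.accept(release);
    }
  }

  public int getPending(String instance) {
    return counter(instance).get();
  }

  /** Rule 3: blocked once the pending count exceeds the threshold. */
  public boolean isBlocked(String instance) {
    return getPending(instance) > blockThreshold;
  }
}
```

Because an error response keeps its slot occupied until the delayed task runs, an instance that keeps failing accumulates pending slots and crosses the threshold sooner than a healthy one under the same load.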
-
Constructor Summary
-
Method Summary
Modifier and Type, Method, Description
void close()
int getBlockedInstanceCount()
int getPendingRequestCounter(String instance)
int getUnhealthyInstanceCount()
boolean isInstanceBlocked(String instance)
If an instance is blocked, it is not considered for new requests until its requests are closed, either properly or by the timeoutFuture in trackHealthBasedOnRequestToInstance.
boolean isInstanceHealthy(String instance)
If an instance is marked unhealthy, it is retried continuously to find out whether it has come back up and started serving requests.
ChainedCompletableFuture<Integer,Integer> trackHealthBasedOnRequestToInstance(String instance)
ChainedCompletableFuture<Integer,Integer> trackHealthBasedOnRequestToInstance(String instance, CompletableFuture<TransportClientResponse> transportFuture)
This function tracks the health of an instance based on the requests sent to that instance, by returning an incomplete completable future for AbstractStoreMetadata.
-
Constructor Details
-
InstanceHealthMonitor
-
-
Method Details
-
getTimeoutProcessor
-
trackHealthBasedOnRequestToInstance
public ChainedCompletableFuture<Integer,Integer> trackHealthBasedOnRequestToInstance(String instance)
trackHealthBasedOnRequestToInstance
public ChainedCompletableFuture<Integer,Integer> trackHealthBasedOnRequestToInstance(String instance, CompletableFuture<TransportClientResponse> transportFuture)
This function tracks the health of an instance based on the requests sent to that instance. It returns an incomplete completable future for AbstractStoreMetadata, which:
1. Increments pendingRequestCounterMap for each server instance, per store. This happens inside this function, which is called before a get() request starts.
2. Decrements the above counters in whenComplete() of this completable future, once the response for the get() request is received.
This makes it possible to track the number of pending requests for each server instance.
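The increment-then-decrement pattern described here can be sketched with a plain `CompletableFuture`. This is a simplified illustration under stated assumptions: the class `PendingCounterTrackingSketch` and its `track` method are hypothetical, the response type is generic rather than `TransportClientResponse`, and the decrement is attached directly to the transport future instead of going through a `ChainedCompletableFuture` as the real method does.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative sketch only; not the Venice implementation. */
public class PendingCounterTrackingSketch {
  // Named after the map mentioned in the description above.
  private final Map<String, AtomicInteger> pendingRequestCounterMap = new ConcurrentHashMap<>();

  /** Hypothetical helper: count the request now, release it when the future completes. */
  public <T> CompletableFuture<T> track(String instance, CompletableFuture<T> transportFuture) {
    AtomicInteger counter =
        pendingRequestCounterMap.computeIfAbsent(instance, k -> new AtomicInteger());
    // Step 1: increment before the request is considered in flight.
    counter.incrementAndGet();
    // Step 2: decrement on completion, whether it succeeded or failed.
    return transportFuture.whenComplete((response, error) -> counter.decrementAndGet());
  }

  public int getPendingRequestCounter(String instance) {
    AtomicInteger counter = pendingRequestCounterMap.get(instance);
    return counter == null ? 0 : counter.get();
  }
}
```

In this sketch the counter is released on both normal and exceptional completion; the delayed release for error responses described in the class overview is deliberately left out to keep the example short.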
isInstanceHealthy
public boolean isInstanceHealthy(String instance)
If an instance is marked unhealthy, it is retried continuously to find out whether it has come back up and started serving requests. Note that such an instance eventually becomes blocked once it reaches the threshold for pendingRequestCounter, which provides a break from continuously sending requests to it.
isInstanceBlocked
public boolean isInstanceBlocked(String instance)
If an instance is blocked, it is not considered for new requests until its requests are closed, either properly or by the timeoutFuture in trackHealthBasedOnRequestToInstance.
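One way a caller might combine the two checks when choosing replicas is sketched below. It is an assumption about caller-side policy, not a description of how Venice routes requests: `ReplicaSelectionSketch`, `orderCandidates`, and the `HealthView` interface (a stand-in that mirrors the two method names above) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; the selection policy is an assumption. */
public class ReplicaSelectionSketch {
  /** Hypothetical stand-in mirroring two InstanceHealthMonitor methods. */
  public interface HealthView {
    boolean isInstanceHealthy(String instance);
    boolean isInstanceBlocked(String instance);
  }

  /** Drops blocked replicas and lists healthy ones ahead of unhealthy ones. */
  public static List<String> orderCandidates(List<String> replicas, HealthView health) {
    List<String> healthy = new ArrayList<>();
    List<String> unhealthy = new ArrayList<>();
    for (String replica : replicas) {
      if (health.isInstanceBlocked(replica)) {
        continue; // blocked: not considered for new requests
      }
      if (health.isInstanceHealthy(replica)) {
        healthy.add(replica);
      } else {
        unhealthy.add(replica); // still eligible, so it can be retried
      }
    }
    healthy.addAll(unhealthy);
    return healthy;
  }
}
```

Keeping unhealthy replicas in the candidate list reflects the statement that unhealthy instances keep being retried, while blocked ones are skipped entirely.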
getBlockedInstanceCount
public int getBlockedInstanceCount()
getUnhealthyInstanceCount
public int getUnhealthyInstanceCount()
getPendingRequestCounter
public int getPendingRequestCounter(String instance)
close
Specified by: close in interface AutoCloseable
Specified by: close in interface Closeable
Throws: IOException
-