Class HeartbeatMonitoringService
java.lang.Object
com.linkedin.venice.service.AbstractVeniceService
com.linkedin.davinci.stats.ingestion.heartbeat.HeartbeatMonitoringService
- All Implemented Interfaces:
Closeable, AutoCloseable
This service monitors heartbeats. Heartbeats are only monitored if lagMonitors are added for leader or follower
partitions. Once a lagMonitor is added, the service will begin emitting a metric which grows linearly with time,
only resetting to the timestamp of the last reported heartbeat for a given partition.
Heartbeats are only monitored for stores which have a hybrid config. All other registrations for lag monitoring
are ignored.
Max and Average are reported per version of resource across partitions.
If a heartbeat is invoked for a partition that we're NOT monitoring lag for, it is ignored.
This class will monitor lag for a partition as a leader or follower, but never both. Whether we're reporting
leader or follower depends on which monitor was set last.
Lag will stop being reported for partitions which have the monitor removed.
Each region gets a different lag monitor.
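The lag semantics described above can be sketched with a small toy model. This is not the Venice implementation: the class `ToyHeartbeatLagModel`, its `Role` enum, and all method names are hypothetical, and two behaviors are assumptions made for illustration (a newly added monitor starts counting from its registration time, and switching roles keeps the last recorded timestamp).

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the documented lag semantics; NOT the Venice implementation. */
final class ToyHeartbeatLagModel {
  enum Role { LEADER, FOLLOWER }

  /** Hypothetical per-partition state: current role and last heartbeat timestamp in ms. */
  private static final class Entry {
    Role role;
    long lastHeartbeatMs;

    Entry(Role role, long lastHeartbeatMs) {
      this.role = role;
      this.lastHeartbeatMs = lastHeartbeatMs;
    }
  }

  private final Map<Integer, Entry> monitored = new HashMap<>();

  /** Starts monitoring a partition, or switches its role; the role set last wins. */
  void addLagMonitor(int partition, Role role, long nowMs) {
    Entry entry = monitored.get(partition);
    if (entry == null) {
      // Assumption: lag starts counting from the moment the monitor is added.
      monitored.put(partition, new Entry(role, nowMs));
    } else {
      // Assumption: a role switch keeps the last recorded timestamp.
      entry.role = role;
    }
  }

  /** Stops reporting lag for the partition. */
  void removeLagMonitor(int partition) {
    monitored.remove(partition);
  }

  /** Returns true if the heartbeat was recorded, false if it was ignored. */
  boolean recordHeartbeat(int partition, Role role, long timestampMs) {
    Entry entry = monitored.get(partition);
    if (entry == null || entry.role != role) {
      return false; // unmonitored partition, or monitored under the other role
    }
    entry.lastHeartbeatMs = timestampMs;
    return true;
  }

  /** Lag grows linearly with time since the last heartbeat; -1 means not monitored. */
  long lagMs(int partition, long nowMs) {
    Entry entry = monitored.get(partition);
    return entry == null ? -1L : nowMs - entry.lastHeartbeatMs;
  }

  /** Current role of the partition, or null if it is not monitored. */
  Role roleOf(int partition) {
    Entry entry = monitored.get(partition);
    return entry == null ? null : entry.role;
  }
}
```

In this sketch a heartbeat for an unmonitored partition returns `false` and changes nothing, which mirrors the "it is ignored" rule stated above.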
-
Nested Class Summary
Nested classes/interfaces inherited from class com.linkedin.venice.service.AbstractVeniceService
AbstractVeniceService.ServiceState
-
Field Summary
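Modifier and Type / Field
static final int DEFAULT_LAG_LOGGING_THREAD_SLEEP_INTERVAL_SECONDS
static final int DEFAULT_REPORTER_THREAD_SLEEP_INTERVAL_SECONDS
static final long DEFAULT_STALE_HEARTBEAT_LOG_THRESHOLD_MILLIS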
Fields inherited from class com.linkedin.venice.service.AbstractVeniceService
logger, serviceState
-
Constructor Summary
Constructor
HeartbeatMonitoringService(io.tehuti.metrics.MetricsRepository metricsRepository, ReadOnlyStoreRepository metadataRepository, Set<String> regionNames, String localRegionName)
-
Method Summary
Modifier and Type / Method / Description
void addFollowerLagMonitor(Version version, int partition)
    Adds monitoring for a follower partition of a given version.
void addLeaderLagMonitor(Version version, int partition)
    Adds monitoring for a leader partition of a given version.
protected void checkAndMaybeLogHeartbeatDelay()
protected void checkAndMaybeLogHeartbeatDelayMap(Map<String, Map<Integer, Map<Integer, Map<String, HeartbeatTimeStampEntry>>>> heartbeatTimestamps)
Map<String,ReplicaHeartbeatInfo> getHeartbeatInfo(String versionTopicName, int partitionFilter, boolean filterLagReplica)
protected void record()
void recordFollowerHeartbeat(String store, int version, int partition, String region, Long timestamp, boolean isReadyToServe)
    Record a follower heartbeat timestamp for a given partition of a store version from a specific region.
protected void recordLags(Map<String, Map<Integer, Map<Integer, Map<String, HeartbeatTimeStampEntry>>>> heartbeatTimestamps, com.linkedin.davinci.stats.ingestion.heartbeat.HeartbeatMonitoringService.ReportLagFunction lagFunction)
void recordLeaderHeartbeat(String store, int version, int partition, String region, Long timestamp, boolean isReadyToServe)
    Record a leader heartbeat timestamp for a given partition of a store version from a specific region.
void removeLagMonitor(Version version, int partition)
    Removes monitoring for a partition of a given version.
boolean startInner()
void stopInner()
-
Field Details
-
DEFAULT_REPORTER_THREAD_SLEEP_INTERVAL_SECONDS
public static final int DEFAULT_REPORTER_THREAD_SLEEP_INTERVAL_SECONDS
-
DEFAULT_LAG_LOGGING_THREAD_SLEEP_INTERVAL_SECONDS
public static final int DEFAULT_LAG_LOGGING_THREAD_SLEEP_INTERVAL_SECONDS
-
DEFAULT_STALE_HEARTBEAT_LOG_THRESHOLD_MILLIS
public static final long DEFAULT_STALE_HEARTBEAT_LOG_THRESHOLD_MILLIS
-
-
Constructor Details
-
HeartbeatMonitoringService
public HeartbeatMonitoringService(io.tehuti.metrics.MetricsRepository metricsRepository, ReadOnlyStoreRepository metadataRepository, Set<String> regionNames, String localRegionName)
-
-
Method Details
-
addFollowerLagMonitor
public void addFollowerLagMonitor(Version version, int partition)
Adds monitoring for a follower partition of a given version. This request is ignored if the version isn't hybrid.
- Parameters:
version - the version to monitor lag for
partition - the partition to monitor lag for
-
addLeaderLagMonitor
public void addLeaderLagMonitor(Version version, int partition)
Adds monitoring for a leader partition of a given version. This request is ignored if the version isn't hybrid.
- Parameters:
version - the version to monitor lag for
partition - the partition to monitor lag for
-
removeLagMonitor
public void removeLagMonitor(Version version, int partition)
Removes monitoring for a partition of a given version.
- Parameters:
version - the version to remove monitoring for
partition - the partition to remove monitoring for
-
getHeartbeatInfo
public Map<String,ReplicaHeartbeatInfo> getHeartbeatInfo(String versionTopicName, int partitionFilter, boolean filterLagReplica)
-
startInner
- Specified by:
startInner in class AbstractVeniceService
- Returns:
true if the service is completely started, false if it is still starting asynchronously (in this case, it is the implementer's responsibility to set AbstractVeniceService.serviceState to AbstractVeniceService.ServiceState.STARTED upon completion of the async work).
- Throws:
Exception
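The return contract above (true when fully started, false when startup continues asynchronously) can be illustrated with a minimal sketch. `ToyService` and `ToyAsyncService` are hypothetical stand-ins written for this example, not `AbstractVeniceService` or any Venice class; the `warmUp` future is an assumed placeholder for whatever async work a real service would perform.

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical stand-in for a service base class; not AbstractVeniceService itself. */
abstract class ToyService {
  enum State { STOPPED, STARTING, STARTED }

  protected volatile State state = State.STOPPED;

  /** Mirrors the documented contract: true = fully started, false = still starting. */
  protected abstract boolean startInner() throws Exception;

  final void start() throws Exception {
    state = State.STARTING;
    if (startInner()) {
      state = State.STARTED;
    }
    // When startInner() returns false, the implementer sets STARTED later.
  }

  final State currentState() {
    return state;
  }
}

/** Implementation whose startup finishes asynchronously. */
final class ToyAsyncService extends ToyService {
  /** Assumed placeholder for asynchronous startup work. */
  final CompletableFuture<Void> warmUp = new CompletableFuture<>();

  @Override
  protected boolean startInner() {
    // The implementer is responsible for flipping the state once the work completes.
    warmUp.thenRun(() -> state = State.STARTED);
    return false;
  }
}
```

Until `warmUp` completes, the sketch stays in `STARTING`, which is the situation the documented contract asks implementers to resolve themselves.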
-
stopInner
- Specified by:
stopInner in class AbstractVeniceService
- Throws:
Exception
-
recordLeaderHeartbeat
public void recordLeaderHeartbeat(String store, int version, int partition, String region, Long timestamp, boolean isReadyToServe)
Record a leader heartbeat timestamp for a given partition of a store version from a specific region.
- Parameters:
store - the store this heartbeat is for
version - the version this heartbeat is for
partition - the partition this heartbeat is for
region - the region this heartbeat is from
timestamp - the time of this heartbeat
isReadyToServe - has this partition been marked ready to serve? This determines how the metric is reported
-
recordFollowerHeartbeat
public void recordFollowerHeartbeat(String store, int version, int partition, String region, Long timestamp, boolean isReadyToServe)
Record a follower heartbeat timestamp for a given partition of a store version from a specific region.
- Parameters:
store - the store this heartbeat is for
version - the version this heartbeat is for
partition - the partition this heartbeat is for
region - the region this heartbeat is from
timestamp - the time of this heartbeat
isReadyToServe - has this partition been marked ready to serve? This determines how the metric is reported
-
getLeaderHeartbeatTimeStamps
-
getFollowerHeartbeatTimeStamps
-
recordLags
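The class description states that max and average lag are reported per version across partitions, with a separate monitor per region. A minimal sketch of that aggregation follows. It is not the Venice implementation: `ToyLagAggregator` is hypothetical, plain `Long` timestamps stand in for `HeartbeatTimeStampEntry`, the nesting order store, version, partition, region is an assumption inferred from the heartbeat method parameters, and the `store_vN` result key is chosen only for illustration.

```java
import java.util.HashMap;
import java.util.LongSummaryStatistics;
import java.util.Map;

/** Toy aggregation of heartbeat lag; NOT the Venice implementation. */
final class ToyLagAggregator {
  /**
   * Computes lag statistics for one region, per store version, across partitions.
   * Assumed map layout: store -> version -> partition -> region -> last heartbeat (ms).
   */
  static Map<String, LongSummaryStatistics> lagPerStoreVersion(
      Map<String, Map<Integer, Map<Integer, Map<String, Long>>>> heartbeats,
      String region,
      long nowMs) {
    Map<String, LongSummaryStatistics> result = new HashMap<>();
    heartbeats.forEach((store, versions) -> versions.forEach((version, partitions) -> {
      LongSummaryStatistics stats = new LongSummaryStatistics();
      partitions.values().forEach(regions -> {
        Long lastHeartbeatMs = regions.get(region);
        if (lastHeartbeatMs != null) {
          // Lag is the time elapsed since the last heartbeat from this region.
          stats.accept(nowMs - lastHeartbeatMs);
        }
      });
      // Illustrative key only; getMax() and getAverage() give the two metrics.
      result.put(store + "_v" + version, stats);
    }));
    return result;
  }
}
```

`LongSummaryStatistics` is used here because it yields both the maximum and the average from a single pass over the partitions.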
-
record
protected void record()
-
checkAndMaybeLogHeartbeatDelayMap
-
checkAndMaybeLogHeartbeatDelay
protected void checkAndMaybeLogHeartbeatDelay()
-
getMaxLeaderHeartbeatLag
-
getMaxFollowerHeartbeatLag
-