Class HeartbeatMonitoringService

  • All Implemented Interfaces:
    java.io.Closeable, java.lang.AutoCloseable

    public class HeartbeatMonitoringService
    extends AbstractVeniceService
    This service monitors heartbeats. Heartbeats are only monitored if lagMonitors are added for leader or follower partitions. Once a lagMonitor is added, the service will being emitting a metric which grows linearly with time, only resetting to the timestamp of the last reported heartbeat for a given partition. Heartbeats are only monitored for stores which have a hybrid config. All other registrations for lag monitoring are ignored. Max and Average are reported per version of resource across partitions. If a heartbeat is invoked for a partition that we're NOT monitoring lag for, it is ignored. This class will monitor lag for a partition as a leader or follower, but never both. Whether we're reporting leader or follower depends on which monitor was set last. Lag will stop being reported for partitions which have the monitor removed. Each region gets a different lag monitor
    • Field Detail

      • DEFAULT_REPORTER_THREAD_SLEEP_INTERVAL_SECONDS

        public static final int DEFAULT_REPORTER_THREAD_SLEEP_INTERVAL_SECONDS
        See Also:
        Constant Field Values
      • DEFAULT_LAG_LOGGING_THREAD_SLEEP_INTERVAL_SECONDS

        public static final int DEFAULT_LAG_LOGGING_THREAD_SLEEP_INTERVAL_SECONDS
        See Also:
        Constant Field Values
      • DEFAULT_STALE_HEARTBEAT_LOG_THRESHOLD_MILLIS

        public static final long DEFAULT_STALE_HEARTBEAT_LOG_THRESHOLD_MILLIS
    • Constructor Detail

      • HeartbeatMonitoringService

        public HeartbeatMonitoringService​(io.tehuti.metrics.MetricsRepository metricsRepository,
                                          ReadOnlyStoreRepository metadataRepository,
                                          java.util.Set<java.lang.String> regionNames,
                                          java.lang.String localRegionName)
    • Method Detail

      • addFollowerLagMonitor

        public void addFollowerLagMonitor​(Version version,
                                          int partition)
        Adds monitoring for a follower partition of a given version. This request is ignored if the version isn't hybrid.
        Parameters:
        version - the version to monitor lag for
        partition - the partition to monitor lag for
      • addLeaderLagMonitor

        public void addLeaderLagMonitor​(Version version,
                                        int partition)
        Adds monitoring for a leader partition of a given version. This request is ignored if the version isn't hybrid.
        Parameters:
        version - the version to monitor lag for
        partition - the partition to monitor lag for
      • removeLagMonitor

        public void removeLagMonitor​(Version version,
                                     int partition)
        Removes monitoring for a partition of a given version.
        Parameters:
        version - the version to remove monitoring for
        partition - the partition to remove monitoring for
      • getHeartbeatInfo

        public java.util.Map<java.lang.String,​ReplicaHeartbeatInfo> getHeartbeatInfo​(java.lang.String versionTopicName,
                                                                                           int partitionFilter,
                                                                                           boolean filterLagReplica)
      • recordLeaderHeartbeat

        public void recordLeaderHeartbeat​(java.lang.String store,
                                          int version,
                                          int partition,
                                          java.lang.String region,
                                          java.lang.Long timestamp,
                                          boolean isReadyToServe)
        Record a leader heartbeat timestamp for a given partition of a store version from a specific region.
        Parameters:
        store - the store this heartbeat is for
        version - the version this heartbeat is for
        partition - the partition this heartbeat is for
        region - the region this heartbeat is from
        timestamp - the time of this heartbeat
        isReadyToServe - has this partition been marked ready to serve? This determines how the metric is reported
      • recordFollowerHeartbeat

        public void recordFollowerHeartbeat​(java.lang.String store,
                                            int version,
                                            int partition,
                                            java.lang.String region,
                                            java.lang.Long timestamp,
                                            boolean isReadyToServe)
        Record a follower heartbeat timestamp for a given partition of a store version from a specific region.
        Parameters:
        store - the store this heartbeat is for
        version - the version this heartbeat is for
        partition - the partition this heartbeat is for
        region - the region this heartbeat is from
        timestamp - the time of this heartbeat
        isReadyToServe - has this partition been marked ready to serve? This determines how the metric is reported
      • getLeaderHeartbeatTimeStamps

        protected java.util.Map<java.lang.String,​java.util.Map<java.lang.Integer,​java.util.Map<java.lang.Integer,​java.util.Map<java.lang.String,​HeartbeatTimeStampEntry>>>> getLeaderHeartbeatTimeStamps()
      • getFollowerHeartbeatTimeStamps

        protected java.util.Map<java.lang.String,​java.util.Map<java.lang.Integer,​java.util.Map<java.lang.Integer,​java.util.Map<java.lang.String,​HeartbeatTimeStampEntry>>>> getFollowerHeartbeatTimeStamps()
      • recordLags

        protected void recordLags​(java.util.Map<java.lang.String,​java.util.Map<java.lang.Integer,​java.util.Map<java.lang.Integer,​java.util.Map<java.lang.String,​HeartbeatTimeStampEntry>>>> heartbeatTimestamps,
                                  com.linkedin.davinci.stats.ingestion.heartbeat.HeartbeatMonitoringService.ReportLagFunction lagFunction)
      • record

        protected void record()
      • checkAndMaybeLogHeartbeatDelayMap

        protected void checkAndMaybeLogHeartbeatDelayMap​(java.util.Map<java.lang.String,​java.util.Map<java.lang.Integer,​java.util.Map<java.lang.Integer,​java.util.Map<java.lang.String,​HeartbeatTimeStampEntry>>>> heartbeatTimestamps)
      • checkAndMaybeLogHeartbeatDelay

        protected void checkAndMaybeLogHeartbeatDelay()