Class VeniceChangelogConsumerDaVinciRecordTransformerImpl<K,V>

java.lang.Object
com.linkedin.davinci.consumer.VeniceChangelogConsumerDaVinciRecordTransformerImpl<K,V>
All Implemented Interfaces:
StatefulVeniceChangelogConsumer<K,V>, VeniceChangelogConsumer<K,V>, AutoCloseable

public class VeniceChangelogConsumerDaVinciRecordTransformerImpl<K,V> extends Object implements StatefulVeniceChangelogConsumer<K,V>, VeniceChangelogConsumer<K,V>
  • Constructor Details

  • Method Details

    • start

      public CompletableFuture<Void> start(Set<Integer> partitions)
      Description copied from interface: StatefulVeniceChangelogConsumer
      Starts the consumer by subscribing to the specified partitions. On restart, the client automatically resumes from the last checkpoint. On fresh start, it begins from the beginning of the topic or leverages blob transfer if available. The returned future completes when there is at least one message available to be polled. Use StatefulVeniceChangelogConsumer.isCaughtUp() to determine when all subscribed partitions have caught up to the latest offset.
      Specified by:
      start in interface StatefulVeniceChangelogConsumer<K,V>
      Parameters:
      partitions - Set of partition IDs to subscribe to. Pass empty set to subscribe to all partitions.
      Returns:
      A future that completes when at least one message is available to poll.
    • start

      public CompletableFuture<Void> start()
      Description copied from interface: StatefulVeniceChangelogConsumer
      Subscribes to every partition for the Venice store. See StatefulVeniceChangelogConsumer.start(Set) for more information.
      Specified by:
      start in interface StatefulVeniceChangelogConsumer<K,V>
    • stop

      public void stop() throws Exception
      Specified by:
      stop in interface StatefulVeniceChangelogConsumer<K,V>
      Throws:
      Exception
    • getPartitionCount

      public int getPartitionCount()
      Specified by:
      getPartitionCount in interface VeniceChangelogConsumer<K,V>
      Returns:
      total number of store partitions
    • subscribe

      public CompletableFuture<Void> subscribe(Set<Integer> partitions)
      Description copied from interface: VeniceChangelogConsumer
      Subscribe a set of partitions for a store to this VeniceChangelogConsumer, with the earliest position in the topic for each partition. The VeniceChangelogConsumer should try to consume messages from all partitions that are subscribed to it.
      Specified by:
      subscribe in interface VeniceChangelogConsumer<K,V>
      Parameters:
      partitions - the set of partition to subscribe and consume
      Returns:
      a future which completes when data from the partitions are ready to be consumed
    • subscribeAll

      public CompletableFuture<Void> subscribeAll()
      Description copied from interface: VeniceChangelogConsumer
      Subscribe all partitions belonging to a specific store, with the earliest position in the topic for each partition.
      Specified by:
      subscribeAll in interface VeniceChangelogConsumer<K,V>
      Returns:
      a future which completes when all partitions are ready to be consumed data
    • unsubscribe

      public void unsubscribe(Set<Integer> partitions)
      Description copied from interface: VeniceChangelogConsumer
      Stop ingesting messages from a set of partitions for a specific store.
      Specified by:
      unsubscribe in interface VeniceChangelogConsumer<K,V>
      Parameters:
      partitions - The set of topic partitions to unsubscribe
    • unsubscribeAll

      public void unsubscribeAll()
      Description copied from interface: VeniceChangelogConsumer
      Stop ingesting messages from all partitions.
      Specified by:
      unsubscribeAll in interface VeniceChangelogConsumer<K,V>
    • seekToBeginningOfPush

      public CompletableFuture<Void> seekToBeginningOfPush(Set<Integer> partitions)
      Description copied from interface: VeniceChangelogConsumer
      Seek to the beginning of the push for a set of partitions. This is analogous to doing a bootstrap of data for the consumer. This seek will ONLY seek to the beginning of the version which is currently serving data, and the consumer will switch to reading data from a new version (should one get created) once it has read up to the point in the change capture stream that indicates the version swap (which can only occur after consuming all the data in the last push). This instructs the consumer to consume data from the batch push.
      Specified by:
      seekToBeginningOfPush in interface VeniceChangelogConsumer<K,V>
      Parameters:
      partitions - the set of partitions to seek with
      Returns:
      a future which completes when the operation has succeeded for all partitions.
    • seekToBeginningOfPush

      public CompletableFuture<Void> seekToBeginningOfPush()
      Description copied from interface: VeniceChangelogConsumer
      Seek to the beginning of the push for subscribed partitions. See VeniceChangelogConsumer.seekToBeginningOfPush(Set) for more information.
      Specified by:
      seekToBeginningOfPush in interface VeniceChangelogConsumer<K,V>
      Returns:
      a future which completes when the partitions are ready to be consumed data
    • seekToEndOfPush

      public CompletableFuture<Void> seekToEndOfPush(Set<Integer> partitions)
      Description copied from interface: VeniceChangelogConsumer
      Seek to the end of the last push for a given set of partitions. This instructs the consumer to begin consuming events which are transmitted to Venice following the last batch push.
      Specified by:
      seekToEndOfPush in interface VeniceChangelogConsumer<K,V>
      Parameters:
      partitions - the set of partitions to seek with
      Returns:
      a future which completes when the operation has succeeded for all partitions.
    • seekToEndOfPush

      public CompletableFuture<Void> seekToEndOfPush()
      Description copied from interface: VeniceChangelogConsumer
      Seek to the end of the push for all subscribed partitions. See VeniceChangelogConsumer.seekToEndOfPush(Set) for more information.
      Specified by:
      seekToEndOfPush in interface VeniceChangelogConsumer<K,V>
      Returns:
      a future which completes when the operation has succeeded for all partitions.
    • seekToTail

      public CompletableFuture<Void> seekToTail(Set<Integer> partitions)
      Description copied from interface: VeniceChangelogConsumer
      Seek to the end of events which have been transmitted to Venice and start consuming new events. This will ONLY consume events transmitted via nearline and incremental push. It will not read batch push data.
      Specified by:
      seekToTail in interface VeniceChangelogConsumer<K,V>
      Parameters:
      partitions - the set of partitions to seek with
      Returns:
      a future which completes when the operation has succeeded for all partitions.
    • seekToTail

      public CompletableFuture<Void> seekToTail()
      Description copied from interface: VeniceChangelogConsumer
      Seek to the end of events which have been transmitted to Venice for all subscribed partitions. See VeniceChangelogConsumer.seekToTail(Set) for more information.
      Specified by:
      seekToTail in interface VeniceChangelogConsumer<K,V>
      Returns:
      a future which completes when the operation has succeeded for all partitions.
    • seekToCheckpoint

      public CompletableFuture<Void> seekToCheckpoint(Set<VeniceChangeCoordinate> checkpoints)
      Description copied from interface: VeniceChangelogConsumer
      Seek the provided checkpoints for the specified partitions. Note about checkpoints: Checkpoints have the following properties and should be considered: - Checkpoints are NOT comparable or valid across partitions. - Checkpoints are NOT comparable or valid across regions - Checkpoints are NOT comparable across store versions - It is not possible to determine the number of events between two checkpoints - It is possible that a checkpoint is no longer on retention. In such case, we will return an exception to the caller.
      Specified by:
      seekToCheckpoint in interface VeniceChangelogConsumer<K,V>
      Returns:
      a future which completes when seek has completed for all partitions
    • seekToTimestamps

      public CompletableFuture<Void> seekToTimestamps(Map<Integer,Long> timestamps)
      Description copied from interface: VeniceChangelogConsumer
      Seek to the provided timestamps for the specified partitions based on wall clock time for when this message was processed by Venice and produced to change capture. Note, this API can only be used to seek on nearline data applied to the current serving version in Venice. This will not seek on data transmitted via Batch Push. If the provided timestamp is lower than the earliest timestamp on a given stream, the earliest event will be returned. THIS WILL NOT SEEK TO DATA WHICH WAS APPLIED ON A PREVIOUS VERSION. You should never seek back in time to a timestamp which is smaller than the current time - rewindTimeInSeconds configured in the hybrid settings for this Venice store. The timestamp passed to this function should be associated to timestamps processed by this interface. The timestamp returned by PubSubMessage.getPubSubMessageTime() refers to the time when Venice processed the event, and calls to this method will seek based on that sequence of events. Note: it bears no relation to timestamps provided by upstream producers when writing to Venice where a user may optionally provide a timestamp at time of producing a record.
      Specified by:
      seekToTimestamps in interface VeniceChangelogConsumer<K,V>
      Parameters:
      timestamps - a map keyed by a partition ID, and the timestamp checkpoints to seek for each partition.
      Returns:
    • seekToTimestamp

      public CompletableFuture<Void> seekToTimestamp(Long timestamp)
      Description copied from interface: VeniceChangelogConsumer
      Seek to the specified timestamp for all subscribed partitions. See VeniceChangelogConsumer.seekToTimestamps(Map) for more information.
      Specified by:
      seekToTimestamp in interface VeniceChangelogConsumer<K,V>
      Returns:
      a future which completes when the operation has succeeded for all partitions.
    • pause

      public void pause(Set<Integer> partitions)
      Description copied from interface: VeniceChangelogConsumer
      Pause the client on all or subset of partitions this client is subscribed to. Calls to VeniceChangelogConsumer.poll(long) will not return results from paused partitions until VeniceChangelogConsumer.resume(Set) is called again later for those partitions.
      Specified by:
      pause in interface VeniceChangelogConsumer<K,V>
    • pause

      public void pause()
      Description copied from interface: VeniceChangelogConsumer
      Pause the client on all subscriptions. See VeniceChangelogConsumer.pause(Set) for more information.
      Specified by:
      pause in interface VeniceChangelogConsumer<K,V>
    • resume

      public void resume(Set<Integer> partitions)
      Description copied from interface: VeniceChangelogConsumer
      Resume the client on all or a subset of partitions this client is subscribed to and has paused.
      Specified by:
      resume in interface VeniceChangelogConsumer<K,V>
    • resume

      public void resume()
      Description copied from interface: VeniceChangelogConsumer
      Pause the client on all subscriptions. See VeniceChangelogConsumer.resume(Set) for more information.
      Specified by:
      resume in interface VeniceChangelogConsumer<K,V>
    • close

      public void close()
      Description copied from interface: VeniceChangelogConsumer
      Release the internal resources.
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface VeniceChangelogConsumer<K,V>
    • poll

      public Collection<PubSubMessage<K,ChangeEvent<V>,VeniceChangeCoordinate>> poll(long timeoutInMs)
      Description copied from interface: StatefulVeniceChangelogConsumer
      Polls for the next batch of change events. The first records returned after calling StatefulVeniceChangelogConsumer.start(Set) will be from the local compacted state. Once the local state is fully consumed, subsequent calls will return real-time updates made to the Venice store. Records are returned in batches up to the configured MAX_BUFFER_SIZE. This method will return immediately if: 1. The buffer reaches MAX_BUFFER_SIZE before the timeout expires, OR 2. The timeout is reached NOTE: If the PubSubMessage came from disk (after restart), the following fields will be set to sentinel values since record metadata information is not persisted to reduce disk utilization: - PubSubMessageTime - Position
      Specified by:
      poll in interface StatefulVeniceChangelogConsumer<K,V>
      Specified by:
      poll in interface VeniceChangelogConsumer<K,V>
      Parameters:
      timeoutInMs - Maximum timeout of the poll invocation in milliseconds
      Returns:
      A collection of Venice PubSubMessages containing change events
    • isCaughtUp

      public boolean isCaughtUp()
      Description copied from interface: StatefulVeniceChangelogConsumer
      Indicates whether all subscribed partitions have caught up to the latest offset at the time of subscription. Once this becomes true, it will remain true even if the consumer begins to lag later on. This is for determining when the initial bootstrap phase has completed and the consumer has transitioned to consuming real-time events.
      Specified by:
      isCaughtUp in interface StatefulVeniceChangelogConsumer<K,V>
      Specified by:
      isCaughtUp in interface VeniceChangelogConsumer<K,V>
      Returns:
      True if all subscribed partitions have caught up to their target offsets.
    • getRecordTransformerConfig

      public DaVinciRecordTransformerConfig getRecordTransformerConfig()
    • getLastHeartbeatPerPartition

      public Map<Integer,Long> getLastHeartbeatPerPartition()
      Description copied from interface: StatefulVeniceChangelogConsumer
      Returns the timestamp of the last heartbeat received for each subscribed partition. Heartbeats are messages sent every minute by Venice servers to measure lag.
      Specified by:
      getLastHeartbeatPerPartition in interface StatefulVeniceChangelogConsumer<K,V>
      Specified by:
      getLastHeartbeatPerPartition in interface VeniceChangelogConsumer<K,V>
      Returns:
      a map of partition number to the timestamp, in milliseconds, of the last heartbeat received for that partition.
    • setBackgroundReporterThreadSleepIntervalSeconds

      protected void setBackgroundReporterThreadSleepIntervalSeconds(long interval)
    • clearPartitionState

      public void clearPartitionState(Set<Integer> partitions)
    • isStarted

      public boolean isStarted()
    • getSubscribedPartitions

      public Set<Integer> getSubscribedPartitions()
    • setStartTimeout

      public void setStartTimeout(long seconds)