Class SparkPubSubInputPartitionReader

java.lang.Object
com.linkedin.venice.spark.input.pubsub.SparkPubSubInputPartitionReader
All Implemented Interfaces:
Closeable, AutoCloseable, org.apache.spark.sql.connector.read.PartitionReader<org.apache.spark.sql.catalyst.InternalRow>

public class SparkPubSubInputPartitionReader extends Object implements org.apache.spark.sql.connector.read.PartitionReader<org.apache.spark.sql.catalyst.InternalRow>
A Spark SQL data source partition reader implementation for Venice PubSub messages.

This reader consumes messages from a specific partition of a PubSub topic between specified start and end offsets, converting them into Spark's InternalRow format. The reader provides functionality for:

  • Reading messages from a specific topic partition
  • Filtering control messages when configured
  • Tracking consumption progress

This class is part of the Venice Spark connector enabling ETL and KIF functionality.

  • Constructor Details

  • Method Details

    • next

      public boolean next() throws IOException
      Specified by:
      next in interface org.apache.spark.sql.connector.read.PartitionReader<org.apache.spark.sql.catalyst.InternalRow>
      Throws:
      IOException
    • get

      public org.apache.spark.sql.catalyst.InternalRow get()
      Specified by:
      get in interface org.apache.spark.sql.connector.read.PartitionReader<org.apache.spark.sql.catalyst.InternalRow>
    • logProgressPercent

      public float logProgressPercent()
    • close

      public void close() throws IOException
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
      Throws:
      IOException