Class AbstractPartitionWriter

  • All Implemented Interfaces:
    java.io.Closeable, java.lang.AutoCloseable
    Direct Known Subclasses:
    SparkPartitionWriter, VeniceReducer

    public abstract class AbstractPartitionWriter
    extends AbstractDataWriterTask
    implements java.io.Closeable
    An abstraction of the task that processes all key/value pairs, checks for duplicates and emits the final key/value pairs to Venice's PubSub.
    • Constructor Detail

      • AbstractPartitionWriter

        public AbstractPartitionWriter()
    • Method Detail

      • processValuesForKey

        public void processValuesForKey​(byte[] key,
                                        java.util.Iterator<byte[]> values,
                                        DataWriterTaskTracker dataWriterTaskTracker)
      • getDerivedValueSchemaId

        protected int getDerivedValueSchemaId()
      • isEnableWriteCompute

        protected boolean isEnableWriteCompute()
      • hasReportedFailure

        protected boolean hasReportedFailure​(DataWriterTaskTracker dataWriterTaskTracker,
                                             boolean isDuplicateKeyAllowed)
      • getExceedQuotaFlag

        protected boolean getExceedQuotaFlag()
      • close

        public void close()
                   throws java.io.IOException
        Specified by:
        close in interface java.lang.AutoCloseable
        Specified by:
        close in interface java.io.Closeable
        Throws:
        java.io.IOException
      • getTotalIncomingDataSizeInBytes

        protected long getTotalIncomingDataSizeInBytes()
        Return the size of serialized key and serialized value in bytes across the entire dataset. This is an optimization to skip writing the data to Kafka and reduce the load on Kafka and Venice storage nodes. Not all engines can support fetching this information during the execution of the job (eg Spark), but we can live with it for now. The quota is checked again in the Driver after the completion of the DataWriter job, and it will kill the VenicePushJob soon after.
        Returns:
        the size of serialized key and serialized value in bytes across the entire dataset
      • recordMessageErrored

        protected void recordMessageErrored​(java.lang.Exception e)
      • logMessageProgress

        protected void logMessageProgress()
      • setExceedQuota

        protected void setExceedQuota​(boolean exceedQuota)