Class VeniceKafkaInputMapper
java.lang.Object
com.linkedin.venice.hadoop.task.datawriter.AbstractDataWriterTask
com.linkedin.venice.hadoop.task.datawriter.AbstractInputRecordProcessor<INPUT_KEY,INPUT_VALUE>
com.linkedin.venice.hadoop.mapreduce.datawriter.map.AbstractVeniceMapper<KafkaInputMapperKey,KafkaInputMapperValue>
com.linkedin.venice.hadoop.input.kafka.VeniceKafkaInputMapper
All Implemented Interfaces:
Closeable, AutoCloseable, org.apache.hadoop.io.Closeable, org.apache.hadoop.mapred.JobConfigurable, org.apache.hadoop.mapred.Mapper<KafkaInputMapperKey, KafkaInputMapperValue, org.apache.hadoop.io.BytesWritable, org.apache.hadoop.io.BytesWritable>
public class VeniceKafkaInputMapper
extends AbstractVeniceMapper<KafkaInputMapperKey,KafkaInputMapperValue>
This class is designed specifically for KafkaInputFormat, and right now it performs a simple pass-through.
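For orientation, here is a minimal sketch of how a pass-through mapper like this could be wired into an old-API Hadoop job. The driver class and the exact job configuration that VenicePushJob performs are not documented on this page, so treat this as an illustration, not the actual wiring.

```java
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.mapred.JobConf;

import com.linkedin.venice.hadoop.input.kafka.KafkaInputFormat;
import com.linkedin.venice.hadoop.input.kafka.VeniceKafkaInputMapper;

public class KafkaInputJobSketch {
  public static JobConf buildJobConf() {
    JobConf conf = new JobConf(KafkaInputJobSketch.class);
    // KafkaInputFormat emits KafkaInputMapperKey/KafkaInputMapperValue pairs;
    // the mapper passes them through as serialized BytesWritable key/value
    // pairs (see the Mapper interface parameters above).
    conf.setInputFormat(KafkaInputFormat.class);
    conf.setMapperClass(VeniceKafkaInputMapper.class);
    conf.setMapOutputKeyClass(BytesWritable.class);
    conf.setMapOutputValueClass(BytesWritable.class);
    return conf;
  }
}
```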
Field Summary
Fields inherited from class com.linkedin.venice.hadoop.task.datawriter.AbstractInputRecordProcessor
EMPTY_BYTES, veniceRecordReader
Fields inherited from class com.linkedin.venice.hadoop.task.datawriter.AbstractDataWriterTask
TASK_ID_NOT_SET
Constructor Summary
Constructors
VeniceKafkaInputMapper()
Method Summary
Modifier and TypeMethodDescriptionvoidclose()protected voidconfigureTask(VeniceProperties props) Allow implementations of this class to configure task-specific stuff.protected FilterChain<KafkaInputMapperValue>getFilterChain(VeniceProperties props) getRecordReader(VeniceProperties props) A method for child classes to setupAbstractInputRecordProcessor.veniceRecordReader.protected booleanprocess(KafkaInputMapperKey inputKey, KafkaInputMapperValue inputValue, byte[] rmd, AtomicReference<byte[]> keyRef, AtomicReference<byte[]> valueRef, AtomicReference<byte[]> rmdRef, DataWriterTaskTracker dataWriterTaskTracker) This function compresses the record and checks whether its uncompressed size exceeds the maximum allowed size.Methods inherited from class com.linkedin.venice.hadoop.mapreduce.datawriter.map.AbstractVeniceMapper
configure, mapMethods inherited from class com.linkedin.venice.hadoop.task.datawriter.AbstractInputRecordProcessor
processRecord, readDictionaryFromKafkaMethods inherited from class com.linkedin.venice.hadoop.task.datawriter.AbstractDataWriterTask
configure, getEngineTaskConfigProvider, getPartitionCount, getTaskId, isChunkingEnabled, isRmdChunkingEnabled, setChunkingEnabled
Constructor Details
VeniceKafkaInputMapper
public VeniceKafkaInputMapper()

Method Details
getRecordReader
protected AbstractVeniceRecordReader<KafkaInputMapperKey,KafkaInputMapperValue> getRecordReader(VeniceProperties props)
Description copied from class: AbstractInputRecordProcessor
A method for child classes to set up AbstractInputRecordProcessor.veniceRecordReader.
Specified by:
getRecordReader in class AbstractInputRecordProcessor<KafkaInputMapperKey,KafkaInputMapperValue>
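To make the contract concrete, a hypothetical child class might supply its reader like this. MyRecordReader is invented for the example (imports omitted), since the concrete reader this class actually returns is not shown on this page.

```java
// Hypothetical subclass; "MyRecordReader" stands in for a concrete
// AbstractVeniceRecordReader implementation.
public class CustomReaderMapper extends VeniceKafkaInputMapper {
  @Override
  protected AbstractVeniceRecordReader<KafkaInputMapperKey, KafkaInputMapperValue> getRecordReader(
      VeniceProperties props) {
    // The returned reader becomes AbstractInputRecordProcessor.veniceRecordReader,
    // which is used to turn each input record into key/value bytes.
    return new MyRecordReader(props);
  }
}
```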
getFilterChain
protected FilterChain<KafkaInputMapperValue> getFilterChain(VeniceProperties props)
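FilterChain's own API is not documented on this page, so the following is only a conceptual stand-in for the chain-of-responsibility pattern the name implies; the class and method names below are invented for illustration and are not Venice's.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Simplified stand-in: each predicate decides whether to keep a record,
// and a record is dropped as soon as any filter in the chain rejects it.
final class SimpleFilterChain<T> {
  private final List<Predicate<T>> filters = new ArrayList<>();

  SimpleFilterChain<T> add(Predicate<T> keepIf) {
    filters.add(keepIf);
    return this;
  }

  /** Returns true when the record should be filtered out of the stream. */
  boolean shouldDrop(T record) {
    return filters.stream().anyMatch(keepIf -> !keepIf.test(record));
  }
}
```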
configureTask
protected void configureTask(VeniceProperties props)
Description copied from class: AbstractDataWriterTask
Allow implementations of this class to configure task-specific behavior.
Overrides:
configureTask in class AbstractInputRecordProcessor<KafkaInputMapperKey,KafkaInputMapperValue>
Parameters:
props - the job props that the task was configured with.
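As an illustration of this hook, a subclass could read its own settings from the supplied VeniceProperties. The property key below is made up for the example, and the getLong-with-default accessor is assumed to exist on VeniceProperties.

```java
// Hypothetical subclass showing the configureTask hook.
public class CustomConfigMapper extends VeniceKafkaInputMapper {
  private long thresholdBytes;

  @Override
  protected void configureTask(VeniceProperties props) {
    super.configureTask(props); // let the parent finish its own configuration first
    // "my.mapper.threshold.bytes" is an invented key, shown only to illustrate
    // pulling task-specific settings out of the job properties.
    this.thresholdBytes = props.getLong("my.mapper.threshold.bytes", 0L);
  }
}
```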
process
protected boolean process(KafkaInputMapperKey inputKey, KafkaInputMapperValue inputValue, byte[] rmd, AtomicReference<byte[]> keyRef, AtomicReference<byte[]> valueRef, AtomicReference<byte[]> rmdRef, DataWriterTaskTracker dataWriterTaskTracker)
Description copied from class: AbstractInputRecordProcessor
This function compresses the record and checks whether its uncompressed size exceeds the maximum allowed size. Regardless of the configuration, it tracks uncompressed record size violations in the DataWriterTaskTracker. If enableUncompressedMaxRecordSizeLimit is enabled, any record that exceeds the limit will be dropped from further processing.
The metrics collected by this function are exposed in the PushJobDetails system store. Downstream, the trackUncompressedRecordTooLargeFailure metric is used to verify that the job does not violate the maximum uncompressed record size constraint.
If trackUncompressedRecordTooLargeFailure is non-zero and enableUncompressedMaxRecordSizeLimit is enabled, the job will throw a VeniceException in VenicePushJob.runJobAndUpdateStatus(), using the output of VenicePushJob.updatePushJobDetailsWithJobDetails(DataWriterTaskTracker).
When enableUncompressedMaxRecordSizeLimit is enabled, none of these oversized records will be produced to Kafka in AbstractPartitionWriter#processValuesForKey(byte[], Iterator, Iterator, DataWriterTaskTracker).
Overrides:
process in class AbstractInputRecordProcessor<KafkaInputMapperKey,KafkaInputMapperValue>
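The branching described above boils down to the following sketch. All parameter names and helper methods here are simplified stand-ins paraphrased from the description, not the actual implementation.

```java
import java.util.concurrent.atomic.AtomicReference;

final class ProcessSketch {
  /**
   * Paraphrase of the documented process() contract: track every
   * uncompressed-size violation, and drop the record only when the limit
   * is enforced.
   */
  static boolean process(
      byte[] uncompressedValue,
      int maxUncompressedSize,
      boolean enableUncompressedMaxRecordSizeLimit,
      Runnable trackTooLargeFailure, // stands in for the DataWriterTaskTracker call
      AtomicReference<byte[]> valueRef) {
    if (uncompressedValue.length > maxUncompressedSize) {
      // Violations are always tracked, regardless of configuration; the count
      // surfaces downstream as the trackUncompressedRecordTooLargeFailure metric.
      trackTooLargeFailure.run();
      if (enableUncompressedMaxRecordSizeLimit) {
        return false; // drop the record from further processing
      }
    }
    valueRef.set(compress(uncompressedValue)); // compression step
    return true; // record continues through the pipeline
  }

  // Identity stand-in for the real compressor.
  private static byte[] compress(byte[] bytes) {
    return bytes;
  }
}
```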
close
public void close()
Specified by:
close in interface AutoCloseable
Specified by:
close in interface Closeable
Overrides:
close in class AbstractInputRecordProcessor<KafkaInputMapperKey,KafkaInputMapperValue>
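Since the class implements Closeable, callers driving it outside the Hadoop lifecycle (for instance, a unit test) can rely on try-with-resources; in a normal MapReduce run the framework invokes close() itself.

```java
// Hadoop calls close() at the end of a map task; when the mapper is driven
// manually, try-with-resources gives the same guarantee.
try (VeniceKafkaInputMapper mapper = new VeniceKafkaInputMapper()) {
  // configure and exercise the mapper here
}
```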