Class VeniceKafkaInputMapper
java.lang.Object
com.linkedin.venice.hadoop.task.datawriter.AbstractDataWriterTask
com.linkedin.venice.hadoop.task.datawriter.AbstractInputRecordProcessor<INPUT_KEY,INPUT_VALUE>
com.linkedin.venice.hadoop.mapreduce.datawriter.map.AbstractVeniceMapper<KafkaInputMapperKey,KafkaInputMapperValue>
com.linkedin.venice.hadoop.input.kafka.VeniceKafkaInputMapper
All Implemented Interfaces:
Closeable, AutoCloseable, org.apache.hadoop.io.Closeable, org.apache.hadoop.mapred.JobConfigurable, org.apache.hadoop.mapred.Mapper<KafkaInputMapperKey, KafkaInputMapperValue, org.apache.hadoop.io.BytesWritable, org.apache.hadoop.io.BytesWritable>
public class VeniceKafkaInputMapper
extends AbstractVeniceMapper<KafkaInputMapperKey,KafkaInputMapperValue>
This class is designed specifically for KafkaInputFormat, and right now it performs a simple pass-through.
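For orientation, here is a minimal sketch of how a pass-through mapper like this could be wired into an old-API Hadoop job. The driver class and the exact job configuration that VenicePushJob performs are not documented on this page, so treat this as an illustration, not the actual wiring.

```java
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.mapred.JobConf;

import com.linkedin.venice.hadoop.input.kafka.KafkaInputFormat;
import com.linkedin.venice.hadoop.input.kafka.VeniceKafkaInputMapper;

public class KafkaInputJobSketch {
  public static JobConf buildJobConf() {
    JobConf conf = new JobConf(KafkaInputJobSketch.class);
    // KafkaInputFormat emits KafkaInputMapperKey/KafkaInputMapperValue pairs;
    // the mapper passes them through as serialized BytesWritable key/value
    // pairs (see the Mapper interface parameters above).
    conf.setInputFormat(KafkaInputFormat.class);
    conf.setMapperClass(VeniceKafkaInputMapper.class);
    conf.setMapOutputKeyClass(BytesWritable.class);
    conf.setMapOutputValueClass(BytesWritable.class);
    return conf;
  }
}
```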
Field Summary
Fields inherited from class com.linkedin.venice.hadoop.task.datawriter.AbstractInputRecordProcessor
EMPTY_BYTES, veniceRecordReader
Fields inherited from class com.linkedin.venice.hadoop.task.datawriter.AbstractDataWriterTask
TASK_ID_NOT_SET
Constructor Summary
Constructors
VeniceKafkaInputMapper()
Method Summary
Modifier and TypeMethodDescriptionvoidclose()protected voidconfigureTask(VeniceProperties props) Allow implementations of this class to configure task-specific stuff.protected FilterChain<KafkaInputMapperValue>getFilterChain(VeniceProperties props) getRecordReader(VeniceProperties props) A method for child classes to setupAbstractInputRecordProcessor.veniceRecordReader.protected booleanprocess(KafkaInputMapperKey inputKey, KafkaInputMapperValue inputValue, byte[] rmd, AtomicReference<byte[]> keyRef, AtomicReference<byte[]> valueRef, AtomicReference<byte[]> rmdRef, DataWriterTaskTracker dataWriterTaskTracker) This function compresses the record and checks whether its uncompressed size exceeds the maximum allowed size.Methods inherited from class com.linkedin.venice.hadoop.mapreduce.datawriter.map.AbstractVeniceMapper
configure, mapMethods inherited from class com.linkedin.venice.hadoop.task.datawriter.AbstractInputRecordProcessor
processRecord, readDictionaryFromKafkaMethods inherited from class com.linkedin.venice.hadoop.task.datawriter.AbstractDataWriterTask
configure, getEngineTaskConfigProvider, getPartitionCount, getTaskId, isChunkingEnabled, isRmdChunkingEnabled, setChunkingEnabled
Constructor Details
VeniceKafkaInputMapper
public VeniceKafkaInputMapper()

Method Details
getRecordReader
protected AbstractVeniceRecordReader<KafkaInputMapperKey,KafkaInputMapperValue> getRecordReader(VeniceProperties props)
Description copied from class: AbstractInputRecordProcessor
A method for child classes to set up AbstractInputRecordProcessor.veniceRecordReader.
Specified by:
getRecordReader in class AbstractInputRecordProcessor<KafkaInputMapperKey,KafkaInputMapperValue>
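To make the contract concrete, a hypothetical child class might supply its reader like this. MyRecordReader is invented for the example (imports omitted), since the concrete reader this class actually returns is not shown on this page.

```java
// Hypothetical subclass; "MyRecordReader" stands in for a concrete
// AbstractVeniceRecordReader implementation.
public class CustomReaderMapper extends VeniceKafkaInputMapper {
  @Override
  protected AbstractVeniceRecordReader<KafkaInputMapperKey, KafkaInputMapperValue> getRecordReader(
      VeniceProperties props) {
    // The returned reader becomes AbstractInputRecordProcessor.veniceRecordReader,
    // which is used to turn each input record into key/value bytes.
    return new MyRecordReader(props);
  }
}
```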
getFilterChain
protected FilterChain<KafkaInputMapperValue> getFilterChain(VeniceProperties props)
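FilterChain's own API is not documented on this page, so the following is only a conceptual stand-in for the chain-of-responsibility pattern the name implies; the class and method names below are invented for illustration and are not Venice's.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Simplified stand-in: each predicate decides whether to keep a record,
// and a record is dropped as soon as any filter in the chain rejects it.
final class SimpleFilterChain<T> {
  private final List<Predicate<T>> filters = new ArrayList<>();

  SimpleFilterChain<T> add(Predicate<T> keepIf) {
    filters.add(keepIf);
    return this;
  }

  /** Returns true when the record should be filtered out of the stream. */
  boolean shouldDrop(T record) {
    return filters.stream().anyMatch(keepIf -> !keepIf.test(record));
  }
}
```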
configureTask
protected void configureTask(VeniceProperties props)
Description copied from class: AbstractDataWriterTask
Allow implementations of this class to configure task-specific behavior.
Overrides:
configureTask in class AbstractInputRecordProcessor<KafkaInputMapperKey,KafkaInputMapperValue>
Parameters:
props - the job props that the task was configured with.
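As an illustration of this hook, a subclass could read its own settings from the supplied VeniceProperties. The property key below is made up for the example, and the getLong-with-default accessor is assumed to exist on VeniceProperties.

```java
// Hypothetical subclass showing the configureTask hook.
public class CustomConfigMapper extends VeniceKafkaInputMapper {
  private long thresholdBytes;

  @Override
  protected void configureTask(VeniceProperties props) {
    super.configureTask(props); // let the parent finish its own configuration first
    // "my.mapper.threshold.bytes" is an invented key, shown only to illustrate
    // pulling task-specific settings out of the job properties.
    this.thresholdBytes = props.getLong("my.mapper.threshold.bytes", 0L);
  }
}
```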
process
protected boolean process(KafkaInputMapperKey inputKey, KafkaInputMapperValue inputValue, byte[] rmd, AtomicReference<byte[]> keyRef, AtomicReference<byte[]> valueRef, AtomicReference<byte[]> rmdRef, DataWriterTaskTracker dataWriterTaskTracker)
Description copied from class: AbstractInputRecordProcessor
This function compresses the record and checks whether its uncompressed size exceeds the maximum allowed size. Regardless of the configuration, it tracks uncompressed record size violations in the DataWriterTaskTracker. If enableUncompressedMaxRecordSizeLimit is enabled, any record that exceeds the limit will be dropped from further processing.
The metrics collected by this function are exposed in the PushJobDetails system store. Downstream, the trackUncompressedRecordTooLargeFailure metric is used to verify that the job does not violate the maximum uncompressed record size constraint.
If trackUncompressedRecordTooLargeFailure is non-zero and enableUncompressedMaxRecordSizeLimit is enabled, the job will throw a VeniceException in VenicePushJob.runJobAndUpdateStatus(), using the output of VenicePushJob.updatePushJobDetailsWithJobDetails(DataWriterTaskTracker).
When enableUncompressedMaxRecordSizeLimit is enabled, none of these oversized records will be produced to Kafka in AbstractPartitionWriter#processValuesForKey(byte[], Iterator, Iterator, DataWriterTaskTracker).
Overrides:
process in class AbstractInputRecordProcessor<KafkaInputMapperKey,KafkaInputMapperValue>
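The branching described above boils down to the following sketch. All parameter names and helper methods here are simplified stand-ins paraphrased from the description, not the actual implementation.

```java
import java.util.concurrent.atomic.AtomicReference;

final class ProcessSketch {
  /**
   * Paraphrase of the documented process() contract: track every
   * uncompressed-size violation, and drop the record only when the limit
   * is enforced.
   */
  static boolean process(
      byte[] uncompressedValue,
      int maxUncompressedSize,
      boolean enableUncompressedMaxRecordSizeLimit,
      Runnable trackTooLargeFailure, // stands in for the DataWriterTaskTracker call
      AtomicReference<byte[]> valueRef) {
    if (uncompressedValue.length > maxUncompressedSize) {
      // Violations are always tracked, regardless of configuration; the count
      // surfaces downstream as the trackUncompressedRecordTooLargeFailure metric.
      trackTooLargeFailure.run();
      if (enableUncompressedMaxRecordSizeLimit) {
        return false; // drop the record from further processing
      }
    }
    valueRef.set(compress(uncompressedValue)); // compression step
    return true; // record continues through the pipeline
  }

  // Identity stand-in for the real compressor.
  private static byte[] compress(byte[] bytes) {
    return bytes;
  }
}
```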
close
public void close()
Specified by:
close in interface AutoCloseable
Specified by:
close in interface Closeable
Overrides:
close in class AbstractInputRecordProcessor<KafkaInputMapperKey,KafkaInputMapperValue>
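Since the class implements Closeable, callers driving it outside the Hadoop lifecycle (for instance, a unit test) can rely on try-with-resources; in a normal MapReduce run the framework invokes close() itself.

```java
// Hadoop calls close() at the end of a map task; when the mapper is driven
// manually, try-with-resources gives the same guarantee.
try (VeniceKafkaInputMapper mapper = new VeniceKafkaInputMapper()) {
  // configure and exercise the mapper here
}
```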