Class KafkaInputRecordReader

  • All Implemented Interfaces:
    java.lang.AutoCloseable, org.apache.hadoop.mapred.RecordReader<KafkaInputMapperKey,​KafkaInputMapperValue>

    public class KafkaInputRecordReader
    extends java.lang.Object
    implements org.apache.hadoop.mapred.RecordReader<KafkaInputMapperKey,​KafkaInputMapperValue>, java.lang.AutoCloseable
    We borrowed some idea from the open-sourced attic-crunch lib: https://github.com/apache/attic-crunch/blob/master/crunch-kafka/src/main/java/org/apache/crunch/kafka/record/KafkaRecordReader.java This class is used to read data off a Kafka topic partition. It will return the key bytes as unchanged, and extract the following fields and wrap them up as KafkaInputMapperValue as the value: 1. Value bytes. 2. Schema Id. 3. Offset. 4. Value type, which could be 'PUT' or 'DELETE'.