com.linkedin.venice.hadoop.input.kafka (venice API)

package com.linkedin.venice.hadoop.input.kafka

Related Packages

Package

Description

com.linkedin.venice.hadoop.input.kafka.avro

com.linkedin.venice.hadoop.input.kafka.chunk

com.linkedin.venice.hadoop.input.kafka.ttl
Classes

Class

Description

KafkaInputDictTrainer

Zstd dict trainer for Kafka Repush.

KafkaInputDictTrainer.Param

KafkaInputDictTrainer.ParamBuilder

KafkaInputFormat

We borrowed some idea from the open-sourced attic-crunch lib: https://github.com/apache/attic-crunch/blob/master/crunch-kafka/src/main/java/org/apache/crunch/kafka/record/KafkaInputFormat.java This InputFormat implementation is used to read data off a Kafka topic.

KafkaInputFormatCombiner

This class is a Combiner, which is a functionality of the MR framework where we can plug a Reducer implementation to be executed within the Mapper task, on its output.

KafkaInputKeyComparator

This class is used to support secondary sorting for KafkaInput Repush.

KafkaInputMRPartitioner

This class is used for KafkaInput Repush, and it only considers the key part of the composed key (ignoring the offset).

KafkaInputRecordReader

We borrowed some idea from the open-sourced attic-crunch lib: https://github.com/apache/attic-crunch/blob/master/crunch-kafka/src/main/java/org/apache/crunch/kafka/record/KafkaRecordReader.java This class is used to read data off a Kafka topic partition.

KafkaInputSplit

We borrowed some idea from the open-sourced attic-crunch lib: https://github.com/apache/attic-crunch/blob/master/crunch-kafka/src/main/java/org/apache/crunch/kafka/record/KafkaInputSplit.java InputSplit that represent retrieving data from a single PubSubTopicPartition between the specified start and end offsets.

KafkaInputUtils

KafkaInputValueGroupingComparator

This class together with KafkaInputKeyComparator supports secondary sorting of KafkaInput Repush.

VeniceKafkaInputMapper

This class is designed specifically for KafkaInputFormat, and right now, it is doing simple pass-through.

VeniceKafkaInputReducer

This class is designed specifically for KafkaInputFormat, and right now, it will pick up the latest entry according to the associated offset, and produce it to Kafka.

Package com.linkedin.venice.hadoop.input.kafka