Package com.linkedin.venice.hadoop
Interface InputDataInfoProvider
- All Superinterfaces:
AutoCloseable, Closeable
- All Known Implementing Classes:
DefaultInputDataInfoProvider, KafkaInputDataInfoProvider
This interface lets users retrieve information about the input data.
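Because the interface extends Closeable, a provider can be managed with try-with-resources. The following is a minimal sketch of that pattern only. `SketchInputDataInfoProvider`, `FixedInfoProvider`, and `ProviderLifecycleSketch` are hypothetical stand-ins written for this example; they are not the Venice types, and they mirror just one method name from this page.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in: mirrors one method signature listed on this page.
// It is NOT the real com.linkedin.venice.hadoop.InputDataInfoProvider.
interface SketchInputDataInfoProvider extends Closeable {
    long getInputLastModificationTime(String inputUri) throws IOException;
}

// Toy implementation that returns a fixed value and records whether it was closed.
final class FixedInfoProvider implements SketchInputDataInfoProvider {
    private final long lastModified;
    private boolean closed;

    FixedInfoProvider(long lastModified) {
        this.lastModified = lastModified;
    }

    @Override
    public long getInputLastModificationTime(String inputUri) {
        return lastModified;
    }

    @Override
    public void close() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }
}

final class ProviderLifecycleSketch {
    // Queries the provider and closes it afterwards, even if the query throws.
    static long readLastModified(SketchInputDataInfoProvider provider, String inputUri) {
        try (SketchInputDataInfoProvider p = provider) {
            return p.getInputLastModificationTime(inputUri);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The same shape should apply to any implementation of the real interface, provided the caller owns the provider's lifecycle.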
-
Nested Class Summary
Modifier and Type: static class
Description: A POJO that contains input data information (schema information and input data file size)
-
Method Summary
Modifier and Type, Method, Description
org.apache.avro.Schema extractAvroSubSchema(org.apache.avro.Schema origin, String fieldName)
long getInputLastModificationTime(String inputUri)
void initZstdConfig(int numFiles)
static void loadZstdTrainingSamples(VeniceRecordIterator recordIterator, PushJobZstdConfig pushJobZstdConfig)
    This function loads training samples from recordReader abstraction for building the Zstd dictionary.
byte[] trainZstdDictionary()
validateInputAndGetInfo(String inputUri)
-
Method Details
-
validateInputAndGetInfo
- Throws:
Exception
-
initZstdConfig
void initZstdConfig(int numFiles)
-
loadZstdTrainingSamples
static void loadZstdTrainingSamples(VeniceRecordIterator recordIterator, PushJobZstdConfig pushJobZstdConfig)
This function loads training samples from recordReader abstraction for building the Zstd dictionary.
- Parameters:
recordIterator - The data accessor of input records.
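The general idea of loading training samples from a record iterator can be sketched as below. This is an illustration under stated assumptions, not the Venice implementation: it uses a plain `java.util.Iterator<byte[]>` in place of VeniceRecordIterator, a hypothetical `SampleCollector` in place of whatever PushJobZstdConfig holds, and it assumes sampling stops once a byte budget is reached. No Zstd library is called.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sample buffer with a byte budget (an assumption for this sketch).
final class SampleCollector {
    private final int maxBytes;
    private final List<byte[]> samples = new ArrayList<>();
    private int collectedBytes;

    SampleCollector(int maxBytes) {
        this.maxBytes = maxBytes;
    }

    // Stores the sample and returns true, or returns false if it would exceed the budget.
    boolean addSample(byte[] value) {
        if (collectedBytes + value.length > maxBytes) {
            return false;
        }
        samples.add(value);
        collectedBytes += value.length;
        return true;
    }

    int sampleCount() {
        return samples.size();
    }

    int collectedBytes() {
        return collectedBytes;
    }
}

final class ZstdSampleLoadingSketch {
    // Walks the records and feeds each non-null value to the collector
    // until the input is exhausted or the collector rejects a sample.
    static void loadSamples(Iterator<byte[]> records, SampleCollector collector) {
        while (records.hasNext()) {
            byte[] value = records.next();
            if (value == null) {
                continue; // skip records without a value
            }
            if (!collector.addSample(value)) {
                break; // budget reached, stop sampling
            }
        }
    }
}
```

In a real job the collected samples would then be handed to a dictionary trainer; that step is outside this sketch.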
-
trainZstdDictionary
byte[] trainZstdDictionary()
-
extractAvroSubSchema
-
getInputLastModificationTime
- Throws:
IOException
-