Package com.linkedin.venice.hadoop
Interface InputDataInfoProvider
- All Superinterfaces:
AutoCloseable, Closeable
- All Known Implementing Classes:
DefaultInputDataInfoProvider, KafkaInputDataInfoProvider
This interface lets users retrieve information about the input data.
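Because the interface extends Closeable, callers would typically hold a provider in a try-with-resources block. The following is a minimal sketch of that pattern, not Venice code: `LastModifiedSource`, `FixedSource`, and `changedSince` are hypothetical stand-ins that mirror only the documented `getInputLastModificationTime(String inputUri)` signature and `close()`, and the input URI is made up.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

class ProviderUsageSketch {
  /** Hypothetical stand-in mirroring two documented members of the interface. */
  interface LastModifiedSource extends Closeable {
    long getInputLastModificationTime(String inputUri) throws IOException;
  }

  /** Toy implementation: returns a fixed timestamp and records whether close() ran. */
  static class FixedSource implements LastModifiedSource {
    boolean closed = false;
    private final long timestamp;

    FixedSource(long timestamp) {
      this.timestamp = timestamp;
    }

    @Override
    public long getInputLastModificationTime(String inputUri) {
      return timestamp;
    }

    @Override
    public void close() {
      closed = true;
    }
  }

  /** Returns true when the input was modified after referenceTime. */
  static boolean changedSince(LastModifiedSource source, String inputUri, long referenceTime) {
    try {
      return source.getInputLastModificationTime(inputUri) > referenceTime;
    } catch (IOException e) {
      // Rethrow unchecked so callers are not forced to declare IOException.
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) throws IOException {
    FixedSource source = new FixedSource(2_000L);
    // try-with-resources closes the source even if the check throws.
    try (LastModifiedSource s = source) {
      System.out.println(changedSince(s, "hdfs://example/input", 1_000L));
    }
    System.out.println(source.closed);
  }
}
```

With a real provider, the same try-with-resources shape would apply to whichever implementing class is in use.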
-
Nested Class Summary
Nested Classes
Modifier and Type: static class
Description: A POJO that contains input data information (schema information and input data file size).
-
Method Summary
Modifier and Type, Method, Description
- org.apache.avro.Schema extractAvroSubSchema(org.apache.avro.Schema origin, String fieldName)
- long getInputLastModificationTime(String inputUri)
- initZstdConfig(int numFiles)
- static void loadZstdTrainingSamples(VeniceRecordIterator recordIterator, PushJobZstdConfig pushJobZstdConfig)
  This function loads training samples from the recordReader abstraction for building the Zstd dictionary.
- byte[] trainZstdDictionary()
- validateInputAndGetInfo(String inputUri)
-
Method Details
-
validateInputAndGetInfo
- Throws:
Exception
-
initZstdConfig
-
loadZstdTrainingSamples
static void loadZstdTrainingSamples(VeniceRecordIterator recordIterator, PushJobZstdConfig pushJobZstdConfig)
This function loads training samples from the recordReader abstraction for building the Zstd dictionary.
- Parameters:
recordIterator - The data accessor of input records.
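The general idea of pulling training samples from a record iterator can be sketched as below. This is an illustration under stated assumptions, not the Venice implementation: a plain `Iterator<byte[]>` stands in for VeniceRecordIterator, and the byte budget `maxBytes` and the helper name `collectSamples` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class SampleLoadingSketch {
  /**
   * Collects records from the iterator until adding the next one would exceed maxBytes.
   * The Iterator<byte[]> stand-in and the byte budget are assumptions for illustration.
   */
  static List<byte[]> collectSamples(Iterator<byte[]> records, int maxBytes) {
    List<byte[]> samples = new ArrayList<>();
    int used = 0;
    while (records.hasNext()) {
      byte[] record = records.next();
      if (used + record.length > maxBytes) {
        break; // budget reached, stop sampling
      }
      samples.add(record);
      used += record.length;
    }
    return samples;
  }

  public static void main(String[] args) {
    List<byte[]> input = Arrays.asList(new byte[4], new byte[4], new byte[4]);
    List<byte[]> samples = collectSamples(input.iterator(), 10);
    // Two 4-byte records fit in a 10-byte budget; the third would exceed it.
    System.out.println(samples.size());
  }
}
```

The collected samples would then be handed to a dictionary trainer; that step is omitted here because it depends on the compression library in use.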
-
trainZstdDictionary
byte[] trainZstdDictionary()
-
extractAvroSubSchema
-
getInputLastModificationTime
- Throws:
IOException
-