Package com.linkedin.venice.hadoop
Class DefaultInputDataInfoProvider
java.lang.Object
com.linkedin.venice.hadoop.DefaultInputDataInfoProvider
- All Implemented Interfaces:
InputDataInfoProvider, Closeable, AutoCloseable
-
Nested Class Summary
Nested classes/interfaces inherited from interface com.linkedin.venice.hadoop.InputDataInfoProvider
InputDataInfoProvider.InputDataInfo
-
Constructor Summary
Constructors
Constructor    Description
DefaultInputDataInfoProvider(PushJobSetting pushJobSetting, VeniceProperties props)
-
Method Summary
Modifier and Type                      Method                                                                       Description
void                                   close()
org.apache.avro.Schema                 extractAvroSubSchema(org.apache.avro.Schema origin, String fieldName)
long                                   getInputLastModificationTime(String inputUri)
                                       initZstdConfig(int numFiles)
byte[]                                 trainZstdDictionary()
InputDataInfoProvider.InputDataInfo    validateInputAndGetInfo(String inputUri)                                     1. Check whether it's Vson input or Avro input
-
Constructor Details
-
DefaultInputDataInfoProvider
-
-
Method Details
-
validateInputAndGetInfo
public InputDataInfoProvider.InputDataInfo validateInputAndGetInfo(String inputUri) throws Exception
1. Check whether the input is Vson or Avro;
2. Check schema consistency;
3. Populate the key schema and value schema;
4. Load samples for dictionary compression, if enabled.
- Specified by:
validateInputAndGetInfo in interface InputDataInfoProvider
- Parameters:
inputUri -
- Returns:
an InputDataInfoProvider.InputDataInfo that contains input data information
- Throws:
Exception
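The schema consistency check in step 2 can be illustrated with a minimal, self-contained sketch. It is not the Venice implementation: the class SchemaConsistencySketch, the method requireConsistentSchema, and the representation of each file's schema as a plain string are all hypothetical, chosen only to show the idea of rejecting an input directory whose files disagree on schema.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Venice.
public class SchemaConsistencySketch {

    /**
     * Returns the schema shared by every input file, or throws if any file's
     * schema differs from the first file's schema.
     *
     * @param schemaByFile file name mapped to that file's schema text (assumed representation)
     */
    public static String requireConsistentSchema(Map<String, String> schemaByFile) {
        if (schemaByFile.isEmpty()) {
            throw new IllegalArgumentException("No input files found");
        }
        Iterator<Map.Entry<String, String>> it = schemaByFile.entrySet().iterator();
        Map.Entry<String, String> first = it.next();
        while (it.hasNext()) {
            Map.Entry<String, String> next = it.next();
            // Every later file must match the first file's schema exactly.
            if (!first.getValue().equals(next.getValue())) {
                throw new IllegalStateException(
                    "Schema mismatch: " + next.getKey() + " differs from " + first.getKey());
            }
        }
        return first.getValue();
    }

    public static void main(String[] args) {
        Map<String, String> schemas = new LinkedHashMap<>();
        schemas.put("part-0.avro", "{\"type\":\"string\"}");
        schemas.put("part-1.avro", "{\"type\":\"string\"}");
        System.out.println(requireConsistentSchema(schemas));
    }
}
```

Comparing raw schema text is the simplest possible check; a real implementation would more likely compare parsed schema objects so that formatting differences do not cause false mismatches.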
-
initZstdConfig
- Specified by:
initZstdConfig in interface InputDataInfoProvider
-
trainZstdDictionary
public byte[] trainZstdDictionary()
- Specified by:
trainZstdDictionary in interface InputDataInfoProvider
-
extractAvroSubSchema
- Specified by:
extractAvroSubSchema in interface InputDataInfoProvider
-
getInputLastModificationTime
- Specified by:
getInputLastModificationTime in interface InputDataInfoProvider
- Throws:
IOException
-
close
public void close()
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
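Because the class implements Closeable and AutoCloseable, an instance can be managed with try-with-resources so that close() runs even if validation throws. The sketch below shows that pattern with a hypothetical stand-in, StubProvider, rather than the real class, so that it compiles without Venice on the classpath; the describe method is likewise an assumption made only for illustration.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; StubProvider is not part of Venice.
public class CloseableProviderSketch {

    /** Stand-in for a Closeable provider that records what was called on it. */
    public static class StubProvider implements Closeable {
        private final List<String> events;

        public StubProvider(List<String> events) {
            this.events = events;
        }

        /** Placeholder for a call such as validateInputAndGetInfo(inputUri). */
        public String describe(String inputUri) {
            events.add("describe");
            return "info for " + inputUri;
        }

        @Override
        public void close() {
            events.add("close");
        }
    }

    /** Uses the provider inside try-with-resources and returns the recorded call order. */
    public static List<String> run(String inputUri) {
        List<String> events = new ArrayList<>();
        try (StubProvider provider = new StubProvider(events)) {
            provider.describe(inputUri);
        } // close() is invoked automatically here, even if describe() had thrown
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run("hdfs://example/input"));
    }
}
```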
-