Class DaVinciRecordTransformer<K,V,O>

java.lang.Object
com.linkedin.davinci.client.DaVinciRecordTransformer<K,V,O>
Type Parameters:
K - the type of the input key
V - the type of the input value
O - the type of the output value
All Implemented Interfaces:
Closeable, AutoCloseable
Direct Known Subclasses:
BlockingDaVinciRecordTransformer, DuckDBDaVinciRecordTransformer

@Experimental public abstract class DaVinciRecordTransformer<K,V,O> extends Object implements Closeable
Extend this abstract class to transform records before they are stored in the Da Vinci Client, or to write them to a custom storage of your choice. The input is the record consumed from the raw Venice data set; the output is what is stored in Da Vinci's local storage (e.g. RocksDB). Implementations should be thread-safe and support schema evolution. Note: inputs are wrapped in Lazy so that deserialization costs are avoided when a key or value is not needed.
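The note about Lazy inputs can be illustrated with a minimal sketch. `LazySketch` and `LazySketchDemo` below are hypothetical stand-ins written for this example, not the Venice `Lazy` class, and the "deserializer" is simulated with a counter. The sketch only shows the general idea: a value that is never read is never deserialized.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical stand-in for a lazily evaluated wrapper; NOT the Venice Lazy class.
final class LazySketch<T> {
  private Supplier<T> supplier;
  private T value;
  private boolean resolved;

  private LazySketch(Supplier<T> supplier) {
    this.supplier = supplier;
  }

  static <T> LazySketch<T> of(Supplier<T> supplier) {
    return new LazySketch<>(supplier);
  }

  // Runs the supplier on the first call only, then returns the cached value.
  synchronized T get() {
    if (!resolved) {
      value = supplier.get();
      resolved = true;
      supplier = null; // the supplier is no longer needed once the value is cached
    }
    return value;
  }
}

final class LazySketchDemo {
  // Counts how many times the simulated deserializer actually ran.
  static final AtomicInteger DESERIALIZATIONS = new AtomicInteger();

  // Simulated deserialization: decodes bytes into a String and counts the call.
  static String deserialize(byte[] bytes) {
    DESERIALIZATIONS.incrementAndGet();
    return new String(bytes, StandardCharsets.UTF_8);
  }

  // Drops records whose key starts with "tmp_" without ever reading the value.
  static String keepOrDrop(LazySketch<String> key, LazySketch<String> value) {
    if (key.get().startsWith("tmp_")) {
      return null; // value.get() is never called, so the value is never deserialized
    }
    return value.get();
  }
}
```

In this sketch a filtered-out record costs one key deserialization and no value deserialization, which is the saving the note describes.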
  • Constructor Details

    • DaVinciRecordTransformer

      public DaVinciRecordTransformer(int storeVersion, org.apache.avro.Schema keySchema, org.apache.avro.Schema inputValueSchema, org.apache.avro.Schema outputValueSchema, boolean storeRecordsInDaVinci)
      Parameters:
      storeVersion - the version of the store
      keySchema - the key schema, which is immutable inside DaVinciClient. Users who store records in an external storage engine can modify the key, but they must manage this themselves.
      inputValueSchema - the value schema before transformation
      outputValueSchema - the value schema after transformation
      storeRecordsInDaVinci - set this to false if you intend to store records in a custom storage, and not in the Da Vinci Client
  • Method Details

    • transform

      public abstract DaVinciRecordTransformerResult<O> transform(Lazy<K> key, Lazy<V> value)
      Implement this method to transform records before they are stored. This can be useful for tasks such as filtering out unused fields to save storage space.
      Parameters:
      key - the key of the record to be transformed
      value - the value of the record to be transformed
      Returns:
      DaVinciRecordTransformerResult
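The field-filtering use case mentioned above can be sketched as a simple projection. This is an illustration only: a `Map<String, Object>` stands in for a deserialized record, and `FieldProjectionSketch`, `project`, and the field names are hypothetical. In a real transformer, the projected record would also need to conform to the outputValueSchema passed to the constructor, and it would be returned wrapped in a DaVinciRecordTransformerResult.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; a Map stands in for a deserialized record.
final class FieldProjectionSketch {
  // Returns a copy of the record containing only the fields named in fieldsToKeep.
  // The input record is not modified.
  static Map<String, Object> project(Map<String, Object> record, Set<String> fieldsToKeep) {
    Map<String, Object> out = new LinkedHashMap<>();
    for (Map.Entry<String, Object> field : record.entrySet()) {
      if (fieldsToKeep.contains(field.getKey())) {
        out.put(field.getKey(), field.getValue());
      }
    }
    return out;
  }
}
```

Because the helper copies rather than mutates, it holds no shared state, which fits the thread-safety expectation stated in the class description.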
    • processPut

      public abstract void processPut(Lazy<K> key, Lazy<O> value)
      Implement this method to manage custom state outside the Da Vinci Client.
      Parameters:
      key - the key of the record to be put
      value - the value of the record to be put, derived from the output of transform(Lazy key, Lazy value)
    • processDelete

      public void processDelete(Lazy<K> key)
      Override this method to customize the behavior for record deletions. For example, you can use this method to delete records from a custom storage outside the Da Vinci Client. By default, it performs no operation.
      Parameters:
      key - the key of the record to be deleted
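As a rough sketch of how processPut and processDelete might keep a custom storage in sync, the example below mirrors puts and deletes into an in-memory map. `ExternalStoreSketch` is hypothetical, a `ConcurrentHashMap` stands in for a real external database, and the methods take plain keys and values rather than Lazy wrappers to keep the example self-contained.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical external store; a ConcurrentHashMap stands in for a real database.
final class ExternalStoreSketch<K, O> {
  private final Map<K, O> rows = new ConcurrentHashMap<>();

  // Plays the role of processPut: inserts or overwrites the transformed value.
  void processPut(K key, O value) {
    rows.put(key, value);
  }

  // Plays the role of processDelete: removes the key; a missing key is a no-op.
  void processDelete(K key) {
    rows.remove(key);
  }

  // Read accessors used only to inspect the sketch's state.
  O get(K key) {
    return rows.get(key);
  }

  int size() {
    return rows.size();
  }
}
```

A concurrent map is used here because the class description asks implementations to be thread-safe; a real store would rely on its own concurrency guarantees.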
    • onStartVersionIngestion

      public void onStartVersionIngestion(boolean isCurrentVersion)
      Lifecycle event triggered before consuming records for storeVersion. Use this method to perform setup operations such as opening database connections or creating tables. By default, it performs no operation.
    • onEndVersionIngestion

      public void onEndVersionIngestion(int currentVersion)
      Lifecycle event triggered when record consumption is stopped for storeVersion. Use this method to perform cleanup operations such as closing database connections or dropping tables. By default, it performs no operation.
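The two lifecycle events can be pictured as a create/drop pair around each store version. The sketch below is an assumption-laden illustration: `VersionLifecycleSketch`, the shared `catalog` set, and the `records_v<N>` naming scheme are all hypothetical, and the documented method parameters (isCurrentVersion, currentVersion) are omitted because their semantics are not described here.

```java
import java.util.Set;

// Hypothetical per-version resource manager; a Set of names stands in for a database catalog.
final class VersionLifecycleSketch {
  private final int storeVersion;
  private final Set<String> catalog;

  VersionLifecycleSketch(int storeVersion, Set<String> catalog) {
    this.storeVersion = storeVersion;
    this.catalog = catalog;
  }

  // Hypothetical naming scheme: one table per store version.
  String tableName() {
    return "records_v" + storeVersion;
  }

  // Setup step: "creates" this version's table; adding twice has no further effect.
  void onStartVersionIngestion() {
    catalog.add(tableName());
  }

  // Cleanup step: "drops" this version's table and leaves other versions alone.
  void onEndVersionIngestion() {
    catalog.remove(tableName());
  }
}
```

Keeping resources keyed by store version means that, in this sketch, cleaning up one version does not disturb another version that is still being ingested.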
    • useUniformInputValueSchema

      public boolean useUniformInputValueSchema()
    • transformAndProcessPut

      public final DaVinciRecordTransformerResult<O> transformAndProcessPut(Lazy<K> key, Lazy<V> value)
      Transforms and processes the given record.
      Parameters:
      key - the key of the record to be put
      value - the value of the record to be put
      Returns:
      DaVinciRecordTransformerResult
    • prependSchemaIdToHeader

      public final ByteBuffer prependSchemaIdToHeader(O value, int schemaId, VeniceCompressor compressor)
      Serializes and compresses the value and prepends the schema ID to the resulting ByteBuffer.
      Parameters:
      value - the value to be serialized and compressed
      schemaId - the schema ID to prepend to the ByteBuffer
      compressor - the compressor used to compress the serialized value
      Returns:
      a ByteBuffer containing the schema ID followed by the serialized and compressed value
    • prependSchemaIdToHeader

      public final ByteBuffer prependSchemaIdToHeader(ByteBuffer valueBytes, int schemaId)
      Prepends the given schema ID to the provided ByteBuffer.
      Parameters:
      valueBytes - the original serialized and compressed value
      schemaId - to prepend to the ByteBuffer
      Returns:
      a ByteBuffer containing the schema ID followed by the serialized and compressed value
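A minimal sketch of the "schema ID followed by value bytes" layout is shown below, using only `java.nio.ByteBuffer`. It assumes the header is a single 4-byte int in ByteBuffer's default big-endian order; the actual header layout used by the library is not specified here. `SchemaIdHeaderSketch` and its methods are hypothetical names for illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical helper illustrating a schema-ID header; not the library's implementation.
final class SchemaIdHeaderSketch {
  // Assumption: the header is one 4-byte int in big-endian order (ByteBuffer's default).
  static final int HEADER_BYTES = Integer.BYTES;

  // Returns a new buffer holding schemaId followed by the remaining bytes of valueBytes.
  static ByteBuffer prepend(ByteBuffer valueBytes, int schemaId) {
    ByteBuffer source = valueBytes.duplicate(); // leaves the caller's position untouched
    ByteBuffer out = ByteBuffer.allocate(HEADER_BYTES + source.remaining());
    out.putInt(schemaId);
    out.put(source);
    out.flip(); // makes the buffer readable from the start
    return out;
  }

  // Reads the schema ID back without changing the buffer's position.
  static int readSchemaId(ByteBuffer withHeader) {
    return withHeader.getInt(withHeader.position());
  }

  // Returns a view of the payload bytes that follow the header.
  static ByteBuffer payload(ByteBuffer withHeader) {
    ByteBuffer view = withHeader.duplicate();
    view.position(view.position() + HEADER_BYTES);
    return view.slice();
  }
}
```

Storing the schema ID next to the bytes lets a reader pick the right schema for each record, which matters when records written under different schema versions sit side by side.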
    • getStoreVersion

      public final int getStoreVersion()
      Returns:
      storeVersion
    • getClassHash

      public final int getClassHash()
      Returns:
      the hash of the class bytecode
    • onRecovery

      public final void onRecovery(AbstractStorageEngine storageEngine, Integer partition, Lazy<VeniceCompressor> compressor)
      Bootstraps the client after it comes online.
    • getStoreRecordsInDaVinci

      public final boolean getStoreRecordsInDaVinci()
      Returns:
      storeRecordsInDaVinci
    • getKeySchema

      public final org.apache.avro.Schema getKeySchema()
      Returns the schema for the key used in DaVinciClient's operations.
      Returns:
      the Schema of the key
    • getInputValueSchema

      public final org.apache.avro.Schema getInputValueSchema()
      Returns the schema for the input value used in DaVinciClient's operations.
      Returns:
      the Schema of the input value, before transformation
    • getOutputValueSchema

      public final org.apache.avro.Schema getOutputValueSchema()
      Returns the schema for the output value used in DaVinciClient's operations.
      Returns:
      the Schema of the output value, after transformation
    • getRecordTransformerUtility

      public final DaVinciRecordTransformerUtility<K,O> getRecordTransformerUtility()
      Returns:
      recordTransformerUtility