Class InternalDaVinciRecordTransformer<K,V,O>

java.lang.Object
com.linkedin.davinci.client.DaVinciRecordTransformer<K,V,O>
com.linkedin.davinci.client.InternalDaVinciRecordTransformer<K,V,O>
Type Parameters:
K - type of the input key
V - type of the input value
O - type of the output value
All Implemented Interfaces:
Closeable, AutoCloseable

@Experimental public class InternalDaVinciRecordTransformer<K,V,O> extends DaVinciRecordTransformer<K,V,O>
This is an implementation of DaVinciRecordTransformer that is used internally inside the StoreIngestionTask.
  • Constructor Details

    • InternalDaVinciRecordTransformer

      public InternalDaVinciRecordTransformer(DaVinciRecordTransformer recordTransformer, org.apache.avro.Schema keySchema, org.apache.avro.Schema inputValueSchema, org.apache.avro.Schema outputValueSchema, InternalDaVinciRecordTransformerConfig internalRecordTransformerConfig)
  • Method Details

    • transform

      public DaVinciRecordTransformerResult<O> transform(Lazy<K> key, Lazy<V> value, int partitionId, DaVinciRecordTransformerRecordMetadata recordMetadata)
      Description copied from class: DaVinciRecordTransformer
      Callback to transform records before they are stored. This can be useful for tasks such as filtering out unused fields to save storage space. Result types: - SKIP: drop the record - UNCHANGED: keep the original value; useful when only callbacks/notifications are needed - TRANSFORMED: record has been transformed and should be persisted to disk
      Specified by:
      transform in class DaVinciRecordTransformer<K,V,O>
      Parameters:
      key - the key of the record to be transformed
      value - the value of the record to be transformed
      partitionId - what partition the record came from
      recordMetadata - returns DaVinciRecordTransformerRecordMetadata if enabled in DaVinciRecordTransformerConfig, null otherwise
      Returns:
      DaVinciRecordTransformerResult
    • processPut

      public void processPut(Lazy<K> key, Lazy<O> value, int partitionId, DaVinciRecordTransformerRecordMetadata recordMetadata)
      Description copied from class: DaVinciRecordTransformer
      Callback for put/update events (for example, write to an external system). Invoked after DaVinciRecordTransformer.transform(Lazy, Lazy, int, DaVinciRecordTransformerRecordMetadata) when the result is UNCHANGED or TRANSFORMED. Also invoked during startup/recovery when Da Vinci replays records persisted on disk to rehydrate external systems.
      Specified by:
      processPut in class DaVinciRecordTransformer<K,V,O>
      Parameters:
      key - the key of the record to be put
      value - the value of the record to be put, either the original value or the transformed value
      partitionId - what partition the record came from
      recordMetadata - returns DaVinciRecordTransformerRecordMetadata if enabled in DaVinciRecordTransformerConfig, null otherwise.
    • processDelete

      public void processDelete(Lazy<K> key, int partitionId, DaVinciRecordTransformerRecordMetadata recordMetadata)
      Description copied from class: DaVinciRecordTransformer
      Optional callback for delete events (for example, remove from an external system). By default, this is a no-op.
      Overrides:
      processDelete in class DaVinciRecordTransformer<K,V,O>
      Parameters:
      key - the key of the record to be deleted
      partitionId - what partition the record is being deleted from
      recordMetadata - returns DaVinciRecordTransformerRecordMetadata if enabled in DaVinciRecordTransformerConfig, null otherwise.
    • onStartVersionIngestion

      public void onStartVersionIngestion(boolean isCurrentVersion)
      Description copied from class: DaVinciRecordTransformer
      Callback invoked before consuming records for DaVinciRecordTransformer.storeVersion. Use this to open connections, create tables, or initialize resources. By default, it's a no-op.
      Overrides:
      onStartVersionIngestion in class DaVinciRecordTransformer<K,V,O>
      Parameters:
      isCurrentVersion - whether the version being started is the current serving version; if not it's a future version
    • onEndVersionIngestion

      public void onEndVersionIngestion(int currentVersion)
      Description copied from class: DaVinciRecordTransformer
      Callback invoked when record consumption stops for DaVinciRecordTransformer.storeVersion. Use this to close connections, drop tables, or release resources. This can be triggered when this DaVinciRecordTransformer.storeVersion is no longer the current serving version (for example, after a version swap), or when the application is shutting down. For batch stores, this callback may be invoked even when the application is not shutting down, with DaVinciRecordTransformer.storeVersion equaling the currentVersion parameter. This occurs because batch stores do not continuously receive updates after batch ingestion completes unlike hybrid stores, so the consumer resources are closed to free up resources once all data has been consumed. By default, it's a no-op.
      Overrides:
      onEndVersionIngestion in class DaVinciRecordTransformer<K,V,O>
      Parameters:
      currentVersion - the current serving version at the time this callback is invoked
    • useUniformInputValueSchema

      public boolean useUniformInputValueSchema()
      Description copied from class: DaVinciRecordTransformer
      Whether to deserialize input values using a single, uniform schema. If true (recommended only when you need a consistent field set), it uses the latest registered value schema at startup time to deserialize all records. This is useful when projecting into systems that expect a fixed schema (for example, relational tables). If false (default), each record is deserialized with its own writer schema. Note: This flag only applies to GenericRecord values. When using SpecificRecord for values via DaVinciRecordTransformerConfig.Builder.setOutputValueClass(Class), values will be deserialized based on the schema passed into DaVinciRecordTransformerConfig.Builder.setOutputValueSchema(org.apache.avro.Schema).
      Overrides:
      useUniformInputValueSchema in class DaVinciRecordTransformer<K,V,O>
    • onVersionSwap

      public void onVersionSwap(int currentVersion, int futureVersion, int partitionId)
      Lifecycle event triggered when a version swap is detected for partitionId It is used for DVRT CDC.
    • onHeartbeat

      public void onHeartbeat(int partitionId, long heartbeatTimestamp)
      Lifecycle event triggered when a heartbeat is detected for partitionId. It is used for DVRT CDC to record latest heartbeat timestamps per partition.
    • internalOnRecovery

      public void internalOnRecovery(StorageEngine storageEngine, int partitionId, InternalAvroSpecificSerializer<PartitionState> partitionStateSerializer, Lazy<VeniceCompressor> compressor, PubSubContext pubSubContext, Map<Integer,org.apache.avro.Schema> schemaIdToSchemaMap, ReadOnlySchemaRepository schemaRepository)
      Using a wrapper around onRecovery because when calculating the class hash it grabs the name of the current class that is invoking it. If we directly invoke onRecovery from this class, the class hash will be calculated based on the contents of InternalDaVinciRecordTransformer, not the user's implementation of DVRT. We also can't override onRecovery like the other methods because this method is final and the implementation should never be overridden.
    • getCountDownStartConsumptionLatchCount

      public long getCountDownStartConsumptionLatchCount()
    • countDownStartConsumptionLatch

      public void countDownStartConsumptionLatch()
      This method gets invoked when local RocksDB scan has completed for a partition.
    • close

      public void close() throws IOException
      Throws:
      IOException