Interface ExternalStorageWriter

All Superinterfaces:
AutoCloseable, Closeable

public interface ExternalStorageWriter extends Closeable
VPJ-side dual-write sink. The Venice Push Job invokes implementations of this interface alongside Kafka produce so that each record lands both in Venice and in a configured external storage system.

Lifecycle (one instance per Spark/MR partition writer task):

  1. configure(VeniceProperties, String, int) once, right after no-arg construction.
  2. batchPut(List) per buffered batch of records. The wrapper guarantees records inside a single batch are in the same order as they were submitted; ordering across batches is preserved as well.
  3. flush() once, before Closeable.close(), to force durability of any buffered records.
  4. Closeable.close() once.

Implementations are loaded reflectively by class name from the VPJ config key push.job.external.storage.writer.class. Must have a public no-arg constructor. Must be idempotent on key — Spark task retries can replay a partition.

batchPut is synchronous from the caller's point of view: it must throw on durable-write failure so the Spark task fails and is retried. Implementations may buffer internally as long as flush() drains.

Deletes are not part of this SPI. Dual-write to external storage targets batch-only stores from clean batch-push input, where every record carries a non-null value and no delete is ever produced (Venice's batch-push path only emits deletes for KIF repush with TTL tombstones, which is out of scope for dual-write). The DualWriteVeniceWriter that wraps this SPI throws UnsupportedOperationException from its delete overrides so a stray delete fails the Spark task loudly rather than silently leaving the external sink in a divergent state.

  • Method Summary

    Modifier and Type
    Method
    Description
    void
    Durable bulk-write of records.
    void
    configure(VeniceProperties jobProps, String topicName, int partitionId)
    Called once on the executor after no-arg construction.
    void
    Force durability of any buffered records.

    Methods inherited from interface java.io.Closeable

    close
  • Method Details

    • configure

      void configure(VeniceProperties jobProps, String topicName, int partitionId)
      Called once on the executor after no-arg construction. Implementations should read whatever job-level configuration they need from jobProps and prepare any per-task resources (connection pools, output paths, etc.).
    • batchPut

      void batchPut(List<ExternalStorageRecord> records)
      Durable bulk-write of records. The list may contain one or more records; an empty list is a no-op. Implementations should treat this as a synchronous write — return only after every record is durable, or throw.
    • flush

      void flush()
      Force durability of any buffered records. Called by the partition writer before Closeable.close().