Class DualWriteVeniceWriter

java.lang.Object
com.linkedin.venice.writer.AbstractVeniceWriter<byte[],byte[],byte[]>
com.linkedin.venice.hadoop.task.datawriter.DualWriteVeniceWriter
All Implemented Interfaces:
Closeable, AutoCloseable

public class DualWriteVeniceWriter extends AbstractVeniceWriter<byte[],byte[],byte[]>
Wraps a Kafka-backed AbstractVeniceWriter and an ExternalStorageWriter so each batch-push record is written to the external sink first and then produced to Kafka. The Kafka future is what the caller waits on for at-least-once semantics; external-sink durability is synchronous from the caller's point of view, so a failed external write throws before the Kafka produce starts and the Spark task is retried.

The writer buffers up to batchSize consecutive put records and flushes them as a single ExternalStorageWriter.batchPut(List) before the corresponding Kafka produces fire. batchSize = 1 (the default) disables buffering: every record is forwarded immediately as a one-element batch, matching pre-batching semantics. flush() and close() drain any pending buffer before doing their own work, so the producer's record ordering is preserved.

update and delete are not supported — batch pushes from clean input never call either. Both throw UnsupportedOperationException so a stray invocation fails the Spark task loudly rather than silently leaving the external sink and Venice in divergent states.