Package com.linkedin.venice.spark.chunk
Class SparkChunkAssembler
java.lang.Object
  com.linkedin.venice.spark.chunk.SparkChunkAssembler
- All Implemented Interfaces:
Serializable
Spark adapter for ChunkAssembler that handles chunked values and RMDs.
Converts Spark Rows to the format expected by ChunkAssembler, assembles chunks,
and returns the result as a Spark Row.
Constructor Summary

Constructors
SparkChunkAssembler(boolean isRmdChunkingEnabled)
SparkChunkAssembler(boolean isRmdChunkingEnabled, boolean isTTLFilteringEnabled, VeniceProperties filterProperties)
Method Summary

org.apache.spark.sql.Row assembleChunks(byte[] keyBytes, Iterator<org.apache.spark.sql.Row> rows)
Assemble chunks for a single key.
Constructor Details

SparkChunkAssembler
public SparkChunkAssembler(boolean isRmdChunkingEnabled)

SparkChunkAssembler
public SparkChunkAssembler(boolean isRmdChunkingEnabled, boolean isTTLFilteringEnabled, VeniceProperties filterProperties)
Method Details

assembleChunks
public org.apache.spark.sql.Row assembleChunks(byte[] keyBytes, Iterator<org.apache.spark.sql.Row> rows)

Assemble chunks for a single key. If TTL filtering is enabled, the assembled record is also filtered.

Parameters:
keyBytes - the key bytes
rows - iterator of rows for this key (MUST be sorted by offset DESC, highest offset first)
Returns:
the assembled row with the DEFAULT_SCHEMA_WITH_SCHEMA_ID schema, or null if the record is a DELETE, the chunks are incomplete, or the record is filtered out by TTL
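The ordering precondition above is the main contract a caller must honor: the per-key iterator handed to assembleChunks must be sorted by offset in descending order. A minimal sketch of preparing such an iterator, using plain Java collections instead of Spark Rows (the ChunkRow class and its fields are hypothetical stand-ins, not part of the Venice API):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Sketch of the ordering contract assembleChunks documents:
// rows for a single key must arrive sorted by offset DESC.
public class ChunkOrderingSketch {
    // Hypothetical stand-in for a Spark Row carrying an offset and a value chunk.
    static final class ChunkRow {
        final long offset;
        final byte[] value;
        ChunkRow(long offset, byte[] value) {
            this.offset = offset;
            this.value = value;
        }
    }

    // Sort rows for one key by offset DESC (highest offset first),
    // matching the precondition on assembleChunks.
    static List<ChunkRow> sortForAssembly(List<ChunkRow> rows) {
        List<ChunkRow> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparingLong((ChunkRow r) -> r.offset).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<ChunkRow> rows = Arrays.asList(
            new ChunkRow(5, new byte[]{1}),
            new ChunkRow(9, new byte[]{2}),
            new ChunkRow(7, new byte[]{3}));
        StringBuilder sb = new StringBuilder();
        for (ChunkRow r : sortForAssembly(rows)) {
            sb.append(r.offset).append(' ');
        }
        System.out.println(sb.toString().trim()); // prints "9 7 5"
    }
}
```

In a real Spark job this sorting would typically happen before the grouped rows reach the assembler, so that the iterator passed in already satisfies the offset-descending requirement.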