Class VenicePubSubMessageToRow
java.lang.Object
com.linkedin.venice.spark.input.pubsub.VenicePubSubMessageToRow
- All Implemented Interfaces:
PubSubMessageConverter
Converts a PubSub message to a Spark InternalRow.
It preserves the schema, replication metadata, and other necessary fields.
Constructor Summary
Constructors
VenicePubSubMessageToRow()
Method Summary
Modifier and Type | Method | Description
org.apache.spark.sql.catalyst.InternalRow | convert(@NotNull PubSubMessage<KafkaKey, KafkaMessageEnvelope, PubSubPosition> pubSubMessage, String region, int partitionNumber) | Converts a PubSub message to a Spark InternalRow.
static org.apache.spark.sql.catalyst.InternalRow | convertPubSubMessageToRow(@NotNull PubSubMessage<KafkaKey, KafkaMessageEnvelope, PubSubPosition> pubSubMessage, String region, int partitionNumber) | Static factory method to maintain backward compatibility.
Constructor Details
VenicePubSubMessageToRow
public VenicePubSubMessageToRow()
Method Details
convertPubSubMessageToRow
public static org.apache.spark.sql.catalyst.InternalRow convertPubSubMessageToRow(@NotNull PubSubMessage<KafkaKey, KafkaMessageEnvelope, PubSubPosition> pubSubMessage, String region, int partitionNumber)
Static factory method to maintain backward compatibility.
convert
public org.apache.spark.sql.catalyst.InternalRow convert(@NotNull PubSubMessage<KafkaKey, KafkaMessageEnvelope, PubSubPosition> pubSubMessage, String region, int partitionNumber)
Converts a PubSub message to a Spark InternalRow.
Specified by:
convert in interface PubSubMessageConverter
Parameters:
pubSubMessage - The PubSub message to process. Contains key, value, and metadata.
region - The region identifier to include in the row.
partitionNumber - The partition number to include in the row.
Returns:
An InternalRow containing the processed message data.
The row includes the following fields:
1. Region (String)
2. Partition number (int)
3. Message type (int)
4. Offset (long)
5. Schema ID (int)
6. Key bytes (byte[])
7. Value bytes (byte[])
8. Replication metadata payload bytes (byte[])
9. Replication metadata version ID (int)
See SparkConstants.RAW_PUBSUB_INPUT_TABLE_SCHEMA for the schema definition.
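The field order listed above determines the zero-based column index a consumer would use when reading the row by ordinal. The following is a minimal sketch of that layout using only the Java standard library. It is not Venice or Spark code: the class RawPubSubRowLayoutSketch, the enum Field, and the helpers buildRow and get are hypothetical names introduced for illustration, and a plain Object[] stands in for InternalRow. Note also that a real Spark InternalRow stores string columns as UTF8String rather than java.lang.String, so the types here follow the list above rather than Spark's internal representation.

```java
/**
 * Hypothetical stand-in for the documented nine-field row layout.
 * Not part of Venice or Spark; an Object[] plays the role of InternalRow.
 */
class RawPubSubRowLayoutSketch {

  /** Fields in the documented order; ordinal() is the zero-based column index. */
  enum Field {
    REGION(String.class),
    PARTITION_NUMBER(Integer.class),
    MESSAGE_TYPE(Integer.class),
    OFFSET(Long.class),
    SCHEMA_ID(Integer.class),
    KEY(byte[].class),
    VALUE(byte[].class),
    REPLICATION_METADATA_PAYLOAD(byte[].class),
    REPLICATION_METADATA_VERSION_ID(Integer.class);

    final Class<?> javaType;

    Field(Class<?> javaType) {
      this.javaType = javaType;
    }
  }

  /** Packs the values into an array whose positions match the Field ordinals. */
  static Object[] buildRow(String region, int partitionNumber, int messageType, long offset,
      int schemaId, byte[] key, byte[] value, byte[] rmdPayload, int rmdVersionId) {
    return new Object[] {
        region, partitionNumber, messageType, offset, schemaId,
        key, value, rmdPayload, rmdVersionId
    };
  }

  /** Reads one column by field, casting to the expected Java type. */
  static <T> T get(Object[] row, Field field, Class<T> type) {
    return type.cast(row[field.ordinal()]);
  }

  public static void main(String[] args) {
    // Illustrative values only; they do not come from a real PubSub message.
    Object[] row = buildRow("dc-0", 3, 0, 42L, 1,
        new byte[] {1, 2}, new byte[] {3, 4}, new byte[0], -1);

    for (Field field : Field.values()) {
      System.out.println(field.ordinal() + " " + field + " (" + field.javaType.getSimpleName() + ")");
    }
    System.out.println("region = " + get(row, Field.REGION, String.class));
    System.out.println("offset = " + get(row, Field.OFFSET, Long.class));
  }
}
```

Keeping the positions in a single enum means a reader never hard-codes a column number. For the authoritative column names and Spark data types, refer to SparkConstants.RAW_PUBSUB_INPUT_TABLE_SCHEMA.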