Class DataWriterSparkJob

    • Constructor Detail

      • DataWriterSparkJob

        public DataWriterSparkJob()

    • Method Detail

      • getUserInputDataFrame

        protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> getUserInputDataFrame()
        Description copied from class: AbstractDataWriterSparkJob
        Gets the data frame built from the user's input data. The schema of each Row must satisfy the following constraints:
        • Must contain a field "key" of type DataTypes.BinaryType. This is the key of the record, serialized as Avro.
        • Must contain a field "value" of type DataTypes.BinaryType. This is the value of the record, serialized as Avro.
        • Must not contain fields whose names begin with "_". These names are reserved for internal use.
        • May contain any other fields that do not violate the constraints above.
        Specified by:
        getUserInputDataFrame in class AbstractDataWriterSparkJob
        Returns:
        The data frame based on the user's input data
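
The schema constraints listed for getUserInputDataFrame can be sketched as a small standalone check. This is only an illustration under stated assumptions: InputSchemaCheck, violations, and the BINARY constant are hypothetical names that are not part of this API, and the schema is modelled as a plain map from field name to type name so that the sketch runs without Spark on the classpath. A real check would presumably inspect the data frame's actual schema instead.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the documented API. */
public class InputSchemaCheck {
    // Assumption: a plain string stands in for DataTypes.BinaryType.
    static final String BINARY = "binary";

    /** Returns the constraint violations found; an empty list means the schema is acceptable. */
    static List<String> violations(Map<String, String> fields) {
        List<String> problems = new ArrayList<>();

        // "key" and "value" must both be present and binary.
        for (String required : new String[] {"key", "value"}) {
            String type = fields.get(required);
            if (type == null) {
                problems.add("missing field \"" + required + "\"");
            } else if (!BINARY.equals(type)) {
                problems.add("field \"" + required + "\" must be binary, found " + type);
            }
        }

        // Names beginning with "_" are reserved for internal use.
        for (String name : fields.keySet()) {
            if (name.startsWith("_")) {
                problems.add("reserved field name \"" + name + "\"");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        Map<String, String> schema = new LinkedHashMap<>();
        schema.put("key", BINARY);
        schema.put("value", BINARY);
        schema.put("region", "string"); // extra fields are allowed
        System.out.println(violations(schema)); // prints []

        schema.put("_offset", "long"); // violates the reserved-prefix rule
        System.out.println(violations(schema)); // prints [reserved field name "_offset"]
    }
}
```

Returning a list of violations rather than throwing on the first one is a design choice made for this sketch, so that a caller can report every problem with a schema at once.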