Class AbstractDataWriterSparkJob

    • Constructor Detail

      • AbstractDataWriterSparkJob

        public AbstractDataWriterSparkJob()
    • Method Detail

      • getSparkSession

        protected org.apache.spark.sql.SparkSession getSparkSession()
      • getUserInputDataFrame

        protected abstract org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> getUserInputDataFrame()
        Get the data frame based on the user's input data. The schema of the Row has the following constraints:
        • Must contain a field "key" with the schema: DataTypes.BinaryType. This is the key of the record represented in serialized Avro.
        • Must contain a field "value" with the schema: DataTypes.BinaryType. This is the value of the record represented in serialized Avro.
        • Must not contain fields with names beginning with "_". These are reserved for internal use.
        • May contain any other fields that do not violate the constraints above.
        Returns:
        The data frame based on the user's input data
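        The schema constraints listed for getUserInputDataFrame can be sketched as a small standalone check. This is only an illustration, not part of the documented API: the class InputSchemaCheck, the method violations, and the use of a plain field-name-to-type-name map (with the string "binary" standing in for DataTypes.BinaryType) are all assumptions made so the example runs without Spark on the classpath. A real subclass would presumably inspect the schema of the Dataset it returns instead.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper; not part of AbstractDataWriterSparkJob. */
public class InputSchemaCheck {
    // Assumption: "binary" stands in for DataTypes.BinaryType in this sketch.
    static final String BINARY = "binary";

    /** Returns one message per violated constraint; an empty list means the schema is acceptable. */
    static List<String> violations(Map<String, String> fields) {
        List<String> problems = new ArrayList<>();

        // "key" and "value" must both be present and binary (serialized Avro).
        for (String required : new String[] {"key", "value"}) {
            String type = fields.get(required);
            if (type == null) {
                problems.add("missing field: " + required);
            } else if (!BINARY.equals(type)) {
                problems.add("field " + required + " must be binary, found " + type);
            }
        }

        // Names beginning with "_" are reserved for internal use.
        for (String name : fields.keySet()) {
            if (name.startsWith("_")) {
                problems.add("reserved field name: " + name);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        Map<String, String> schema = new LinkedHashMap<>();
        schema.put("key", BINARY);
        schema.put("value", BINARY);
        schema.put("ingestTime", "long"); // extra fields are allowed
        System.out.println(violations(schema).isEmpty() ? "schema ok" : violations(schema));
    }
}
```

        Collecting all violations in a list, rather than failing on the first one, lets a caller report every schema problem at once.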
      • setInputConf

        protected void setInputConf(org.apache.spark.sql.SparkSession session,
                                    org.apache.spark.sql.DataFrameReader dataFrameReader,
                                    java.lang.String key,
                                    java.lang.String value)
      • close

        public void close()
                   throws java.io.IOException
        Throws:
        java.io.IOException