Class VeniceHdfsSource

java.lang.Object
com.linkedin.venice.spark.input.hdfs.VeniceHdfsSource
All Implemented Interfaces:
org.apache.spark.sql.connector.catalog.TableProvider

public class VeniceHdfsSource extends Object implements org.apache.spark.sql.connector.catalog.TableProvider
This is the entry point of the HDFS input source. Spark uses it to create a DataFrame from a directory on HDFS. The directory must contain either Avro or Vson files, and the input must be homogeneous, i.e., the directory cannot contain mixed formats or schemas.
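A minimal usage sketch, assuming the source is referenced by its fully-qualified class name (Spark's DataSource V2 loader resolves TableProvider implementations this way). The "input.path" option key and the path below are hypothetical placeholders for illustration, not configuration keys confirmed by this page:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class VeniceHdfsSourceExample {
      public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
            .appName("VeniceHdfsSourceExample")
            .master("local[*]")
            .getOrCreate();

        // Reference the source by its fully-qualified class name.
        Dataset<Row> df = spark.read()
            .format("com.linkedin.venice.spark.input.hdfs.VeniceHdfsSource")
            // Hypothetical option key and path, for illustration only.
            .option("input.path", "hdfs:///path/to/avro-or-vson-dir")
            .load();

        df.printSchema();
        spark.stop();
      }
    }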
  • Constructor Summary

    Constructors
    VeniceHdfsSource()
  • Method Summary

    Modifier and Type
    Method
    org.apache.spark.sql.connector.catalog.Table
    getTable(org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] partitioning, Map<String,String> configs)
    org.apache.spark.sql.types.StructType
    inferSchema(org.apache.spark.sql.util.CaseInsensitiveStringMap options)

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

    Methods inherited from interface org.apache.spark.sql.connector.catalog.TableProvider

    inferPartitioning, supportsExternalMetadata
  • Constructor Details

    • VeniceHdfsSource

      public VeniceHdfsSource()
  • Method Details

    • inferSchema

      public org.apache.spark.sql.types.StructType inferSchema(org.apache.spark.sql.util.CaseInsensitiveStringMap options)
      Specified by:
      inferSchema in interface org.apache.spark.sql.connector.catalog.TableProvider
    • getTable

      public org.apache.spark.sql.connector.catalog.Table getTable(org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] partitioning, Map<String,String> configs)
      Specified by:
      getTable in interface org.apache.spark.sql.connector.catalog.TableProvider
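      Both methods implement the org.apache.spark.sql.connector.catalog.TableProvider contract: when the caller supplies no schema, Spark first invokes inferSchema with the read options, then passes the resolved schema to getTable along with any partitioning transforms and the same options. A minimal sketch of that sequence, invoked directly for illustration; the option key and path are hypothetical placeholders:

      import java.util.HashMap;
      import java.util.Map;

      import org.apache.spark.sql.connector.catalog.Table;
      import org.apache.spark.sql.connector.expressions.Transform;
      import org.apache.spark.sql.types.StructType;
      import org.apache.spark.sql.util.CaseInsensitiveStringMap;

      import com.linkedin.venice.spark.input.hdfs.VeniceHdfsSource;

      public class TableProviderSketch {
        public static void main(String[] args) {
          VeniceHdfsSource source = new VeniceHdfsSource();

          // Hypothetical option key and path, for illustration only.
          Map<String, String> configs = new HashMap<>();
          configs.put("input.path", "hdfs:///path/to/avro-or-vson-dir");

          // Step 1: Spark calls inferSchema when the user supplies no schema.
          StructType schema = source.inferSchema(new CaseInsensitiveStringMap(configs));

          // Step 2: Spark calls getTable with the resolved schema, the
          // partitioning transforms (none here), and the same options.
          Table table = source.getTable(schema, new Transform[0], configs);

          System.out.println("Resolved schema: " + table.schema());
        }
      }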