Class AbstractDataWriterSparkJob
java.lang.Object
com.linkedin.venice.jobs.DataWriterComputeJob
com.linkedin.venice.spark.datawriter.jobs.AbstractDataWriterSparkJob
- All Implemented Interfaces:
ComputeJob, Closeable, AutoCloseable
- Direct Known Subclasses:
DataWriterSparkJob
The implementation of DataWriterComputeJob for the Spark engine.
-
Nested Class Summary
Nested classes/interfaces inherited from class com.linkedin.venice.jobs.DataWriterComputeJob
DataWriterComputeJob.ConfigSetter
Nested classes/interfaces inherited from interface com.linkedin.venice.jobs.ComputeJob
ComputeJob.Status
-
Field Summary
Fields inherited from class com.linkedin.venice.jobs.DataWriterComputeJob
PASS_THROUGH_CONFIG_PREFIXES
-
Constructor Summary
Constructors
AbstractDataWriterSparkJob()
-
Method Summary
- protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> applyChunkAssembly(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame)
Apply chunk assembly if chunking is enabled.
- protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> applyCompaction(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame)
Apply compaction to the Kafka input dataframe.
- protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> applyCompressionReEncoding(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame)
Apply compression re-encoding if the source and destination compression strategies are different.
- protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> applyTTLFilter(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame)
Apply TTL filtering to the Kafka input dataframe.
- void close()
- void configure(VeniceProperties props, PushJobSetting pushJobSetting)
- protected org.apache.spark.api.java.function.MapPartitionsFunction<org.apache.spark.sql.Row,org.apache.spark.sql.Row> createPartitionWriterFactory(org.apache.spark.broadcast.Broadcast<Properties> broadcastProperties, DataWriterAccumulators accumulators)
Creates the partition writer factory.
- protected DataWriterAccumulators getAccumulatorsForDataWriterJob()
- protected VeniceProperties getJobProperties()
- protected abstract org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> getKafkaInputDataFrame()
Get the input DataFrame from the Kafka/PubSub source for repush workloads.
- protected org.apache.spark.sql.SparkSession getSparkSession()
- protected abstract org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> getUserInputDataFrame()
Get the data frame based on the user's input data.
- void kill()
- void runComputeJob()
- protected void setInputConf(org.apache.spark.sql.SparkSession session, org.apache.spark.sql.DataFrameReader dataFrameReader, String key, String value)
- protected void validateRmdSchema(PushJobSetting pushJobSetting)
Methods inherited from class com.linkedin.venice.jobs.DataWriterComputeJob
configure, getFailureReason, getStatus, populateWithPassThroughConfigs, populateWithPassThroughConfigs, runJob, validateJob
-
Constructor Details
-
AbstractDataWriterSparkJob
public AbstractDataWriterSparkJob()
-
-
Method Details
-
configure
public void configure(VeniceProperties props, PushJobSetting pushJobSetting)
- Specified by:
configure in class DataWriterComputeJob
-
getSparkSession
protected org.apache.spark.sql.SparkSession getSparkSession()
-
getUserInputDataFrame
protected abstract org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> getUserInputDataFrame()
Get the data frame based on the user's input data. The schema of the Row has the following constraints:
- Must contain a field "key" with the schema DataTypes.BinaryType. This is the key of the record, represented as serialized Avro.
- Must contain a field "value" with the schema DataTypes.BinaryType. This is the value of the record, represented as serialized Avro.
- Must not contain fields with names beginning with "_". These are reserved for internal use.
- Can contain additional fields that do not violate the above constraints.
- Returns:
- The data frame based on the user's input data
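For illustration, a minimal sketch of a DataFrame that satisfies this contract, assuming a locally available SparkSession and hard-coded byte arrays standing in for real Avro-serialized records (the class and method names below are hypothetical):

import java.util.Arrays;
import java.util.List;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.Metadata;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

public final class UserInputDataFrameSketch {
  // Builds a DataFrame that honors the documented contract: binary "key" and "value"
  // columns and no field names starting with "_".
  public static Dataset<Row> buildExample(SparkSession spark) {
    StructType schema = new StructType(new StructField[] {
        new StructField("key", DataTypes.BinaryType, false, Metadata.empty()),
        new StructField("value", DataTypes.BinaryType, true, Metadata.empty())
    });
    // In a real implementation these byte arrays would be Avro-serialized records;
    // plain bytes are used here only to illustrate the shape.
    List<Row> rows = Arrays.asList(
        RowFactory.create(new byte[] {1}, new byte[] {10}),
        RowFactory.create(new byte[] {2}, new byte[] {20}));
    return spark.createDataFrame(rows, schema);
  }
}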
-
validateRmdSchema
protected void validateRmdSchema(PushJobSetting pushJobSetting)
-
getKafkaInputDataFrame
protected abstract org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> getKafkaInputDataFrame()
Get the input DataFrame from the Kafka/PubSub source for repush workloads. This method reads from a Venice version topic (Kafka) and returns a DataFrame with the RAW_PUBSUB_INPUT_TABLE_SCHEMA.
- Returns:
- DataFrame containing the Kafka input data
-
applyTTLFilter
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> applyTTLFilter(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame)
Apply TTL filtering to the Kafka input dataframe. This method filters out records that are older than the configured TTL threshold. The input dataframe must have the RAW_PUBSUB_INPUT_TABLE_SCHEMA (region, partition, offset, message_type, schema_id, key, value, rmd_version_id, rmd_payload).
- Parameters:
dataFrame - Input dataframe with RAW_PUBSUB_INPUT_TABLE_SCHEMA
- Returns:
- Filtered dataframe (with stale records removed if TTL filtering is enabled)
-
applyCompaction
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> applyCompaction(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame)
Apply compaction to the Kafka input dataframe. For each key, keep only the record with the highest offset.
- Parameters:
dataFrame - Input dataframe with RAW_PUBSUB_INPUT_TABLE_SCHEMA
- Returns:
- Compacted dataframe with duplicate keys removed
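As an illustration of the described behavior (not this class's actual implementation), the same "highest offset wins per key" semantics can be expressed with a Spark window function. The key and offset column names come from the RAW_PUBSUB_INPUT_TABLE_SCHEMA above; the temporary rn column and the class/method names are assumptions of this sketch:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.expressions.Window;
import org.apache.spark.sql.expressions.WindowSpec;
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.row_number;

public final class CompactionSketch {
  // Keeps only the highest-offset row per key, mirroring what applyCompaction describes.
  public static Dataset<Row> keepLatestPerKey(Dataset<Row> dataFrame) {
    WindowSpec newestFirstPerKey = Window.partitionBy(col("key")).orderBy(col("offset").desc());
    return dataFrame
        .withColumn("rn", row_number().over(newestFirstPerKey))
        .filter(col("rn").equalTo(1))
        .drop("rn");
  }
}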
-
applyChunkAssembly
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> applyChunkAssembly(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame)
Apply chunk assembly if chunking is enabled. Groups records by key, sorts them by offset in descending order, and assembles chunks into complete values/RMDs. If TTL filtering is enabled, the assembler also filters assembled records post-assembly.
- Parameters:
dataFrame - Input with SCHEMA_FOR_CHUNK_ASSEMBLY (7 columns)
- Returns:
- DataFrame with DEFAULT_SCHEMA_WITH_SCHEMA_ID (5 columns, assembled records)
-
applyCompressionReEncoding
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> applyCompressionReEncoding(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame)
Apply compression re-encoding if the source and destination compression strategies are different. This is only applicable for repush workloads (isSourceKafka = true).
- Parameters:
dataFrame - Input dataframe
- Returns:
- Dataframe with values re-compressed if needed
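Taken together, these helpers describe a repush processing pipeline. The sketch below shows one way a subclass could chain them; it assumes the column projections between RAW_PUBSUB_INPUT_TABLE_SCHEMA, SCHEMA_FOR_CHUNK_ASSEMBLY, and the output schema are handled elsewhere (the real job handles this internally), and the buildRepushDataFrame method name is hypothetical:

// Inside a hypothetical AbstractDataWriterSparkJob subclass (Venice and Spark imports omitted):
protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> buildRepushDataFrame() {
  org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> df = getKafkaInputDataFrame(); // raw version-topic records
  df = applyTTLFilter(df);                // drop records older than the TTL threshold
  df = applyCompaction(df);               // keep only the highest-offset record per key
  df = applyChunkAssembly(df);            // reassemble chunked values/RMDs when chunking is enabled
  return applyCompressionReEncoding(df);  // re-compress values if source/destination strategies differ
}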
-
setInputConf
protected void setInputConf(org.apache.spark.sql.SparkSession session, org.apache.spark.sql.DataFrameReader dataFrameReader, String key, String value)
-
getTaskTracker
- Specified by:
getTaskTracker in class DataWriterComputeJob
-
getAccumulatorsForDataWriterJob
protected DataWriterAccumulators getAccumulatorsForDataWriterJob()
-
getJobProperties
protected VeniceProperties getJobProperties()
-
getPushJobSetting
- Specified by:
getPushJobSetting in class DataWriterComputeJob
-
runComputeJob
public void runComputeJob()
- Specified by:
runComputeJob in class DataWriterComputeJob
-
kill
public void kill()
- Specified by:
kill in interface ComputeJob
- Overrides:
kill in class DataWriterComputeJob
-
createPartitionWriterFactory
protected org.apache.spark.api.java.function.MapPartitionsFunction<org.apache.spark.sql.Row,org.apache.spark.sql.Row> createPartitionWriterFactory(org.apache.spark.broadcast.Broadcast<Properties> broadcastProperties, DataWriterAccumulators accumulators)
Creates the partition writer factory. Can be overridden for testing purposes.
- Parameters:
broadcastProperties - the broadcast job properties
accumulators - the data writer accumulators
- Returns:
- the partition writer factory
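Since this factory can be overridden for testing, one possible (hypothetical) test double is a pass-through function that emits rows unchanged, so the surrounding Spark plan can be exercised without writing to Venice. The NoOpWriterSparkJob class name is an assumption of this sketch:

import java.util.Iterator;
import java.util.Properties;
import org.apache.spark.api.java.function.MapPartitionsFunction;
import org.apache.spark.broadcast.Broadcast;
import org.apache.spark.sql.Row;
// Venice imports (DataWriterSparkJob, DataWriterAccumulators) omitted for brevity.

public class NoOpWriterSparkJob extends DataWriterSparkJob {
  @Override
  protected MapPartitionsFunction<Row, Row> createPartitionWriterFactory(
      Broadcast<Properties> broadcastProperties, DataWriterAccumulators accumulators) {
    // Pass-through: emit input rows unchanged instead of writing them to Venice.
    return (Iterator<Row> rows) -> rows;
  }
}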
-
close
public void close() throws IOException
- Throws:
IOException
-