Class VeniceSystemFactory

  • All Implemented Interfaces:
    java.io.Serializable, org.apache.samza.system.SystemFactory

    public class VeniceSystemFactory
    extends java.lang.Object
    implements org.apache.samza.system.SystemFactory, java.io.Serializable
    Samza jobs talk to either parent or child controller depending on the aggregate mode config. The decision of which controller should be used is made in VeniceSystemFactory. The "Primary Controller" term is used to refer to whichever controller the Samza job should talk to. The primary controller should be: 1. The parent controller when the Venice system is deployed in a multi-colo mode and either: a. PushType is PushType.BATCH or PushType.STREAM_REPROCESSING; or b. @deprecated PushType is PushType.STREAM and the job is configured to write data in AGGREGATE mode 2. The child controller when either: a. The Venice system is deployed in a single-colo mode; or b. The PushType is PushType.STREAM and the job is configured to write data in NON_AGGREGATE mode
    See Also:
    Serialized Form
    • Method Summary

      All Methods Instance Methods Concrete Methods Deprecated Methods 
      Modifier and Type Method Description
      protected org.apache.samza.system.SystemProducer createSystemProducer​(java.lang.String primaryControllerColoD2ZKHost, java.lang.String primaryControllerD2Service, java.lang.String storeName, Version.PushType venicePushType, java.lang.String samzaJobId, java.lang.String runningFabric, boolean verifyLatestProtocolPresent, org.apache.samza.config.Config config, java.util.Optional<SSLFactory> sslFactory, java.util.Optional<java.lang.String> partitioners)
      Deprecated.
      Left in to maintain backward compatibility
      protected org.apache.samza.system.SystemProducer createSystemProducer​(java.lang.String primaryControllerColoD2ZKHost, java.lang.String primaryControllerD2Service, java.lang.String storeName, Version.PushType venicePushType, java.lang.String samzaJobId, java.lang.String runningFabric, org.apache.samza.config.Config config, java.util.Optional<SSLFactory> sslFactory, java.util.Optional<java.lang.String> partitioners)
      Deprecated.
      Left in to maintain backward compatibility
      protected VeniceSystemProducer createSystemProducer​(java.lang.String veniceChildD2ZkHost, java.lang.String primaryControllerColoD2ZKHost, java.lang.String primaryControllerD2Service, java.lang.String storeName, Version.PushType venicePushType, java.lang.String samzaJobId, java.lang.String runningFabric, boolean verifyLatestProtocolPresent, java.util.Optional<SSLFactory> sslFactory, java.util.Optional<java.lang.String> partitioners)
      Construct a new instance of VeniceSystemProducer
      protected VeniceSystemProducer createSystemProducer​(java.lang.String veniceChildD2ZkHost, java.lang.String primaryControllerColoD2ZKHost, java.lang.String primaryControllerD2Service, java.lang.String storeName, Version.PushType venicePushType, java.lang.String samzaJobId, java.lang.String runningFabric, boolean verifyLatestProtocolPresent, org.apache.samza.config.Config config, java.util.Optional<SSLFactory> sslFactory, java.util.Optional<java.lang.String> partitioners)
      Construct a new instance of VeniceSystemProducer
      void endStreamReprocessingSystemProducer​(org.apache.samza.system.SystemProducer systemProducer, boolean jobSucceed)
      RouterBasedPushMonitor will update the status of a SystemProducer with push type STREAM_REPROCESSING: END_OF_PUSH_RECEIVED: isActive -> false; isStreamReprocessingJobSucceeded -> true COMPLETED: isActive -> false; isStreamReprocessingJobSucceeded -> true ERROR: isActive -> false; isStreamReprocessingJobSucceeded -> false For all the other push job status, SystemProducer status will not be updated.
      org.apache.samza.system.SystemAdmin getAdmin​(java.lang.String systemName, org.apache.samza.config.Config config)  
      VeniceSystemProducer getClosableProducer​(java.lang.String systemName, org.apache.samza.config.Config config, org.apache.samza.metrics.MetricsRegistry registry)
      Convenience method to hide the ugliness of casting in just one place.
      org.apache.samza.system.SystemConsumer getConsumer​(java.lang.String systemName, org.apache.samza.config.Config config, org.apache.samza.metrics.MetricsRegistry registry)  
      java.util.Optional<java.lang.String> getControllerDiscoveryUrl​(org.apache.samza.config.Config config)  
      int getNumberOfActiveSystemProducers()
      Get the total number of active SystemProducer.
      boolean getOverallExecutionStatus()
      Check whether all the stream reprocessing jobs have succeeded; return false if any of them fail.
      org.apache.samza.system.SystemProducer getProducer​(java.lang.String systemName, java.lang.String storeName, boolean veniceAggregate, java.lang.String pushTypeString, org.apache.samza.config.Config config)
      Samza table writer would directly call this function to create venice system producer instead of calling the general getProducer(String, Config, MetricsRegistry) function.
      org.apache.samza.system.SystemProducer getProducer​(java.lang.String systemName, org.apache.samza.config.Config config, org.apache.samza.metrics.MetricsRegistry registry)
      The core function of a SystemFactory; most Samza users would specify VeniceSystemFactory in the job config and Samza would invoke SystemFactory.getProducer(String, Config, MetricsRegistry) to create producers.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • LEGACY_CHILD_D2_ZK_HOSTS_PROPERTY

        public static final java.lang.String LEGACY_CHILD_D2_ZK_HOSTS_PROPERTY
        See Also:
        Constant Field Values
      • VENICE_PUSH_TYPE

        public static final java.lang.String VENICE_PUSH_TYPE
        See Also:
        Constant Field Values
      • VENICE_STORE

        public static final java.lang.String VENICE_STORE
        Venice store name Samza application is going to produce to.
        See Also:
        Constant Field Values
      • VENICE_AGGREGATE

        public static final java.lang.String VENICE_AGGREGATE
        Whether to leverage Venice aggregation. By default, it is 'false'. When the Samza application decides to leverage Venice aggregation, all the messages will be produced to Venice Parent cluster, otherwise, all the messages will be produced to local cluster.
        See Also:
        Constant Field Values
      • VENICE_CHILD_D2_ZK_HOSTS

        public static final java.lang.String VENICE_CHILD_D2_ZK_HOSTS
        D2 ZK hosts for Venice Child Cluster.
        See Also:
        Constant Field Values
      • VENICE_CONTROLLER_DISCOVERY_URL

        public static final java.lang.String VENICE_CONTROLLER_DISCOVERY_URL
        See Also:
        Constant Field Values
      • VENICE_ROUTER_URL

        public static final java.lang.String VENICE_ROUTER_URL
        See Also:
        Constant Field Values
      • VENICE_PARENT_D2_ZK_HOSTS

        public static final java.lang.String VENICE_PARENT_D2_ZK_HOSTS
        D2 ZK hosts for Venice Parent Cluster.
        See Also:
        Constant Field Values
      • VENICE_CHILD_CONTROLLER_D2_SERVICE

        public static final java.lang.String VENICE_CHILD_CONTROLLER_D2_SERVICE
        See Also:
        Constant Field Values
      • VENICE_PARENT_CONTROLLER_D2_SERVICE

        public static final java.lang.String VENICE_PARENT_CONTROLLER_D2_SERVICE
        See Also:
        Constant Field Values
      • LEGACY_VENICE_CHILD_CONTROLLER_D2_SERVICE

        public static final java.lang.String LEGACY_VENICE_CHILD_CONTROLLER_D2_SERVICE
        See Also:
        Constant Field Values
      • LEGACY_VENICE_PARENT_CONTROLLER_D2_SERVICE

        public static final java.lang.String LEGACY_VENICE_PARENT_CONTROLLER_D2_SERVICE
        See Also:
        Constant Field Values
    • Constructor Detail

      • VeniceSystemFactory

        public VeniceSystemFactory()
    • Method Detail

      • getAdmin

        public org.apache.samza.system.SystemAdmin getAdmin​(java.lang.String systemName,
                                                            org.apache.samza.config.Config config)
        Specified by:
        getAdmin in interface org.apache.samza.system.SystemFactory
      • getConsumer

        public org.apache.samza.system.SystemConsumer getConsumer​(java.lang.String systemName,
                                                                  org.apache.samza.config.Config config,
                                                                  org.apache.samza.metrics.MetricsRegistry registry)
        Specified by:
        getConsumer in interface org.apache.samza.system.SystemFactory
      • getControllerDiscoveryUrl

        public java.util.Optional<java.lang.String> getControllerDiscoveryUrl​(org.apache.samza.config.Config config)
      • createSystemProducer

        @Deprecated
        protected org.apache.samza.system.SystemProducer createSystemProducer​(java.lang.String primaryControllerColoD2ZKHost,
                                                                              java.lang.String primaryControllerD2Service,
                                                                              java.lang.String storeName,
                                                                              Version.PushType venicePushType,
                                                                              java.lang.String samzaJobId,
                                                                              java.lang.String runningFabric,
                                                                              org.apache.samza.config.Config config,
                                                                              java.util.Optional<SSLFactory> sslFactory,
                                                                              java.util.Optional<java.lang.String> partitioners)
        Deprecated.
        Left in to maintain backward compatibility
      • createSystemProducer

        @Deprecated
        protected org.apache.samza.system.SystemProducer createSystemProducer​(java.lang.String primaryControllerColoD2ZKHost,
                                                                              java.lang.String primaryControllerD2Service,
                                                                              java.lang.String storeName,
                                                                              Version.PushType venicePushType,
                                                                              java.lang.String samzaJobId,
                                                                              java.lang.String runningFabric,
                                                                              boolean verifyLatestProtocolPresent,
                                                                              org.apache.samza.config.Config config,
                                                                              java.util.Optional<SSLFactory> sslFactory,
                                                                              java.util.Optional<java.lang.String> partitioners)
        Deprecated.
        Left in to maintain backward compatibility
      • createSystemProducer

        protected VeniceSystemProducer createSystemProducer​(java.lang.String veniceChildD2ZkHost,
                                                            java.lang.String primaryControllerColoD2ZKHost,
                                                            java.lang.String primaryControllerD2Service,
                                                            java.lang.String storeName,
                                                            Version.PushType venicePushType,
                                                            java.lang.String samzaJobId,
                                                            java.lang.String runningFabric,
                                                            boolean verifyLatestProtocolPresent,
                                                            org.apache.samza.config.Config config,
                                                            java.util.Optional<SSLFactory> sslFactory,
                                                            java.util.Optional<java.lang.String> partitioners)
        Construct a new instance of VeniceSystemProducer
        Parameters:
        veniceChildD2ZkHost - D2 Zk Address where the components in the child colo are announcing themselves
        primaryControllerColoD2ZKHost - D2 Zk Address of the colo where the primary controller resides
        primaryControllerD2ServiceName - The service name that the primary controller uses to announce itself to D2
        storeName - The store to write to
        pushType - The PushType to use to write to the store
        samzaJobId - A unique id used to identify jobs that can concurrently write to the same store
        runningFabric - The colo where the job is running. It is used to find the best destination for the data to be written to
        verifyLatestProtocolPresent - Config to check whether the protocol versions used at runtime are valid in Venice backend
        factory - The VeniceSystemFactory object that was used to create this object
        config - A Config object that may be used by the factory implementation to create an overridden SystemProducer instance
        sslFactory - An optional SSLFactory that is used to communicate with other components using SSL
        partitioners - A list of comma-separated partitioners class names that are supported.
      • createSystemProducer

        protected VeniceSystemProducer createSystemProducer​(java.lang.String veniceChildD2ZkHost,
                                                            java.lang.String primaryControllerColoD2ZKHost,
                                                            java.lang.String primaryControllerD2Service,
                                                            java.lang.String storeName,
                                                            Version.PushType venicePushType,
                                                            java.lang.String samzaJobId,
                                                            java.lang.String runningFabric,
                                                            boolean verifyLatestProtocolPresent,
                                                            java.util.Optional<SSLFactory> sslFactory,
                                                            java.util.Optional<java.lang.String> partitioners)
        Construct a new instance of VeniceSystemProducer
        Parameters:
        veniceChildD2ZkHost - D2 Zk Address where the components in the child colo are announcing themselves
        primaryControllerColoD2ZKHost - D2 Zk Address of the colo where the primary controller resides
        primaryControllerD2ServiceName - The service name that the primary controller uses to announce itself to D2
        storeName - The store to write to
        pushType - The PushType to use to write to the store
        samzaJobId - A unique id used to identify jobs that can concurrently write to the same store
        runningFabric - The colo where the job is running. It is used to find the best destination for the data to be written to
        verifyLatestProtocolPresent - Config to check whether the protocol versions used at runtime are valid in Venice backend
        factory - The VeniceSystemFactory object that was used to create this object
        sslFactory - An optional SSLFactory that is used to communicate with other components using SSL
        partitioners - A list of comma-separated partitioners class names that are supported.
      • getProducer

        public org.apache.samza.system.SystemProducer getProducer​(java.lang.String systemName,
                                                                  java.lang.String storeName,
                                                                  boolean veniceAggregate,
                                                                  java.lang.String pushTypeString,
                                                                  org.apache.samza.config.Config config)
        Samza table writer would directly call this function to create venice system producer instead of calling the general getProducer(String, Config, MetricsRegistry) function.
      • getProducer

        public org.apache.samza.system.SystemProducer getProducer​(java.lang.String systemName,
                                                                  org.apache.samza.config.Config config,
                                                                  org.apache.samza.metrics.MetricsRegistry registry)
        The core function of a SystemFactory; most Samza users would specify VeniceSystemFactory in the job config and Samza would invoke SystemFactory.getProducer(String, Config, MetricsRegistry) to create producers.
        Specified by:
        getProducer in interface org.apache.samza.system.SystemFactory
      • getClosableProducer

        public VeniceSystemProducer getClosableProducer​(java.lang.String systemName,
                                                        org.apache.samza.config.Config config,
                                                        org.apache.samza.metrics.MetricsRegistry registry)
        Convenience method to hide the ugliness of casting in just one place. Ideally, we would change the return type of getProducer(String, Config, MetricsRegistry) to VeniceSystemProducer but since there are existing users of this API, we are being extra careful not to disturb it. TODO: clean this up when we have the bandwidth to coordinate the refactoring with the existing users.
      • getNumberOfActiveSystemProducers

        public int getNumberOfActiveSystemProducers()
        Get the total number of active SystemProducer. The SystemProducer for push type: STREAM and BATCH will always be at active state; so if there is any real-time SystemProducer in the Samza task, the task will not be stopped even though all the stream reprocessing jobs have completed. Besides, a Samza task can not have a mix of BATCH push type and STREAM_REPROCESSING push type; otherwise, the Samza task can not be automatically stopped.
      • getOverallExecutionStatus

        public boolean getOverallExecutionStatus()
        Check whether all the stream reprocessing jobs have succeeded; return false if any of them fail.
      • endStreamReprocessingSystemProducer

        public void endStreamReprocessingSystemProducer​(org.apache.samza.system.SystemProducer systemProducer,
                                                        boolean jobSucceed)
        RouterBasedPushMonitor will update the status of a SystemProducer with push type STREAM_REPROCESSING: END_OF_PUSH_RECEIVED: isActive -> false; isStreamReprocessingJobSucceeded -> true COMPLETED: isActive -> false; isStreamReprocessingJobSucceeded -> true ERROR: isActive -> false; isStreamReprocessingJobSucceeded -> false For all the other push job status, SystemProducer status will not be updated.