Class HelixReadWriteSchemaRepository

  • All Implemented Interfaces:
    ReadOnlySchemaRepository, ReadWriteSchemaRepository, VeniceResource

    public class HelixReadWriteSchemaRepository
    extends java.lang.Object
    implements ReadWriteSchemaRepository
    This class is used to add schema entries for stores. There are 4 types of schema entries:
    1. Key schema
       ZK Path: ${cluster_name}/Stores/${store_name}/key-schema/1
       Each store only has 1 key schema and the schema is immutable.
    2. Value schema
       ZK Path: ${cluster_name}/Stores/${store_name}/value-schema/${value_schema_id}
       Value schemas are evolvable. A store can have multiple value schemas, and each value schema is forwards/backwards compatible with the others.
    3. Derived schema
       ZK Path: ${cluster_name}/Stores/${store_name}/derived-schema/${value_schema_id}_${derived_schema_id}
       Each value schema can have multiple derived schemas. See DerivedSchemaEntry for more details.
    4. Replication metadata schema
       ZK Path: ${cluster_name}/Stores/${store_name}/timestamp-metadata-schema/${value_schema_id}-${replication_metadata_version_id}
    See SchemaEntrySerializer and DerivedSchemaEntrySerializer for how schemas are serialized and deserialized.
    ReadWriteSchemaRepository doesn't cache existing schemas locally; it always queries ZK for the current values. This differs from ReadOnlyStoreRepository, where values are always cached and callbacks are registered for future updates.
    Notice: Do not instantiate this class anywhere other than in the leader Controller, and there should always be only 1 ReadWriteSchemaRepository per cluster. Instantiating multiple ReadWriteSchemaRepository instances will lead to race conditions in ZK.
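    The four ZK path templates above can be sketched as plain string builders. This is only an illustration of the layout: SchemaZkPaths and its method names are hypothetical and are not part of Venice, and the real repository may assemble its paths differently.

```java
// Hypothetical helper that mirrors the ZK path templates documented above.
// It is not part of Venice; it only illustrates the node layout.
final class SchemaZkPaths {
    private SchemaZkPaths() {}

    // Common prefix shared by all schema nodes of one store.
    static String storeRoot(String clusterName, String storeName) {
        return clusterName + "/Stores/" + storeName;
    }

    // A store has a single key schema, stored under the fixed ID 1.
    static String keySchema(String clusterName, String storeName) {
        return storeRoot(clusterName, storeName) + "/key-schema/1";
    }

    // One node per value schema ID.
    static String valueSchema(String clusterName, String storeName, int valueSchemaId) {
        return storeRoot(clusterName, storeName) + "/value-schema/" + valueSchemaId;
    }

    // Derived schema nodes join the two IDs with an underscore.
    static String derivedSchema(String clusterName, String storeName,
                                int valueSchemaId, int derivedSchemaId) {
        return storeRoot(clusterName, storeName)
            + "/derived-schema/" + valueSchemaId + "_" + derivedSchemaId;
    }

    // Replication metadata schema nodes join the two IDs with a hyphen.
    static String replicationMetadataSchema(String clusterName, String storeName,
                                            int valueSchemaId, int replicationMetadataVersionId) {
        return storeRoot(clusterName, storeName)
            + "/timestamp-metadata-schema/" + valueSchemaId + "-" + replicationMetadataVersionId;
    }

    public static void main(String[] args) {
        String cluster = "example-cluster"; // made-up names for illustration
        String store = "example-store";
        System.out.println(keySchema(cluster, store));
        System.out.println(valueSchema(cluster, store, 3));
        System.out.println(derivedSchema(cluster, store, 3, 2));
        System.out.println(replicationMetadataSchema(cluster, store, 3, 1));
    }
}
```

    Note the different separators: derived schema nodes use an underscore between the two IDs, while replication metadata schema nodes use a hyphen.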
    • Method Detail

      • getKeySchema

        public SchemaEntry getKeySchema​(java.lang.String storeName)
        Get key schema for the given store. Fetch from zookeeper directly.
        Specified by:
        getKeySchema in interface ReadOnlySchemaRepository
        Returns:
        null if the key schema doesn't exist; the schema entry if it exists
      • getValueSchema

        public SchemaEntry getValueSchema​(java.lang.String storeName,
                                          int id)
        Get value schema for the given store and schema id. Fetch from zookeeper directly.
        Specified by:
        getValueSchema in interface ReadOnlySchemaRepository
        Returns:
        null if the schema doesn't exist; the schema entry if it exists
      • hasValueSchema

        public boolean hasValueSchema​(java.lang.String storeName,
                                      int id)
        Check whether the given value schema id exists in the given store or not. Fetch from zookeeper directly.
        Specified by:
        hasValueSchema in interface ReadOnlySchemaRepository
        Returns:
        true if the schema exists; false otherwise
      • getValueSchemaId

        public int getValueSchemaId​(java.lang.String storeName,
                                    java.lang.String valueSchemaStr)
        This function is used to retrieve the value schema id for the given store and schema. It attempts to find the schema that matches exactly. If multiple matching schemas are found, the id of the most recently added schema is returned. If the store has schema auto-registration from push jobs enabled, schemas that differ only by default value or doc field are treated as different schemas.
        Specified by:
        getValueSchemaId in interface ReadOnlySchemaRepository
        Returns:
        SchemaData.INVALID_VALUE_SCHEMA_ID if the schema doesn't exist in the given store; the schema id (int) if the schema exists in the given store
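        The lookup rule described above (exact match, latest added id wins, sentinel when nothing matches) can be sketched over an in-memory map. This is a simplified illustration, not the Venice implementation: it compares schema strings with String.equals instead of comparing parsed schemas, it assumes "latest added" means the highest id, and the value -1 used for the sentinel is a placeholder rather than the confirmed value of SchemaData.INVALID_VALUE_SCHEMA_ID.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the "exact match, latest id wins" lookup rule.
final class ValueSchemaIdLookup {
    // Placeholder sentinel; the real constant is SchemaData.INVALID_VALUE_SCHEMA_ID.
    static final int INVALID_VALUE_SCHEMA_ID = -1;

    private ValueSchemaIdLookup() {}

    // Returns the highest id whose schema string equals the candidate,
    // or the sentinel if no schema matches.
    static int findValueSchemaId(Map<Integer, String> schemasById, String candidate) {
        int match = INVALID_VALUE_SCHEMA_ID;
        for (Map.Entry<Integer, String> entry : schemasById.entrySet()) {
            boolean sameSchema = entry.getValue().equals(candidate);
            boolean newer = match == INVALID_VALUE_SCHEMA_ID || entry.getKey() > match;
            if (sameSchema && newer) {
                match = entry.getKey();
            }
        }
        return match;
    }

    public static void main(String[] args) {
        Map<Integer, String> schemas = new LinkedHashMap<>();
        schemas.put(1, "\"string\"");
        schemas.put(2, "\"long\"");
        schemas.put(3, "\"string\""); // same schema registered again under a later id
        System.out.println(findValueSchemaId(schemas, "\"string\""));
        System.out.println(findValueSchemaId(schemas, "\"int\""));
    }
}
```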
      • getValueSchemas

        public java.util.Collection<SchemaEntry> getValueSchemas​(java.lang.String storeName)
        This function is used to retrieve all the value schemas for the given store. Fetch from zookeeper directly.
        Specified by:
        getValueSchemas in interface ReadOnlySchemaRepository
      • getSupersetSchema

        public SchemaEntry getSupersetSchema​(java.lang.String storeName)
        Description copied from interface: ReadOnlySchemaRepository
        Get the superset value schema for a given store. Each store has at most one active superset schema. Specifically a store must have some features enabled (e.g. read compute, write compute) to have a superset value schema which evolves as new value schemas are added.
        Specified by:
        getSupersetSchema in interface ReadOnlySchemaRepository
        Returns:
        The superset value schema, or null if the store storeName does not have any superset value schema.
      • addValueSchema

        public SchemaEntry addValueSchema​(java.lang.String storeName,
                                          java.lang.String schemaStr,
                                          int schemaId)
        Description copied from interface: ReadWriteSchemaRepository
        Add a new value schema for the given store by specifying the schema id. This API is mostly intended to be used in cross-colo mode. When there are multiple colos, value schema ids should be consistent across colos so that deserializers work properly when reading records. The caller is responsible for determining the schema id. TODO: Might want to remove it from the interface and make it invisible from the outside.
        Specified by:
        addValueSchema in interface ReadWriteSchemaRepository
      • preCheckValueSchemaAndGetNextAvailableId

        public int preCheckValueSchemaAndGetNextAvailableId​(java.lang.String storeName,
                                                            java.lang.String valueSchemaStr,
                                                            DirectionalSchemaCompatibilityType expectedCompatibilityType)
        Check if the incoming schema is a valid schema and return the next available schema ID. Venice pre-checks 3 things:
        1. Whether the store exists
        2. Whether the incoming schema contains any reserved fields
        3. Whether the incoming schema is a duplicate of an existing schema
        Specified by:
        preCheckValueSchemaAndGetNextAvailableId in interface ReadWriteSchemaRepository
        Returns:
        the next available ID if it's a valid schema, or SchemaData.DUPLICATE_VALUE_SCHEMA_CODE if it's a duplicate
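        The three pre-checks described above can be sketched as follows. Everything here is an assumption made for illustration: the class name, the reserved field name, the numeric value of the duplicate code, the use of exceptions for the first two checks, and computing the next id as "highest existing id + 1" are not taken from Venice. The sketch also ignores expectedCompatibilityType and takes field names as a parameter instead of parsing the schema.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the pre-check flow; not the Venice implementation.
final class ValueSchemaPreCheck {
    // Placeholder; the real constant is SchemaData.DUPLICATE_VALUE_SCHEMA_CODE.
    static final int DUPLICATE_VALUE_SCHEMA_CODE = -2;
    // Made-up reserved field name, used only for this example.
    static final Set<String> RESERVED_FIELDS = Collections.singleton("_example_reserved_field");

    private ValueSchemaPreCheck() {}

    static int preCheck(Map<String, Map<Integer, String>> schemasByStore,
                        String storeName, String schemaStr, Set<String> fieldNames) {
        // 1. The store must exist.
        Map<Integer, String> existing = schemasByStore.get(storeName);
        if (existing == null) {
            throw new IllegalArgumentException("Store does not exist: " + storeName);
        }
        // 2. The incoming schema must not use reserved fields.
        for (String field : fieldNames) {
            if (RESERVED_FIELDS.contains(field)) {
                throw new IllegalArgumentException("Reserved field in schema: " + field);
            }
        }
        // 3. A duplicate schema is reported with a code instead of a new id.
        if (existing.containsValue(schemaStr)) {
            return DUPLICATE_VALUE_SCHEMA_CODE;
        }
        // Assumed id allocation: one past the highest id in use.
        int maxId = 0;
        for (int id : existing.keySet()) {
            maxId = Math.max(maxId, id);
        }
        return maxId + 1;
    }

    public static void main(String[] args) {
        Map<Integer, String> schemas = new HashMap<>();
        schemas.put(1, "schema-A");
        schemas.put(2, "schema-B");
        Map<String, Map<Integer, String>> stores = new HashMap<>();
        stores.put("example-store", schemas);
        System.out.println(preCheck(stores, "example-store", "schema-C", Collections.singleton("name")));
        System.out.println(preCheck(stores, "example-store", "schema-A", Collections.singleton("name")));
    }
}
```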
      • getDerivedSchemaId

        public GeneratedSchemaID getDerivedSchemaId​(java.lang.String storeName,
                                                    java.lang.String derivedSchemaStr)
        Description copied from interface: ReadOnlySchemaRepository
        Look up the derived schema id and its corresponding value schema id for the given store name and derived schema. This is likely used by clients that write to Venice.
        Specified by:
        getDerivedSchemaId in interface ReadOnlySchemaRepository
        Returns:
        a pair where the first value is the value schema id and the second value is the derived schema id
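        The returned pair lines up with the ${value_schema_id}_${derived_schema_id} node naming documented for derived schemas in the class description. A minimal sketch of that pairing, assuming a hypothetical DerivedSchemaNodeName class (this is not GeneratedSchemaID, and whether Venice parses node names this way is an assumption):

```java
// Hypothetical value holder for the "valueSchemaId_derivedSchemaId" node name.
final class DerivedSchemaNodeName {
    final int valueSchemaId;
    final int derivedSchemaId;

    DerivedSchemaNodeName(int valueSchemaId, int derivedSchemaId) {
        this.valueSchemaId = valueSchemaId;
        this.derivedSchemaId = derivedSchemaId;
    }

    // Splits a node name such as "3_2" into its two ids.
    static DerivedSchemaNodeName parse(String nodeName) {
        int separator = nodeName.indexOf('_');
        if (separator <= 0 || separator == nodeName.length() - 1) {
            throw new IllegalArgumentException("Not a derived schema node name: " + nodeName);
        }
        int valueId = Integer.parseInt(nodeName.substring(0, separator));
        int derivedId = Integer.parseInt(nodeName.substring(separator + 1));
        return new DerivedSchemaNodeName(valueId, derivedId);
    }

    // Rebuilds the node name from the two ids.
    String format() {
        return valueSchemaId + "_" + derivedSchemaId;
    }

    public static void main(String[] args) {
        DerivedSchemaNodeName name = parse("3_2");
        System.out.println(name.valueSchemaId + " / " + name.derivedSchemaId);
        System.out.println(name.format());
    }
}
```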