Package com.linkedin.venice.listener
Class StoreValueSchemasCacheService
java.lang.Object
com.linkedin.venice.service.AbstractVeniceService
com.linkedin.venice.listener.StoreValueSchemasCacheService
- All Implemented Interfaces:
ReadOnlySchemaRepository, VeniceResource, Closeable, AutoCloseable
public class StoreValueSchemasCacheService
extends AbstractVeniceService
implements ReadOnlySchemaRepository
This class implements fast value schema and latest value schema lookups, at the cost of an acceptable delay.
It was introduced because using HelixReadOnlySchemaRepository directly in the read compute path caused two issues:
1. When a ZK disconnect/reconnect happens, HelixReadOnlySchemaRepository refreshes its local cache. This increases the GC count in read compute, because HelixReadOnlySchemaRepository.refresh() holds a giant write lock and all value schema/latest value schema lookups in read compute requests are blocked in the meantime. The GC count increase is significant (it more than doubled in a test cluster) and has been causing much higher CPU usage and higher latency.
2. The schema objects returned by HelixReadOnlySchemaRepository for the same schema are not always the same object, since HelixReadOnlySchemaRepository.refresh() always re-creates new Schema objects. This makes the de-serializer lookup in SerializerDeserializerFactory inefficient, because that lookup compares schema objects to find the corresponding serializer/de-serializer (for read compute, the de-serializer is the concern). If the schema objects are not the same, Schema.hashCode() and Schema.equals(Object) are used. In Avro 1.7 or above, Schema.hashCode() is optimized to be calculated only once if the schema is read-only, but the call to Schema.equals(Object) cannot be avoided.
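The cost described in issue 2 can be illustrated with a small, self-contained sketch. It does not use Avro or SerializerDeserializerFactory: CountingSchema is a hypothetical stand-in for a schema object with a cached hash code, and a plain java.util.HashMap stands in for the serializer/de-serializer lookup. It only shows that a lookup with the very same key instance can be resolved by a reference check, while an equal but re-created instance has to fall back to equals().

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a schema object: the hash is cached, every equals() call is counted. */
final class CountingSchema {
  static final AtomicInteger EQUALS_CALLS = new AtomicInteger();
  private final String definition;
  private final int cachedHash;

  CountingSchema(String definition) {
    this.definition = definition;
    this.cachedHash = definition.hashCode(); // computed once, like a read-only schema
  }

  @Override
  public int hashCode() {
    return cachedHash;
  }

  @Override
  public boolean equals(Object other) {
    EQUALS_CALLS.incrementAndGet(); // a deep comparison would happen here
    return other instanceof CountingSchema
        && ((CountingSchema) other).definition.equals(definition);
  }
}

final class IdentityLookupDemo {
  /** Returns how many equals() calls a single map lookup with the given key triggered. */
  static int equalsCallsForLookup(Map<CountingSchema, String> lookup, CountingSchema key) {
    int before = CountingSchema.EQUALS_CALLS.get();
    lookup.get(key);
    return CountingSchema.EQUALS_CALLS.get() - before;
  }

  public static void main(String[] args) {
    CountingSchema cached = new CountingSchema("{\"type\":\"string\"}");
    CountingSchema recreated = new CountingSchema("{\"type\":\"string\"}");

    // The map plays the role of a schema -> de-serializer registry (an assumption of this sketch).
    Map<CountingSchema, String> lookup = new HashMap<>();
    lookup.put(cached, "deserializer-for-string");

    System.out.println("same instance:       " + equalsCallsForLookup(lookup, cached));
    System.out.println("re-created instance: " + equalsCallsForLookup(lookup, recreated));
  }
}
```

On OpenJDK's HashMap, the first lookup should need no equals() call and the second should need one. With a real schema type the comparison is a structural one, which is why handing out the same object for the same schema id matters on a hot path.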
Here is how this class works:
1. It maintains a mapping from each store to its value schemas and latest value schema;
2. It tries to reuse the same Schema object for the same schema id within a store, since a value schema is immutable;
3. It maintains a refresh thread that updates the local cache periodically.
In theory, schema lookups should never be blocked by invoking the underlying HelixReadOnlySchemaRepository: in practice, it takes a fairly long time to register a new value schema/latest value schema and start using it in production, so the periodic schema refresh should be able to take care of discovering the new value schema/latest value schema.
Since the refresh is asynchronous, there is a delay (at most 1 minute), which should be acceptable given the previous assumption.
So far, this class only supports value schema lookup by id and latest value schema lookup.
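The three steps above can be sketched as follows. This is a minimal illustration and not the actual implementation of this class: PeriodicSchemaCacheSketch, track, and refreshAll are hypothetical names, the type parameter S stands in for the schema type, and the underlying repository is modeled as a plain function from store name to a map of schema id to schema. Latest value schema tracking is left out for brevity.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

/** Sketch of a per-store schema cache that is refreshed in the background. */
final class PeriodicSchemaCacheSketch<S> implements AutoCloseable {
  // Hypothetical stand-in for the slower underlying repository.
  private final Function<String, Map<Integer, S>> source;
  // Step 1: store name -> (schema id -> schema object).
  private final Map<String, Map<Integer, S>> schemasByStore = new ConcurrentHashMap<>();
  // Step 3: a single daemon thread performs the periodic refresh.
  private final ScheduledExecutorService refresher =
      Executors.newSingleThreadScheduledExecutor(runnable -> {
        Thread thread = new Thread(runnable, "schema-cache-refresh");
        thread.setDaemon(true);
        return thread;
      });

  PeriodicSchemaCacheSketch(Function<String, Map<Integer, S>> source) {
    this.source = source;
  }

  /** Starts the background refresh; the period is chosen by the caller. */
  void start(long period, TimeUnit unit) {
    refresher.scheduleWithFixedDelay(() -> {
      try {
        refreshAll();
      } catch (RuntimeException e) {
        // Swallow so that one failed refresh does not cancel the schedule.
      }
    }, period, period, unit);
  }

  /** Registers a store and loads its schemas once. */
  void track(String storeName) {
    schemasByStore.computeIfAbsent(storeName, name -> new ConcurrentHashMap<>());
    refreshStore(storeName);
  }

  void refreshAll() {
    schemasByStore.keySet().forEach(this::refreshStore);
  }

  private void refreshStore(String storeName) {
    Map<Integer, S> cached = schemasByStore.get(storeName);
    // Step 2: putIfAbsent keeps the first object seen for an id, so readers
    // keep getting the same instance even if the source re-creates objects.
    source.apply(storeName).forEach(cached::putIfAbsent);
  }

  /** Served from the local map only; returns null if the store or id is unknown. */
  S getValueSchema(String storeName, int valueSchemaId) {
    Map<Integer, S> cached = schemasByStore.get(storeName);
    return cached == null ? null : cached.get(valueSchemaId);
  }

  @Override
  public void close() {
    refresher.shutdownNow();
  }
}
```

In this sketch a schema registered after the last refresh is invisible until the next one, which mirrors the bounded delay described above.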
Nested Class Summary
Nested classes/interfaces inherited from class com.linkedin.venice.service.AbstractVeniceService
AbstractVeniceService.ServiceState
-
Field Summary
Fields inherited from class com.linkedin.venice.service.AbstractVeniceService
logger, serviceState
-
Constructor Summary
Constructor
StoreValueSchemasCacheService(ReadOnlyStoreRepository storeRepository, ReadOnlySchemaRepository schemaRepository)
-
Method Summary
Modifier and Type, Method, Description
void clear()
getDerivedSchema(String storeName, int valueSchemaId, int derivedSchemaId)
getDerivedSchemaId(String storeName, String derivedSchemaStr)
  Look up derived schema id and its corresponding value schema id by given store name and derived schema.
getDerivedSchemas(String storeName)
getKeySchema(String storeName)
  Get key schema for the given store.
getLatestDerivedSchema(String storeName, int valueSchemaId)
  Get the most recent derived schema added to the given store and value schema id.
getReplicationMetadataSchema(String storeName, int valueSchemaId, int replicationMetadataVersionId)
getReplicationMetadataSchemas(String storeName)
getSupersetOrLatestValueSchema(String storeName)
  Get the most recent value schema or superset value schema if one exists.
getSupersetSchema(String storeName)
  Get the superset value schema for a given store.
getValueSchema(String storeName, int valueSchemaId)
  Get value schema for the given store and value schema id.
int getValueSchemaId(String storeName, String valueSchemaStr)
  Return the schema ID of any schema for the store that has the same parsing canonical form as the schema provided.
getValueSchemas(String storeName)
  Get all the value schemas for the given store.
boolean hasValueSchema(String storeName, int id)
  Check whether the specified schema id is valid or not.
void refresh()
boolean startInner()
void stopInner()
Methods inherited from class com.linkedin.venice.service.AbstractVeniceService
close, getName, isRunning, start, stop
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface com.linkedin.venice.meta.ReadOnlySchemaRepository
getLatestDerivedSchema
-
Constructor Details
-
StoreValueSchemasCacheService
public StoreValueSchemasCacheService(ReadOnlyStoreRepository storeRepository, ReadOnlySchemaRepository schemaRepository)
-
-
Method Details
-
startInner
- Specified by:
startInner in class AbstractVeniceService
- Returns:
true if the service is completely started, false if it is still starting asynchronously (in this case, it is the implementer's responsibility to set AbstractVeniceService.serviceState to AbstractVeniceService.ServiceState.STARTED upon completion of the async work).
- Throws:
Exception
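The return-value contract above can be illustrated with a simplified sketch. SketchService below is a hypothetical stand-in written for this example and is not the actual AbstractVeniceService; it only models the rule that a true return means the service is started, while a false return leaves it to the implementer to mark the service as started once its asynchronous work completes.

```java
import java.util.concurrent.CountDownLatch;

/** Hypothetical, simplified stand-in for a service base class with the contract described above. */
abstract class SketchService {
  enum State { STOPPED, STARTING, STARTED }

  protected volatile State state = State.STOPPED;

  final void start() throws Exception {
    state = State.STARTING;
    if (startInner()) {
      // Synchronous start: the base class can mark the service as started.
      state = State.STARTED;
    }
    // Otherwise the implementer sets STARTED when its async work is done.
  }

  State getState() {
    return state;
  }

  /** Returns true if fully started, false if still starting asynchronously. */
  abstract boolean startInner() throws Exception;
}

/** Finishes all of its start-up work inside startInner(). */
final class SyncService extends SketchService {
  @Override
  boolean startInner() {
    return true;
  }
}

/** Hands start-up work to another thread and reports completion itself. */
final class AsyncService extends SketchService {
  final CountDownLatch started = new CountDownLatch(1);

  @Override
  boolean startInner() {
    Thread worker = new Thread(() -> {
      state = State.STARTED; // implementer's responsibility in the async case
      started.countDown();
    }, "async-start");
    worker.setDaemon(true);
    worker.start();
    return false;
  }
}
```

A caller of the asynchronous variant cannot assume the service is usable right after start() returns; in this sketch it would wait on the started latch first.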
-
stopInner
- Specified by:
stopInner in class AbstractVeniceService
- Throws:
Exception
-
getValueSchema
Description copied from interface: ReadOnlySchemaRepository
Get value schema for the given store and value schema id.
- Specified by:
getValueSchema in interface ReadOnlySchemaRepository
-
getSupersetOrLatestValueSchema
Description copied from interface: ReadOnlySchemaRepository
Get the most recent value schema or superset value schema if one exists.
- Specified by:
getSupersetOrLatestValueSchema in interface ReadOnlySchemaRepository
-
getSupersetSchema
Description copied from interface: ReadOnlySchemaRepository
Get the superset value schema for a given store. Each store has at most one active superset schema. Specifically, a store must have certain features enabled (e.g. read compute, write compute) to have a superset value schema, which evolves as new value schemas are added.
- Specified by:
getSupersetSchema in interface ReadOnlySchemaRepository
- Returns:
Superset value schema, or null if the store does not have any superset value schema.
-
getKeySchema
Description copied from interface: ReadOnlySchemaRepository
Get key schema for the given store.
- Specified by:
getKeySchema in interface ReadOnlySchemaRepository
-
hasValueSchema
Description copied from interface: ReadOnlySchemaRepository
Check whether the specified schema id is valid or not.
- Specified by:
hasValueSchema in interface ReadOnlySchemaRepository
-
getValueSchemaId
Description copied from interface: ReadOnlySchemaRepository
Return the schema ID of any schema for the store that has the same parsing canonical form as the schema provided.
- Specified by:
getValueSchemaId in interface ReadOnlySchemaRepository
-
getValueSchemas
Description copied from interface: ReadOnlySchemaRepository
Get all the value schemas for the given store.
- Specified by:
getValueSchemas in interface ReadOnlySchemaRepository
-
getDerivedSchemaId
Description copied from interface: ReadOnlySchemaRepository
Look up derived schema id and its corresponding value schema id by given store name and derived schema. This is likely used by clients that write to Venice.
- Specified by:
getDerivedSchemaId in interface ReadOnlySchemaRepository
- Returns:
a pair where the first value is the value schema id and the second value is the derived schema id
-
getDerivedSchema
public DerivedSchemaEntry getDerivedSchema(String storeName, int valueSchemaId, int derivedSchemaId)
- Specified by:
getDerivedSchema in interface ReadOnlySchemaRepository
-
getDerivedSchemas
- Specified by:
getDerivedSchemas in interface ReadOnlySchemaRepository
-
getLatestDerivedSchema
Description copied from interface: ReadOnlySchemaRepository
Get the most recent derived schema added to the given store and value schema id.
- Specified by:
getLatestDerivedSchema in interface ReadOnlySchemaRepository
-
getReplicationMetadataSchema
public RmdSchemaEntry getReplicationMetadataSchema(String storeName, int valueSchemaId, int replicationMetadataVersionId)
- Specified by:
getReplicationMetadataSchema in interface ReadOnlySchemaRepository
-
getReplicationMetadataSchemas
- Specified by:
getReplicationMetadataSchemas in interface ReadOnlySchemaRepository
-
refresh
public void refresh()
- Specified by:
refresh in interface VeniceResource
-
clear
public void clear()
- Specified by:
clear in interface VeniceResource
-