Class RocksDBStoragePartition
java.lang.Object
com.linkedin.davinci.store.AbstractStoragePartition
com.linkedin.davinci.store.rocksdb.RocksDBStoragePartition
- Direct Known Subclasses:
ReplicationMetadataRocksDBStoragePartition
RocksDBStoragePartition assumes that updates (inserts and deletes) happen sequentially.
If the batch push is bytewise-sorted by key, this class uses SstFileWriter to generate the SST files directly and ingests all of the generated SST files into the RocksDB database at the end of the push.
If the ingestion is unsorted, this class uses the regular RocksDB interface to support update operations.
-
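As an illustration of what "bytewise-sorted by key" means, the sketch below compares keys as unsigned bytes, which is how RocksDB's default bytewise comparator orders them, and checks that a batch is strictly increasing, as SstFileWriter expects. The class and method names are hypothetical and are not part of RocksDBStoragePartition.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not part of RocksDBStoragePartition. */
final class BytewiseOrderSketch {
  private BytewiseOrderSketch() {}

  /** Lexicographic comparison that treats each byte as unsigned; a shorter prefix sorts first. */
  static int compareBytewise(byte[] left, byte[] right) {
    return Arrays.compareUnsigned(left, right);
  }

  /** Returns true if every key is strictly greater than the key before it. */
  static boolean isStrictlyBytewiseSorted(List<byte[]> keys) {
    for (int i = 1; i < keys.size(); i++) {
      if (compareBytewise(keys.get(i - 1), keys.get(i)) >= 0) {
        return false;
      }
    }
    return true;
  }
}
```

Note that Java bytes are signed, so a plain signed comparison would place 0x80 before 0x7F; the unsigned comparison avoids that mismatch.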
Field Summary
Modifier and Type | Field | Description
protected final boolean | blobTransferEnabled |
protected final List<org.rocksdb.ColumnFamilyDescriptor> | columnFamilyDescriptors |
protected final List<org.rocksdb.ColumnFamilyHandle> | columnFamilyHandleList | Column Family is the concept in RocksDB to create isolation between different values for the same key.
protected final boolean | deferredWrite | Whether the input is sorted or not.
protected final int | partitionId |
protected static final org.rocksdb.ReadOptions | READ_OPTIONS_DEFAULT |
protected final ReentrantReadWriteLock | readCloseRWLock | Since all the modification functions are synchronized, we don't need any other synchronization for the update path to guard RocksDB closing behavior.
protected final boolean | readOnly | Whether the database is read only or not.
protected final boolean | readWriteLeaderForDefaultCF |
protected final boolean | readWriteLeaderForRMDCF |
protected final String | replicaId |
protected org.rocksdb.RocksDB | rocksDB |
protected final String | storeName |
protected final String | storeNameAndVersion |
protected final boolean | writeOnly |
protected final org.rocksdb.WriteOptions | writeOptions | Here RocksDB disables WAL, but relies on the 'flush', which will be invoked through sync() to avoid data loss during recovery.
-
Constructor Summary
Modifier | Constructor | Description
public | RocksDBStoragePartition(StoragePartitionConfig storagePartitionConfig, RocksDBStorageEngineFactory factory, String dbDir, RocksDBMemoryStats rocksDBMemoryStats, RocksDBThrottler rocksDbThrottler, RocksDBServerConfig rocksDBServerConfig, VeniceStoreVersionConfig storeConfig) |
protected | RocksDBStoragePartition(StoragePartitionConfig storagePartitionConfig, RocksDBStorageEngineFactory factory, String dbDir, RocksDBMemoryStats rocksDBMemoryStats, RocksDBThrottler rocksDbThrottler, RocksDBServerConfig rocksDBServerConfig, List<byte[]> columnFamilyNameList, VeniceStoreVersionConfig storeConfig) |
-
Method Summary
Modifier and Type | Method | Description
void | beginBatchWrite(Map<String, String> checkpointedInfo, Optional<Supplier<byte[]>> expectedChecksumSupplier) |
boolean | checkDatabaseIntegrity(Map<String, String> checkpointedInfo) | Checks whether the current state of the database is valid during the start of ingestion.
void | close() | Close the specific partition.
void | createSnapshot() | Creates a snapshot of the current state of the storage if the blob transfer feature is enabled via the store configuration.
void | delete(byte[] key) | Delete a key from the partition database.
void | deleteFilesInDirectory(String fullPath) |
void | drop() | Drop when it is not required anymore.
void | endBatchWrite() |
byte[] | get(byte[] key) | Get a value from the partition database.
 | get(byte[] key, ByteBuffer valueToBePopulated) |
byte[] | get(ByteBuffer keyBuffer) |
<K,V> V | get(K key) | Get a value from the partition database.
 | getApproximateMemoryUsageByType(Set<org.rocksdb.Cache> caches) |
protected Boolean | getBlobTransferEnabled() |
void | getByKeyPrefix(byte[] keyPrefix, BytesStreamingCallback callback) | Populate provided callback with key-value pairs from the partition database where the keys have provided prefix.
protected List<org.rocksdb.ColumnFamilyHandle> | getColumnFamilyHandleList() |
protected org.rocksdb.EnvOptions | getEnvOptions() |
protected org.rocksdb.Options | getOptions() |
long | getPartitionSizeInBytes() | Get the partition database size in bytes.
long | getRmdByteUsage() |
long | getRocksDBStatValue(String statName) |
protected org.rocksdb.Options | getStoreOptions(StoragePartitionConfig storagePartitionConfig, boolean isRMD) |
protected void | makeSureRocksDBIsStillOpen() |
List<byte[]> | multiGet(List<ByteBuffer> keys, List<ByteBuffer> values) |
void | put(byte[] key, byte[] value) | Puts a value into the partition database.
void | put(byte[] key, ByteBuffer valueBuffer) |
<K,V> void | put(K key, V value) |
void | reopen() | Reopen the underlying RocksDB database, and this operation will unload the data cached in memory.
 | sync() | Sync current database.
boolean | validateBatchIngestion() |
boolean | verifyConfig(StoragePartitionConfig partitionConfig) |
Methods inherited from class com.linkedin.davinci.store.AbstractStoragePartition
deleteWithReplicationMetadata, getPartitionId, getReplicationMetadata, putReplicationMetadata, putWithReplicationMetadata, putWithReplicationMetadata
-
Field Details
-
READ_OPTIONS_DEFAULT
protected static final org.rocksdb.ReadOptions READ_OPTIONS_DEFAULT -
writeOptions
protected final org.rocksdb.WriteOptions writeOptions
Here RocksDB disables WAL, but relies on the 'flush', which will be invoked through sync() to avoid data loss during recovery.
-
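To make the trade-off behind writeOptions concrete, here is a toy model of a store that writes without a write-ahead log and relies on an explicit flush. It makes no RocksDB calls; the class, its fields, and the crash() method are hypothetical stand-ins, and the idea that unflushed writes must be replayed from their source is an assumption rather than something this page states.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only: hypothetical names, no RocksDB calls. */
final class NoWalStoreSketch {
  // In-memory buffer: its contents disappear if the process crashes.
  private final Map<String, String> memtable = new HashMap<>();
  // Stands in for data that has been flushed to files on disk.
  private final Map<String, String> durable = new HashMap<>();

  /** Write without a WAL: the record only reaches memory. */
  void put(String key, String value) {
    memtable.put(key, value);
  }

  /** Flush: move everything buffered in memory to durable storage. */
  void sync() {
    durable.putAll(memtable);
    memtable.clear();
  }

  /** Simulated crash: anything that was not flushed is gone. */
  void crash() {
    memtable.clear();
  }

  /** Reads check the in-memory buffer first, then durable storage. */
  String get(String key) {
    String value = memtable.get(key);
    return value != null ? value : durable.get(key);
  }
}
```

In this model a write that precedes sync() survives crash(), while a write made after the last sync() does not, which is why the flush has to be tied to whatever checkpoint the caller records.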
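replicaId
protected final String replicaId
-
storeName
protected final String storeName
-
storeNameAndVersion
protected final String storeNameAndVersion
-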
blobTransferEnabled
protected final boolean blobTransferEnabled -
partitionId
protected final int partitionId -
readCloseRWLock
protected final ReentrantReadWriteLock readCloseRWLock
Since all the modification functions are synchronized, we don't need any other synchronization for the update path to guard RocksDB closing behavior. The following readCloseRWLock is only used to guard get(byte[]) since we don't want to synchronize get requests.
-
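The locking scheme described for readCloseRWLock can be sketched as follows. This is a minimal illustration under assumptions: the class name is hypothetical, and a ConcurrentHashMap stands in for the RocksDB handle, which is itself safe for concurrent reads and writes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Sketch with hypothetical names; a ConcurrentHashMap stands in for the RocksDB handle. */
final class ReadCloseLockSketch {
  private final ReentrantReadWriteLock readCloseRWLock = new ReentrantReadWriteLock();
  private Map<String, String> db = new ConcurrentHashMap<>(); // set to null once closed

  /** Update path: serialized by the object monitor, so it cannot race with close(). */
  synchronized void put(String key, String value) {
    ensureOpen();
    db.put(key, value);
  }

  /** Read path: not synchronized; many readers can hold the read lock at the same time. */
  String get(String key) {
    readCloseRWLock.readLock().lock();
    try {
      ensureOpen();
      return db.get(key);
    } finally {
      readCloseRWLock.readLock().unlock();
    }
  }

  /** Close: the monitor excludes writers, and the write lock waits for in-flight readers. */
  synchronized void close() {
    readCloseRWLock.writeLock().lock();
    try {
      db = null;
    } finally {
      readCloseRWLock.writeLock().unlock();
    }
  }

  private void ensureOpen() {
    if (db == null) {
      throw new IllegalStateException("store is closed");
    }
  }
}
```

The design choice is that reads never contend with each other or with writes; they only wait while a close is in progress.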
rocksDB
protected org.rocksdb.RocksDB rocksDB -
deferredWrite
protected final boolean deferredWrite
Whether the input is sorted or not.
deferredWrite = sortedInput => ingested via batch push which is sorted in VPJ, can use RocksDBSstFileWriter to ingest the input data to RocksDB
!deferredWrite = !sortedInput => cannot use RocksDBSstFileWriter for ingestion
-
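The two write paths selected by deferredWrite can be pictured with the toy model below. It is a sketch under assumptions, not the class's implementation: all names are hypothetical, String keys stand in for byte[] keys (so the ordering check uses String comparison rather than bytewise comparison), and a list stands in for an SST file being written.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model with hypothetical names; Strings stand in for byte[] keys and SST files. */
final class DeferredWriteSketch {
  private final boolean deferredWrite;
  // Stands in for the live database contents.
  private final Map<String, String> database = new TreeMap<>();
  // Stands in for an SST file that is still being written.
  private final List<Map.Entry<String, String>> pendingFile = new ArrayList<>();
  private String lastKey;

  DeferredWriteSketch(boolean deferredWrite) {
    this.deferredWrite = deferredWrite;
  }

  void put(String key, String value) {
    if (!deferredWrite) {
      database.put(key, value); // regular write path: any key order is accepted
      return;
    }
    if (lastKey != null && key.compareTo(lastKey) <= 0) {
      throw new IllegalArgumentException("keys must arrive in increasing order: " + key);
    }
    pendingFile.add(Map.entry(key, value)); // staged only, not yet in the database
    lastKey = key;
  }

  /** End of push: ingest everything staged so far in one step. */
  void endBatchWrite() {
    for (Map.Entry<String, String> entry : pendingFile) {
      database.put(entry.getKey(), entry.getValue());
    }
    pendingFile.clear();
  }

  String get(String key) {
    return database.get(key);
  }
}
```

With deferredWrite set, staged records only become readable after endBatchWrite(); without it, each put is visible immediately.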
readOnly
protected final boolean readOnly
Whether the database is read only or not.
-
writeOnly
protected final boolean writeOnly -
readWriteLeaderForDefaultCF
protected final boolean readWriteLeaderForDefaultCF -
readWriteLeaderForRMDCF
protected final boolean readWriteLeaderForRMDCF -
columnFamilyHandleList
protected final List<org.rocksdb.ColumnFamilyHandle> columnFamilyHandleList
Column Family is the concept in RocksDB to create isolation between different values for the same key. All KVs are stored in the `DEFAULT` column family if no column family is specified. If we store replication metadata in RocksDB, we store it in a separate column family. We insert all the column family descriptors into columnFamilyDescriptors and pass it to RocksDB when opening the store, and RocksDB fills columnFamilyHandles with the handles that are used when we want to put/get/delete from different RocksDB column families.
-
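The descriptor-to-handle flow described for columnFamilyHandleList can be modelled in a few lines. The sketch below mirrors only the shape of that flow and does not use the RocksDB API; every name in it, including the "rmd" column family name, is a made-up placeholder.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model with hypothetical names; it mirrors the flow described above, not the RocksDB API. */
final class ColumnFamilySketch {
  /** One handle per column family; each handle owns an isolated key space. */
  static final class Handle {
    final String name;
    private final Map<String, String> data = new HashMap<>();

    Handle(String name) {
      this.name = name;
    }
  }

  /** "Opens" the store: creates one handle per descriptor, in descriptor order, and appends it to handlesOut. */
  static void open(List<String> descriptors, List<Handle> handlesOut) {
    for (String name : descriptors) {
      handlesOut.add(new Handle(name));
    }
  }

  static void put(Handle columnFamily, String key, String value) {
    columnFamily.data.put(key, value);
  }

  static String get(Handle columnFamily, String key) {
    return columnFamily.data.get(key);
  }

  public static void main(String[] args) {
    List<Handle> handles = new ArrayList<>();
    open(List.of("default", "rmd"), handles); // "rmd" is a placeholder name
    put(handles.get(0), "key", "value");
    put(handles.get(1), "key", "replication metadata");
    // The same key resolves to a different value in each column family.
    System.out.println(get(handles.get(0), "key"));
    System.out.println(get(handles.get(1), "key"));
  }
}
```

Because the handles come back in the same order as the descriptors, the caller can pick a column family by position when issuing put/get/delete.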
columnFamilyDescriptors
protected final List<org.rocksdb.ColumnFamilyDescriptor> columnFamilyDescriptors
-
-
Constructor Details
-
RocksDBStoragePartition
protected RocksDBStoragePartition(StoragePartitionConfig storagePartitionConfig, RocksDBStorageEngineFactory factory, String dbDir, RocksDBMemoryStats rocksDBMemoryStats, RocksDBThrottler rocksDbThrottler, RocksDBServerConfig rocksDBServerConfig, List<byte[]> columnFamilyNameList, VeniceStoreVersionConfig storeConfig) -
RocksDBStoragePartition
public RocksDBStoragePartition(StoragePartitionConfig storagePartitionConfig, RocksDBStorageEngineFactory factory, String dbDir, RocksDBMemoryStats rocksDBMemoryStats, RocksDBThrottler rocksDbThrottler, RocksDBServerConfig rocksDBServerConfig, VeniceStoreVersionConfig storeConfig)
-
-
Method Details
-
makeSureRocksDBIsStillOpen
protected void makeSureRocksDBIsStillOpen() -
getEnvOptions
protected org.rocksdb.EnvOptions getEnvOptions() -
getBlobTransferEnabled
-
getStoreOptions
protected org.rocksdb.Options getStoreOptions(StoragePartitionConfig storagePartitionConfig, boolean isRMD) -
getColumnFamilyHandleList
-
getRmdByteUsage
public long getRmdByteUsage()
- Overrides:
getRmdByteUsage in class AbstractStoragePartition
-
checkDatabaseIntegrity
Description copied from class: AbstractStoragePartition
Checks whether the current state of the database is valid during the start of ingestion.
- Overrides:
checkDatabaseIntegrity in class AbstractStoragePartition
-
beginBatchWrite
public void beginBatchWrite(Map<String, String> checkpointedInfo, Optional<Supplier<byte[]>> expectedChecksumSupplier)
- Overrides:
beginBatchWrite in class AbstractStoragePartition
-
endBatchWrite
public void endBatchWrite()
- Overrides:
endBatchWrite in class AbstractStoragePartition
-
createSnapshot
public void createSnapshot()
Description copied from class: AbstractStoragePartition
Creates a snapshot of the current state of the storage if the blob transfer feature is enabled via the store configuration.
- Specified by:
createSnapshot in class AbstractStoragePartition
-
put
public void put(byte[] key, byte[] value)
Description copied from class: AbstractStoragePartition
Puts a value into the partition database.
- Specified by:
put in class AbstractStoragePartition
-
put
- Specified by:
put in class AbstractStoragePartition
-
put
public <K,V> void put(K key, V value)
- Specified by:
put in class AbstractStoragePartition
-
get
public byte[] get(byte[] key)
Description copied from class: AbstractStoragePartition
Get a value from the partition database.
- Specified by:
get in class AbstractStoragePartition
- Parameters:
key - key to be retrieved
- Returns:
null if the key does not exist, byte[] value if it exists.
-
get
- Overrides:
get in class AbstractStoragePartition
-
get
public <K,V> V get(K key)
Description copied from class: AbstractStoragePartition
Get a value from the partition database.
- Specified by:
get in class AbstractStoragePartition
- Type Parameters:
K - the type for Key
V - the type for the return value
- Parameters:
key - key to be retrieved
- Returns:
null if the key does not exist, V value if it exists
-
get
- Specified by:
get in class AbstractStoragePartition
-
multiGet
-
multiGet
-
getByKeyPrefix
Description copied from class: AbstractStoragePartition
Populate the provided callback with key-value pairs from the partition database whose keys have the provided prefix. If the prefix is null, the callback will be populated with all key-value pairs from the partition database.
- Specified by:
getByKeyPrefix in class AbstractStoragePartition
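The prefix-scan contract of getByKeyPrefix can be illustrated with a sorted in-memory map. This is a sketch under assumptions: the class is hypothetical, a BiConsumer stands in for BytesStreamingCallback (whose interface is not shown on this page), and keys are ordered as unsigned bytes so that all keys sharing a prefix are adjacent.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.function.BiConsumer;

/** Sketch with hypothetical names; a sorted in-memory map stands in for the partition database. */
final class PrefixScanSketch {
  private static final Comparator<byte[]> BYTEWISE = Arrays::compareUnsigned;
  private final NavigableMap<byte[], byte[]> data = new TreeMap<>(BYTEWISE);

  void put(byte[] key, byte[] value) {
    data.put(key, value);
  }

  /** Feeds every entry whose key starts with keyPrefix to the callback; a null prefix matches everything. */
  void getByKeyPrefix(byte[] keyPrefix, BiConsumer<byte[], byte[]> callback) {
    if (keyPrefix == null) {
      data.forEach(callback);
      return;
    }
    // Seek to the first key >= prefix, then walk forward until the prefix no longer matches.
    for (Map.Entry<byte[], byte[]> entry : data.tailMap(keyPrefix, true).entrySet()) {
      if (!startsWith(entry.getKey(), keyPrefix)) {
        break;
      }
      callback.accept(entry.getKey(), entry.getValue());
    }
  }

  private static boolean startsWith(byte[] key, byte[] prefix) {
    return key.length >= prefix.length
        && Arrays.equals(key, 0, prefix.length, prefix, 0, prefix.length);
  }
}
```

The seek-then-stop loop relies on the sorted order: once a key no longer starts with the prefix, no later key can.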
-
validateBatchIngestion
public boolean validateBatchIngestion()
- Overrides:
validateBatchIngestion in class AbstractStoragePartition
-
delete
public void delete(byte[] key)
Description copied from class: AbstractStoragePartition
Delete a key from the partition database.
- Specified by:
delete in class AbstractStoragePartition
-
sync
Description copied from class: AbstractStoragePartition
Sync current database.
- Specified by:
sync in class AbstractStoragePartition
- Returns:
Database related info, which is required to be checkpointed.
-
deleteFilesInDirectory
-
drop
public void drop()
Description copied from class: AbstractStoragePartition
Drop when it is not required anymore.
- Specified by:
drop in class AbstractStoragePartition
-
close
public void close()
Description copied from class: AbstractStoragePartition
Close the specific partition.
- Specified by:
close in class AbstractStoragePartition
-
reopen
public void reopen()
Reopen the underlying RocksDB database, and this operation will unload the data cached in memory.
- Overrides:
reopen in class AbstractStoragePartition
-
getRocksDBStatValue
-
getApproximateMemoryUsageByType
-
verifyConfig
- Specified by:
verifyConfig in class AbstractStoragePartition
- Parameters:
partitionConfig
- Returns:
-
getPartitionSizeInBytes
public long getPartitionSizeInBytes()
Description copied from class: AbstractStoragePartition
Get the partition database size in bytes.
- Specified by:
getPartitionSizeInBytes in class AbstractStoragePartition
- Returns:
partition database size
-
getOptions
protected org.rocksdb.Options getOptions() -
getFullPathForTempSSTFileDir
-
getRocksDBSstFileWriter
-
getIterator
- Overrides:
getIterator in class AbstractStoragePartition
-