solana

Commit Graph

Author	SHA1	Message	Date
Yueh-Hsuan Chiang	f910f2d7eb	Add comment block for Orphans column family. (#28340 )	2022-10-16 23:47:02 -07:00
Yueh-Hsuan Chiang	6a96a4c2ee	Add comment block for ErasureMeta ledger column (#28356 )	2022-10-14 12:59:34 -07:00
Yueh-Hsuan Chiang	4539cb75fb	Add comment block for SlotMeta column family (#28339 )	2022-10-14 12:56:56 -07:00
Yueh-Hsuan Chiang	5df10173dd	Add comment block for BankHash ledger column (#28357 )	2022-10-13 11:41:58 -07:00
steviez	624f5cfcd5	Add rocksdb multi_get_bytes() method (#28244 )	2022-10-07 18:05:13 -04:00
Yueh-Hsuan Chiang	fb6abac4ca	Improve code comment for delete_range_cf (#28087 ) #### Summary of Changes Improve code comment for Blockstore::delete_range_cf esp. for the corner case where the from Slot and to Slot are the same.	2022-09-27 20:12:08 -07:00
Yueh-Hsuan Chiang	a76258f276	Improve code comments for ledgerstore columns. (#28054 ) ### Problem The documentation of each column family is missing ### Summary The goal is to create a comment block that will essentially include a high-level concept on what each column family is about and what are their key/value formats. This PR is the first cut that includes the key/value format of each column family. This should at least provide an easy pointer for readers to understand what this column family stores by searching its value type and how to access the data based on the key type.	2022-09-27 00:31:23 -07:00
behzad nouri	9a57c64f21	patches clippy errors from new rust nightly release (#27996 )	2022-09-22 22:23:03 +00:00
steviez	4e8e0cda7e	Remove extra data copy from several Rocks get() methods (#27693 ) Several of the get() methods return a deserialized object (as opposed to a Vec<u8>) by first getting a byte array out of Rocks, and then using bincode::deserialize() to get the underlying type. However, deserialize() only requires a u8 slice, not an owned Vec<u8>. So, we can use get_pinned_cf() to reference memory owned by Rocks and avoid an unnecessary copy.	2022-09-16 18:38:28 -05:00
Yueh-Hsuan Chiang	9831e4ddad	Remove daily rewrite/compaction of each ledger file (#27571 ) #### Problem Before #26651, our LedgerCleanupService needed RocksDB background compactions to reclaim ledger disk space via our custom CompactionFilter. However, since RocksDB's compaction isn't smart enough to know which file to pick, we relied on the 1-day compaction period so that each file would be forced to compact once a day, letting us reclaim ledger disk space in time. The downside is that each ledger file is rewritten once per day. #### Summary of Changes As #26651 makes LedgerCleanupService actively delete those files whose entire slot-range is older than both --limit-ledger-size and the current root, we can remove the 1-day compaction period and get rid of the daily ledger file rewrite. Results on mainnet-beta show that this PR reduces write-bytes-per-second by ~20% and read-bytes-per-second by ~50% on the ledger disk.	2022-09-16 13:12:55 -07:00
Yueh-Hsuan Chiang	ba3d9cd325	Add LedgerColumn::multi_get() (#26354 ) #### Problem Blockstore operations such as get_slots_since() issue multiple rocksdb::get() calls at once, which is not optimal for performance. #### Summary of Changes This PR adds LedgerColumn::multi_get() based on rocksdb::batched_multi_get(), the optimized version of multi_get() where get requests are processed in batch to minimize read I/O.	2022-09-12 15:01:22 -07:00
Yueh-Hsuan Chiang	ed00365101	Add ledger tool command print-file-metadata (#26790 ) #### Summary of Changes This PR adds a ledger tool subcommand print-file-metadata. ``` USAGE: solana-ledger-tool print-file-metadata [FLAGS] [OPTIONS] [SST_FILE_NAME] Prints the metadata of the specified ledger-store file. If no file name is specified, then it will print the metadata of all ledger files ```	2022-09-06 21:46:35 -07:00
Yueh-Hsuan Chiang	6d070cee08	Fix the boundary inconsistency between delete_file_in_range and delete_range (#27201 ) #### Problem RocksDB's delete_range applies to [from, to) while delete_file_in_range applies to [from, to] by default, and the rust-rocksdb api does not include the option to make delete_file_in_range apply to [from, to). Such inconsistency might cause `blockstore::run_purge` to produce an inconsistent result, as it invokes both delete_range and delete_file_in_range. #### Summary of Changes This PR makes all our purge / delete related functions inclusive on both the starting and ending slots.	2022-08-30 21:38:54 -07:00
steviez	14d9922105	Bump rust-rocksdb to 0.19.0 tag (#26949 )	2022-08-11 09:41:01 -05:00
Yueh-Hsuan Chiang	99ef2184cc	Delete files older than the lowest_cleanup_slot in LedgerCleanupService::cleanup_ledger (#26651 ) #### Problem LedgerCleanupService requires compactions to propagate & digest range-delete tombstones to eventually reclaim disk space. #### Summary of Changes This PR makes LedgerCleanupService::cleanup_ledger delete any file whose slot-range is older than the lowest_cleanup_slot. This allows us to reclaim disk space more often with fewer IOps. Experimental results on mainnet validators show that the PR can reduce ledger disk size by 33% to 40%.	2022-08-09 00:48:06 +08:00
Yueh-Hsuan Chiang	f284bba53b	Create const &strs for rocksdb perf write operation names (#26352 ) #### Summary of Changes Define PERF_METRIC_OP_NAME_PUT and PERF_METRIC_OP_NAME_WRITE_BATCH to replace repetitive / hard-coded operation names for report_rocksdb_write_perf.	2022-07-13 00:05:26 +08:00
Yueh-Hsuan Chiang	9985215bc8	Make report_rocksdb_read_perf() take an operation name. (#26351 ) #### Problem report_rocksdb_read_perf() always uses the hard-coded operation name "get". #### Summary of Changes As we will add a new read operation, multi_get(), report_rocksdb_read_perf() needs an input parameter for the operation name.	2022-07-12 07:18:49 +08:00
Yueh-Hsuan Chiang	8674c96a66	Make the default values of FIFO compaction consistent with validator args (#25778 ) #### Problem When FIFO compaction is used, the size ratio between data shred and coding shred is set to 1:1 based on the `--rocksdb_fifo_shred_storage_size` arg. However, BlockstoreRocksFifoOptions::default() uses a slightly optimized 5:4 ratio instead, and the default() function is only used in benchmarks. #### Summary of Changes This PR makes both the validator argument and BlockstoreRocksFifoOptions::default() use a 1:1 ratio between data and coding shred size.	2022-06-07 15:24:58 +08:00
Yueh-Hsuan Chiang	bcff88bf42	Use the new datapoint macro for RocksDB column family metrics (#25505 ) #### Summary of Changes Use the new datapoint macro that supports group-by for RocksDB column family metrics. By using the new macro, we can further remove large chunks of boilerplate code that try to work around the previous datapoint macro that does not support group-by.	2022-05-31 09:26:57 -07:00
Yueh-Hsuan Chiang	24634b6e25	Use the new datapoint macro that supports group-by for RocksDB read/write metrics. (#25392 ) #### Summary of Changes Use the new datapoint macro that supports group-by for RocksDB read/write perf metrics.	2022-05-26 22:17:29 -07:00
Yueh-Hsuan Chiang	5b67960c76	(Refactor) Move blockstore options related stuff to blockstore_options.rs (#25509 ) #### Problem blockstore_db.rs has a mutual dependency with blockstore_metrics.rs. #### Summary of Changes This PR removes the mutual dependency by moving the option-related stuff out from blockstore_db.rs to its new home, blockstore_options.rs. By doing this, we address the mutual dependency and also make the code cleaner.	2022-05-26 16:59:26 -07:00
Michael Vines	b05c7d91ed	Fix derive_partial_eq_without_eq clippy lint	2022-05-22 22:22:21 -07:00
Yueh-Hsuan Chiang	d3dc2db9fb	(LedgerStore) Rate-limit RocksDB perf sample by a minimum time interval (#25100 ) #### Problem The current RocksDB read/write perf metrics do not include the total operation nanos and thus we have to include all fields that might contribute to the total operation nanos. #### Summary of Changes This PR includes the total operation nanos in RocksDB's read/write perf and reduces the number of reported fields in its perf metric.	2022-05-21 16:42:33 -07:00
Jeff Biseda	8caf0aabd1	framework to preserve optimistic_slot in blockstore (#25362 )	2022-05-20 16:46:23 -07:00
Yueh-Hsuan Chiang	de2033f2f2	(LedgerStore) Rate-limit RocksDB perf sample by a minimum time interval (#25093 ) #### Problem When the number of RocksDB read/write operations spikes, its payload size might exceed the limit (413 Payload Too Large). #### Summary of Changes This PR rate-limits the perf sampling of RocksDB read/write operations by a minimum interval of one second, in addition to the existing sampling that is configurable via the hidden validator argument --rocksdb-perf-sample-interval.	2022-05-20 10:54:27 -07:00
Yueh-Hsuan Chiang	5625959f7e	(LedgerStore) Change perf_samples_counter from Arc<AtomicUsize> to AtomicUsize (#25043 ) #### Problem After #25042, each LedgerColumn has its own BlockstoreRocksDbWritePerfMetrics and BlockstoreRocksDbReadPerfMetrics instances. As it has total ownership, its member field does not need to use Arc. #### Summary of Changes Change perf_samples_counter from Arc<AtomicUsize> to AtomicUsize under BlockstoreRocksDbWritePerfMetrics and BlockstoreRocksDbReadPerfMetrics.	2022-05-16 11:31:07 -07:00
Yueh-Hsuan Chiang	b2dcda8980	(LedgerStore) Move metric sample counters out from LedgerColumnOptions (#25042 ) #### Problem LedgerColumnOptions contain two fields, perf_read_counter and perf_write_counter, that are not really options but internal counters. #### Summary of Changes This PR introduces BlockstoreRocksDbPerfSamplingStatus, a struct that holds internal status for RocksDB perf sampling and moves perf_read_counter and perf_write_counter out from LedgerColumnOptions.	2022-05-10 16:13:19 -07:00
Yueh-Hsuan Chiang	63bd0cdd5d	(LedgerStore) Move BlockstoreRocksDbColumnFamilyMetrics to blockstore_metric.rs (#24856 ) #### Problem blockstore_db.rs becomes bigger. #### Summary of Changes Move BlockstoreRocksDbColumnFamilyMetrics to blockstore_metric.rs out from blockstore_db.rs.	2022-05-03 14:46:59 -07:00
Yueh-Hsuan Chiang	0b9d04808f	(LedgerStore) Move trait ColumnMetrics and metric-macros to blockstore_metric.rs (#24855 ) #### Problem blockstore_db.rs becomes bigger. #### Summary of Changes Move trait ColumnMetrics and metric-macros to blockstore_metric.rs out from blockstore_db.rs.	2022-05-02 22:58:31 -07:00
Yueh-Hsuan Chiang	eca0eb9585	(LedgerStore) Move metric-related functions to blockstore_metric.rs (#24854 ) #### Problem blockstore_db.rs becomes bigger. #### Summary of Changes This PR creates blockstore_metric.rs and moves metric-related functions out from blockstore_db.rs.	2022-05-02 20:53:25 -07:00
steviez	428cf54c91	Change BlockStore TryPrimaryThenSecondary to just Secondary (#23391 )	2022-04-29 20:05:39 -05:00
Yueh-Hsuan Chiang	5245eb4229	(LedgerStore) Hidden validator argument for RocksDB perf samples (#24684 ) #### Summary of Changes This PR replaces the use of thread_rng in RocksDB perf metric samples by AtomicU32 with Ordering::Relaxed to improve the performance of determining whether to sample the current RocksDB's read/write perf metric.	2022-04-29 17:55:34 -07:00
Yueh-Hsuan Chiang	b56c091b37	(LedgerStore) Hidden validator argument for RocksDB perf samples (#24682 ) #### Problem Currently, the number of RocksDB perf samples is controlled by an env arg which is later handled using a lazy_static variable. However, there is a known performance overhead of using lazy_static as mentioned in https://github.com/solana-labs/solana/pull/6472. #### Summary of Changes Instead, this PR uses a hidden validator argument, --rocksdb-perf-sample-interval, for controlling how often RocksDB read/write performance sample is collected.	2022-04-29 15:28:50 -07:00
Yueh-Hsuan Chiang	27efcae16c	(LedgerStore) Convert Rocks from tuple to struct with named fields (#24761 ) #### Problem The RocksDB wrapper,`Rocks`, under blockstore_db is currently implemented as a tuple with unnamed fields. Accessing its fields requires syntax like `self.0` which limits readability. #### Summary of Changes This PR converts Rocks from tuple to struct so that it has more human-readable fields.	2022-04-28 21:32:48 -07:00
Yueh-Hsuan Chiang	077bc4f407	(LedgerStore) Change the default RocksDB perf sample rate to 1 / 1000. (#24234 )	2022-04-12 04:12:47 +00:00
Yueh-Hsuan Chiang	5a48ef72fd	(LedgerStore) Skip sampling check when ROCKSDB_PERF_CONTEXT_SAMPLES_IN_1K_DEFAULT = 0 (#24221 ) #### Problem Currently, even if SOLANA_METRICS_ROCKSDB_PERF_SAMPLES_IN_1K == 0, we are still doing the sampling check for every RocksDB read. ``` thread_rng().gen_range(0, METRIC_SAMPLES_1K) > *ROCKSDB_PERF_CONTEXT_SAMPLES_IN_1K ``` #### Summary of Changes This PR skips the sampling check when SOLANA_METRICS_ROCKSDB_PERF_SAMPLES_IN_1K is set to 0.	2022-04-11 20:39:46 -07:00
Yueh-Hsuan Chiang	1f136de294	(LedgerStore) Report perf metrics for RocksDB deletes (#24138 ) #### Summary of Changes This PR enables perf metrics reporting for RocksDB deletes. Samples are reported under "blockstore_rocksdb_write_perf" with op=delete. The sampling rate is still controlled by the env arg SOLANA_METRICS_ROCKSDB_PERF_SAMPLES_IN_1K, which defaults to 10 (meaning we report 10 in 1000 perf samples).	2022-04-08 00:18:05 -07:00
Yueh-Hsuan Chiang	b84521d47d	(LedgerStore) Report perf metrics for RocksDB write batch (#24061 ) #### Summary of Changes This PR enables perf metrics reporting for RocksDB write-batches. Samples are reported under "blockstore_rocksdb_write_perf" with op=write_batch. Its cf_name tag is set to "write_batch" as well, since each write-batch could include multiple column families. The sampling rate is still controlled by the env arg SOLANA_METRICS_ROCKSDB_PERF_SAMPLES_IN_1K, which defaults to 10 (meaning we report 10 in 1000 perf samples).	2022-04-08 00:17:51 -07:00
Yueh-Hsuan Chiang	206c3dd402	(LedgerStore) Enable RocksDB Perf metrics reporting for get_bytes and put_bytes (#24066 ) #### Summary of Changes Enable RocksDB Perf metrics reporting for get_bytes and put_bytes.	2022-04-07 00:24:10 -07:00
Yueh-Hsuan Chiang	4f0e887702	(LedgerStore) Report RocksDB perf metrics for Protobuf Columns (#24065 ) This PR enables the reporting of both RocksDB read and write perf metrics for ProtobufColumns, including TransactionStatus and Rewards.	2022-04-07 00:15:00 -07:00
Yueh-Hsuan Chiang	2d1f27ed8e	(LedgerStore) Perf Metric for RocksDB Writes (#23951 ) #### Summary of Changes This PR implements the reporting of RocksDB write perf metrics to blockstore_rocksdb_write_perf based on RocksDB's PerfContext. The default sample rate is 10 in 1000, and the env arg SOLANA_METRICS_ROCKSDB_PERF_SAMPLES_IN_1K can control the sample rate.	2022-04-06 12:12:38 -07:00
Yueh-Hsuan Chiang	24cc6c33de	(LedgerStore)(Refactor) Move metric reporting functions to a dedicated mod (#24060 ) Previously, the metric reporting functions were implemented under LedgerColumnMetric. However, some operations, such as write batch, are issued by functions inside Rocks. This PR moves the reporting functions to their own dedicated mod so that both LedgerColumn and Rocks can report column perf metrics.	2022-04-05 15:06:17 -07:00
Yueh-Hsuan Chiang	0b5ed87220	(LedgerStore) Enable performance sampling in column family get() (#23834 ) #### Summary of Changes This PR enables RocksDB read side performance metrics to report to blockstore_rocksdb_read_perf. The sampling rate is controlled by an env arg, `SOLANA_METRICS_ROCKSDB_PERF_SAMPLES_IN_1K`, which specifies the number of perf samples for every 1000 operations. The default value is set to 10, meaning we will report 10 out of 1000 (or 1/100) reads. The metrics are based on the RocksDB [PerfContext](https://github.com/facebook/rocksdb/blob/main/include/rocksdb/perf_context.h). It includes many useful metrics, such as block read time, cache hit rate, and time spent decompressing blocks.	2022-04-01 13:13:32 -07:00
Yueh-Hsuan Chiang	c83c95b56b	(LedgerStore) Create ColumnMetrics trait for CF metric reporting (#23763 ) This PR refactors column family-related metrics reporting. As metric reporting is done on a per-column-family basis, the PR creates the ColumnMetrics trait and moves the metric reporting logic into it. This refactoring will make future column metric reporting (such as read PerfContext) much cleaner.	2022-03-23 20:51:49 -07:00
Yueh-Hsuan Chiang	ae75b1a25f	(LedgerStore) Add compression type (#23578 ) This PR adds `--rocksdb-ledger-compression` as a hidden argument to the validator for specifying the compression algorithm for TransactionStatus. Available compression algorithms include `lz4`, `snappy`, `zlib`. The default value is `none`. Experimental results show that with lz4 compression, we can achieve ~37% size-reduction on the TransactionStatus column family, or ~8% size-reduction of the ledger store size.	2022-03-22 02:27:09 -07:00
Yueh-Hsuan Chiang	f999eef452	(LedgerStore) Rename BlockstoreAdvancedOptions to LedgerColumnOptions (#23764 ) This PR renames BlockstoreAdvancedOptions to LedgerColumnOptions, as we will pass-down this struct to LedgerColumn to allow it to perform metric reporting.	2022-03-18 11:13:35 -07:00
Yueh-Hsuan Chiang	86c695268e	(LedgerStore) Improve the function API of new_cf_descriptor (#23696 ) As we start adding more options into BlockstoreOptions, it's better to allow new_cf_descriptor to take the reference to BlockstoreOptions so that we can avoid future function API changes on new_cf_descriptor.	2022-03-16 11:47:49 -07:00
Yueh-Hsuan Chiang	1e20bd8f9a	(LedgerStore) Include storage type as a tag in RocksDB metric reporting (#23523 ) #### Summary of Changes This PR further enables the group-by operation on storage type in blockstore_rocksdb_cfs metrics. Such group-by allows us to compare the performance metrics between rocks-level and rocks-fifo. To make things extensible, this PR introduces BlockstoreAdvancedOptions and moves shred_storage_type into it. All fields in BlockstoreAdvancedOptions will support the group-by operation in blockstore_rocksdb_cfs. Dependency: #23580	2022-03-11 15:17:34 -08:00
steviez	58c0db9704	Cleanup several blockstore functions (#23390 ) * Rename excludes_from_compaction to should_exclude_from_compaction * Make subfunction to create all cf descriptors * Condense logic for when to disable compactions	2022-03-10 02:08:38 -06:00
Yueh-Hsuan Chiang	b8b7163b66	(Ledger Store) Report RocksDB Column Family Metrics (#22503 ) This PR enables blockstore to periodically report RocksDB column family properties. The reported properties are under blockstore_rocksdb_cfs, and the properties also support group by operation on cf_name.	2022-03-05 16:13:03 -08:00
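
Commits 6d070cee08 and fb6abac4ca above turn on the difference between a half-open range [from, to) and an inclusive range [from, to], including the corner case where the from Slot and to Slot are the same. The following is a minimal sketch of that difference using an in-memory set of slots rather than RocksDB. The names `delete_half_open` and `delete_inclusive` are hypothetical and are not taken from the Solana or rust-rocksdb code.

```rust
use std::collections::BTreeSet;

type Slot = u64;

/// Removes every slot in the half-open range [from, to).
/// This mirrors the boundary the commit message attributes to delete_range.
fn delete_half_open(slots: &mut BTreeSet<Slot>, from: Slot, to: Slot) {
    slots.retain(|s| !(from <= *s && *s < to));
}

/// Removes every slot in the inclusive range [from, to] by widening the
/// upper bound by one before calling the half-open primitive.
fn delete_inclusive(slots: &mut BTreeSet<Slot>, from: Slot, to: Slot) {
    match to.checked_add(1) {
        Some(end) => delete_half_open(slots, from, end),
        // `to` is already the largest Slot, so there is no exclusive end to
        // express; remove everything at or above `from` instead.
        None => slots.retain(|s| *s < from),
    }
}
```

With `from == to`, the half-open form removes nothing while the inclusive form removes exactly that one slot, which is why mixing the two conventions in one purge path can leave inconsistent results.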

104 Commits
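
Commits 5245eb4229 and 5a48ef72fd above describe replacing a per-operation thread_rng check with an atomic counter using Ordering::Relaxed, and skipping the sampling check entirely when sampling is set to 0. The following is a rough sketch of that idea, not the code from those commits. The `PerfSampler` type, its field names, and the choice of AtomicUsize are assumptions made for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical counter-based sampler: reports one sample every `interval`
/// operations. An interval of 0 disables sampling.
pub struct PerfSampler {
    interval: usize,
    counter: AtomicUsize,
}

impl PerfSampler {
    pub fn new(interval: usize) -> Self {
        Self {
            interval,
            counter: AtomicUsize::new(0),
        }
    }

    /// Returns true when the current operation should be sampled.
    pub fn should_sample(&self) -> bool {
        // Skip the counter update entirely when sampling is disabled.
        if self.interval == 0 {
            return false;
        }
        // fetch_add returns the previous value. Relaxed ordering is enough
        // here because the counter does not guard any other memory.
        let seen = self.counter.fetch_add(1, Ordering::Relaxed);
        seen % self.interval == self.interval - 1
    }
}
```

A shared counter keeps the per-operation cost to one atomic increment and makes the sample spacing deterministic, at the price of losing the randomness that a random-number check provides.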