cosmos-sdk/store
mergify[bot] 6079fe1888
fix!: store/cachekv: reduce growth factor for iterator ranging using binary searches (backport #10024) (#10370)
* fix!: store/cachekv: reduce growth factor for iterator ranging using binary searches (#10024)

This change builds on the observation that the previous approach,
calling dbm.IsKeyInDomain on every key to test membership in
[start, end), performed too many byte-slice comparisons. Instead, we
start off by sorting all the keys in store.unsortedCache, and then
apply a modified binary search to find the keys that fall within the
domain [start, end).
The procedure involves:
* iterating over all items to build a list of all keys -- O(n)
* invoking sort.Strings up front (we would eventually invoke
sort.Slice(unsorted, ...) anyway), which uses
Quicksort -- O(nlog(n)) or O(n^2) worst case
* invoking a modified binary search twice, O(log(n)) * 2 ~ O(log(n)),
to find the [start, end) range indices

for a total approximate complexity of:
Best case:  O(n) + O(n(log(n))) + O(log(n)) ~= O(nlog(n))
Worst case: O(n) + O(n^2) + O(log(n))       ~= O(n^2)

instead of previously:
* iterating over all the unsorted items and invoking dbm.IsKeyInDomain:
bytes.Compare ~ O(n) + O(n*s*e) where s -- len(start), e -- len(end)
for overall complexity of O(n*s*e)
* invoking sort.Slice(unsorted, ...) which uses
Quicksort -- O(nlog(n)) or O(n^2) worst case

for a total approximate complexity of:
Best case:  O(n) + O(n*s*e) + O(nlog(n)) ~= O(n*s*e) ~ O(n^2)
Worst case: O(n) + O(n*s*e) + O(n^2)     ~= O(n*s*e) ~ O(n^2)

Ordinarily we'd combine n*s*e into n*m, but the comparisons between
(start & key, end & key) are significant enough that it makes sense to
keep them as separate factors. The overall benchmark results vindicate
our choice of isolating the factors (n*s*e).
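
As an illustration only, the sort-then-binary-search procedure described
above can be sketched as follows. This is not the code from the change:
the names keysInDomain and rangeIndices are hypothetical, keys are
plain strings, and an empty start or end is assumed to mean "unbounded".

```go
package main

import (
	"fmt"
	"sort"
)

// rangeIndices returns the half-open index range [lo, hi) of the entries in
// sortedKeys that fall inside the key domain [start, end).
// Assumption for this sketch: an empty start or end means "unbounded".
func rangeIndices(sortedKeys []string, start, end string) (lo, hi int) {
	lo = 0
	if start != "" {
		// First index whose key is >= start.
		lo = sort.SearchStrings(sortedKeys, start)
	}
	hi = len(sortedKeys)
	if end != "" {
		// First index whose key is >= end, so end itself is excluded.
		hi = sort.SearchStrings(sortedKeys, end)
	}
	if hi < lo {
		hi = lo // empty domain, for example when end < start
	}
	return lo, hi
}

// keysInDomain collects the keys (O(n)), sorts them (O(n log n)), and then
// uses two binary searches (O(log n)) to slice out the [start, end) range.
func keysInDomain(unsorted map[string]struct{}, start, end string) []string {
	keys := make([]string, 0, len(unsorted))
	for k := range unsorted {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	lo, hi := rangeIndices(keys, start, end)
	return keys[lo:hi]
}

func main() {
	cache := map[string]struct{}{"d": {}, "a": {}, "c": {}, "b": {}, "e": {}}
	fmt.Println(keysInDomain(cache, "b", "e")) // [b c d]
}
```

After the sort, no per-key comparison against start and end is needed:
only the two boundary indices are searched for.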

The benchmarks show that as the number of keys to iterate grows, the
new code scales more gracefully, with somewhat linear growth. Notice,
for CacheKVStoreIterator*, when we go from:
* 1,000 to 10,000 keys: 120us->1,600us (13X) old vs 95us->900us (9.47X) new
* 50,000 to 100,000 keys: 19ms->100ms (5.3X) old vs 5.5ms->17ms (3X) new

```shell
name                          old time/op   new time/op   delta
GetValidator-8                5.8ms ± 2%    4.7ms ± 1%    -17.69%  (p=0.000 n=10+10)
OneBankSendTxPerBlock-8       3.2ms ± 2%    2.8ms ± 1%    -10.80%  (p=0.000 n=7+10)
OneBankMultiSendTxPerBlock-8  3.1ms ± 3%    2.9ms ± 2%    -8.36%   (p=0.000 n=10+10)
AccountMapperSetAccount-8     8.6µs ± 1%    7.8µs ± 1%    -9.74%   (p=0.000 n=10+10)
CacheKVStoreIterator500-8     64µs ± 6%     51µs ± 6%     -19.22%  (p=0.000 n=10+9)
CacheKVStoreIterator1000-8    0.12ms ± 4%   95µs ± 4%     -19.55%  (p=0.000 n=10+10)
CacheKVStoreIterator10000-8   1.6ms ± 4%    0.90ms ± 1%   -42.11%  (p=0.000 n=10+10)
CacheKVStoreIterator50000-8   19ms ± 5%     5.5ms ± 1%    -71.35%  (p=0.000 n=10+10)
CacheKVStoreIterator100000-8  0.10s ± 23%   17ms ± 7%     -83.44%  (p=0.000 n=10+10)
CacheKVStoreGetNoKeyFound-8   1.3µs ± 6%    0.90µs ± 3%   -31.19%  (p=0.000 n=9+9)
CacheKVStoreGetKeyFound-8     0.66µs ± 6%   0.56µs ± 2%   -14.81%  (p=0.000 n=10+9)

name                          old alloc/op  new alloc/op  delta
BlockProvision-8              0.11kB ± 0%   0.10kB ± 0%   -7.14%   (p=0.000 n=10+10)
CacheKVStoreIterator50000-8   0.89MB ± 6%   0.53MB ± 1%   -40.85%  (p=0.000 n=10+10)
CacheKVStoreIterator100000-8  6.3MB ± 23%   1.6MB ± 6%    -74.17%  (p=0.000 n=10+10)
CacheKVStoreGetNoKeyFound-8   0.26kB ± 0%   0.23kB ± 1%   -11.53%  (p=0.000 n=10+8)

name                          old allocs/op new allocs/op delta
AccountMapperSetAccount-8     42 ± 0%       38 ± 0%       -9.52%   (p=0.000 n=10+10)
BlockProvision-8              6.0 ± 0%      5.0 ± 0%      -16.67%  (p=0.000 n=10+10)
CacheKVStoreIterator1000-8    14 ± 0%       13 ± 0%       -7.14%   (p=0.002 n=8+10)
CacheKVStoreIterator10000-8   0.15k ± 2%    76 ± 1%       -49.00%  (p=0.000 n=7+10)
CacheKVStoreIterator50000-8   8.9k ± 11%    2.0k ± 2%     -77.60%  (p=0.000 n=10+10)
CacheKVStoreIterator100000-8  0.10M ± 26%   13k ± 12%     -86.89%  (p=0.000 n=10+10)
CacheKVStoreGetNoKeyFound-8   5.0 ± 0%      4.0 ± 0%      -20.00%  (p=0.000 n=10+10)
```

Note: Purposefully using a commit off master that doesn't
include the buggy code that caused x/bank.BenchmarkOneBank* to fail
per issue https://github.com/cosmos/cosmos-sdk/issues/10023

Updates #9876

/cc @cuonglm @kirbyquerby

(cherry picked from commit 3c85944061)

# Conflicts:
#	CHANGELOG.md

* fix conflict

Co-authored-by: Emmanuel T Odeke <emmanuel@orijtech.com>
Co-authored-by: marbar3778 <marbar3778@yahoo.com>
Co-authored-by: Robert Zaremba <robert@zaremba.ch>
2021-10-15 19:46:41 +02:00
cache all: ensure b.ReportAllocs() in all the benchmarks (#8460) 2021-01-27 23:52:08 -08:00
cachekv fix!: store/cachekv: reduce growth factor for iterator ranging using binary searches (backport #10024) (#10370) 2021-10-15 19:46:41 +02:00
cachemulti fix: add missing nil check in store.GetStore (#9354) 2021-05-19 08:17:46 +00:00
dbadapter ADR-038 Part 1: WriteListener, listen.KVStore, MultiStore and KVStore updates (#8551) 2021-03-30 16:13:51 -04:00
gaskv perf: Remove telemetry from wrappings of store (backport #10077) (#10170) 2021-09-17 02:07:52 +02:00
iavl ADR-038 Part 1: WriteListener, listen.KVStore, MultiStore and KVStore updates (#8551) 2021-03-30 16:13:51 -04:00
internal store/internal: validate keys before calling ProofsFromMap (#9235) 2021-05-02 15:53:59 -07:00
listenkv codec: Rename codec and marshaler interfaces (#9226) 2021-04-29 10:46:22 +00:00
mem ADR-038 Part 1: WriteListener, listen.KVStore, MultiStore and KVStore updates (#8551) 2021-03-30 16:13:51 -04:00
prefix ADR-038 Part 1: WriteListener, listen.KVStore, MultiStore and KVStore updates (#8551) 2021-03-30 16:13:51 -04:00
rootmulti fix: removed potential sources of non-determinism in upgrades (backport #10189) (#10253) 2021-09-29 11:05:51 +02:00
tracekv ADR-038 Part 1: WriteListener, listen.KVStore, MultiStore and KVStore updates (#8551) 2021-03-30 16:13:51 -04:00
transient Merge PR #7265: Tendermint Block Pruning 2020-09-14 10:12:49 -04:00
types refactor: update default pruning strategy (#9859) (#9863) 2021-08-06 11:54:51 -04:00
README.md chore: add markdownlint to lint commands (#9353) 2021-05-27 15:31:04 +00:00
firstlast.go types: add kv type (#6897) 2020-07-30 14:53:02 +00:00
reexport.go Merge PR #6475: Pruning Refactor 2020-06-22 16:31:33 -04:00
store.go Merge PR #6475: Pruning Refactor 2020-06-22 16:31:33 -04:00

README.md

Store

CacheKV

cachekv.Store is a wrapper KVStore that provides buffered writes and cached reads over the underlying KVStore.

type Store struct {
    cache map[string]cValue
    parent types.KVStore
}

Get

Store.Get() first checks Store.cache for a cached value associated with the key. If one exists, the function returns it. If not, the function calls Store.parent.Get(), stores the key-value pair in Store.cache, and returns the value.

Set

Store.Set() stores the key-value pair in Store.cache. cValue has a field dirty bool that indicates whether the cached value differs from the underlying value. When Store.Set() caches a new pair, cValue.dirty is set to true so that the pair is written to the underlying store when Store.Write() is called.
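
The Get, Set, and Write behaviour described above can be sketched roughly as follows. This is a simplified illustration, not the actual cachekv code: kvStore, mapStore, and cacheStore are hypothetical stand-ins, keys are plain strings, and deletions, locking, and write ordering are left out.

```go
package main

import "fmt"

// kvStore is a hypothetical, trimmed-down stand-in for types.KVStore.
type kvStore interface {
	Get(key string) []byte
	Set(key string, value []byte)
}

// mapStore is a toy parent store backed by a map.
type mapStore map[string][]byte

func (m mapStore) Get(key string) []byte        { return m[key] }
func (m mapStore) Set(key string, value []byte) { m[key] = value }

// cValue mirrors the cached entry described above: a value plus a dirty flag.
type cValue struct {
	value []byte
	dirty bool
}

type cacheStore struct {
	cache  map[string]cValue
	parent kvStore
}

func newCacheStore(parent kvStore) *cacheStore {
	return &cacheStore{cache: map[string]cValue{}, parent: parent}
}

// Get serves from the cache when possible; otherwise it reads through to the
// parent and remembers the result as a clean (non-dirty) entry.
func (s *cacheStore) Get(key string) []byte {
	if cv, ok := s.cache[key]; ok {
		return cv.value
	}
	v := s.parent.Get(key)
	s.cache[key] = cValue{value: v, dirty: false}
	return v
}

// Set only touches the cache and marks the entry dirty.
func (s *cacheStore) Set(key string, value []byte) {
	s.cache[key] = cValue{value: value, dirty: true}
}

// Write flushes dirty entries to the parent and resets the cache.
func (s *cacheStore) Write() {
	for k, cv := range s.cache {
		if cv.dirty {
			s.parent.Set(k, cv.value)
		}
	}
	s.cache = map[string]cValue{}
}

func main() {
	parent := mapStore{"a": []byte("1")}
	store := newCacheStore(parent)
	store.Set("b", []byte("2"))
	fmt.Println(string(store.Get("a")), string(store.Get("b")), len(parent)) // 1 2 1
	store.Write()
	fmt.Println(len(parent), string(parent["b"])) // 2 2
}
```

Until Write() is called, the parent never sees "b", which is what makes the wrapper usable for speculative execution that may be discarded.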

Iterator

Store.Iterator() has to traverse both the cached items and the original items. In Store.iterator(), an iterator is generated for each of them, and the two are merged. memIterator is essentially a slice of KVPairs, used for the cached items. mergeIterator is a combination of two iterators, where traversal happens in key order across both.
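
A minimal sketch of the ordered merge, using two key-sorted slices instead of real iterators. The names kvPair and mergeSorted are hypothetical, and the rule that the cached entry shadows the parent entry on equal keys is an assumption of this sketch; handling of deleted entries is omitted.

```go
package main

import "fmt"

// kvPair is a hypothetical key-value pair for this sketch.
type kvPair struct {
	Key, Value string
}

// mergeSorted walks two key-sorted slices in lockstep and returns one
// key-sorted slice. Assumption: when both sides hold the same key, the
// cached entry wins and the parent entry is skipped.
func mergeSorted(cache, parent []kvPair) []kvPair {
	out := make([]kvPair, 0, len(cache)+len(parent))
	i, j := 0, 0
	for i < len(cache) && j < len(parent) {
		switch {
		case cache[i].Key < parent[j].Key:
			out = append(out, cache[i])
			i++
		case cache[i].Key > parent[j].Key:
			out = append(out, parent[j])
			j++
		default: // same key on both sides
			out = append(out, cache[i])
			i++
			j++
		}
	}
	// At most one of the two tails is non-empty.
	out = append(out, cache[i:]...)
	out = append(out, parent[j:]...)
	return out
}

func main() {
	cache := []kvPair{{"a", "new"}, {"c", "C"}}
	parent := []kvPair{{"a", "A"}, {"b", "B"}}
	fmt.Println(mergeSorted(cache, parent)) // [{a new} {b B} {c C}]
}
```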

CacheMulti

cachemulti.Store is a wrapper MultiStore that provides buffered writes and cached reads over the underlying MultiStore.

type Store struct {
    db types.CacheKVStore
    stores map[types.StoreKey] types.CacheWrap
}

cachemulti.Store branches all substores in its constructor and holds them in Store.stores. Store.GetKVStore() returns the store from Store.stores, and Store.Write() recursively calls CacheWrap.Write() on the substores.

DBAdapter

dbadapter.Store is an adapter for dbm.DB that makes it fulfill the KVStore interface.

type Store struct {
    dbm.DB
}

dbadapter.Store embeds dbm.DB, so most of the KVStore interface functions are already implemented. The remaining functions (mostly miscellaneous) are implemented manually.

IAVL

iavl.Store is a base-layer self-balancing merkle tree. It guarantees that:

  1. Get & set operations are O(log n), where n is the number of elements in the tree
  2. Iteration efficiently returns the sorted elements within the range
  3. Each tree version is immutable and can be retrieved even after a commit (depending on the pruning settings)

The specification and implementation of the IAVL tree can be found at https://github.com/tendermint/iavl.

GasKV

gaskv.Store is a wrapper KVStore that adds gas consumption on top of the underlying KVStore.

type Store struct {
    gasMeter types.GasMeter
    gasConfig types.GasConfig
    parent types.KVStore
}

Whenever a KVStore method is called, gaskv.Store automatically consumes the appropriate amount of gas according to Store.gasConfig.
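
A rough sketch of how such a wrapper might charge gas, assuming a flat cost plus a per-byte cost for reads and writes. The types gasMeter, gasConfig, and gasStore, the field names, and the cost values are illustrative assumptions and not the SDK's definitions; the parent store is a plain map.

```go
package main

import "fmt"

// gasConfig holds hypothetical per-operation costs for this sketch.
type gasConfig struct {
	ReadCostFlat, ReadCostPerByte   uint64
	WriteCostFlat, WriteCostPerByte uint64
}

// gasMeter is a toy meter that just accumulates consumed gas.
type gasMeter struct{ consumed uint64 }

// ConsumeGas adds amount to the running total; descriptor is only a label.
func (m *gasMeter) ConsumeGas(amount uint64, descriptor string) {
	m.consumed += amount
}

type gasStore struct {
	meter  *gasMeter
	config gasConfig
	parent map[string][]byte
}

// Get charges a flat read cost, then a per-byte cost for the value read.
func (s *gasStore) Get(key string) []byte {
	s.meter.ConsumeGas(s.config.ReadCostFlat, "ReadFlat")
	v := s.parent[key]
	s.meter.ConsumeGas(s.config.ReadCostPerByte*uint64(len(v)), "ReadPerByte")
	return v
}

// Set charges a flat write cost plus a per-byte cost before writing.
func (s *gasStore) Set(key string, value []byte) {
	s.meter.ConsumeGas(s.config.WriteCostFlat, "WriteFlat")
	s.meter.ConsumeGas(s.config.WriteCostPerByte*uint64(len(value)), "WritePerByte")
	s.parent[key] = value
}

func main() {
	meter := &gasMeter{}
	store := &gasStore{
		meter:  meter,
		config: gasConfig{ReadCostFlat: 10, ReadCostPerByte: 1, WriteCostFlat: 20, WriteCostPerByte: 2},
		parent: map[string][]byte{},
	}
	store.Set("k", []byte("abc")) // 20 + 2*3 = 26
	store.Get("k")                // 10 + 1*3 = 13
	fmt.Println(meter.consumed)   // 39
}
```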

Prefix

prefix.Store is a wrapper KVStore that automatically prefixes keys before passing them to the underlying KVStore.

type Store struct {
    parent types.KVStore
    prefix []byte
}

When Store.{Get, Set}() is called, the store forwards the call to its parent, with the key prefixed with the Store.prefix.

When Store.Iterator() is called, it does not simply prepend Store.prefix to the range bounds, because that does not work as intended: some of the traversed elements would not start with the prefix.
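
One way to see the problem is an iteration with an unbounded end: passing the bound through unchanged would let the parent iterator run past the prefixed namespace. The sketch below shows one possible fix, computing the smallest key greater than every key with the prefix. The helpers prefixEnd and iterRange are hypothetical names for this illustration, and stripping the prefix from returned keys is not shown.

```go
package main

import "fmt"

// prefixEnd returns the smallest key that is greater than every key starting
// with prefix, or nil if no such key exists (empty prefix or all 0xFF bytes).
func prefixEnd(prefix []byte) []byte {
	end := append([]byte(nil), prefix...) // copy, so the caller's slice is untouched
	for i := len(end) - 1; i >= 0; i-- {
		if end[i] < 0xFF {
			end[i]++
			return end[:i+1]
		}
		// end[i] is 0xFF: drop it and carry into the previous byte.
	}
	return nil
}

// iterRange maps a caller's [start, end) range inside the prefixed namespace
// to the range to request from the parent store. A nil end means "until the
// end of the namespace", not "until the end of the parent store".
func iterRange(prefix, start, end []byte) (pStart, pEnd []byte) {
	pStart = append(append([]byte(nil), prefix...), start...)
	if end == nil {
		pEnd = prefixEnd(prefix)
	} else {
		pEnd = append(append([]byte(nil), prefix...), end...)
	}
	return pStart, pEnd
}

func main() {
	pStart, pEnd := iterRange([]byte("ab"), nil, nil)
	fmt.Printf("%q %q\n", pStart, pEnd) // "ab" "ac"

	pStart, pEnd = iterRange([]byte("ab"), []byte("x"), []byte("z"))
	fmt.Printf("%q %q\n", pStart, pEnd) // "abx" "abz"
}
```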

RootMulti

rootmulti.Store is a base-layer MultiStore on which multiple KVStores can be mounted and then retrieved via object-capability keys. The keys are memory addresses, so it is impossible to forge a key unless an object is a valid owner (or receiver) of the key, according to object-capability principles.
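
The idea of using memory addresses as keys can be illustrated with Go pointers as map keys. The types storeKey and multiStore below are hypothetical and much simpler than the real ones; in particular, this sketch returns nil for an unknown key.

```go
package main

import "fmt"

// storeKey is a hypothetical capability key. Only its address matters for
// lookups, not its name.
type storeKey struct{ name string }

func newStoreKey(name string) *storeKey { return &storeKey{name: name} }

// multiStore maps key pointers to toy substores.
type multiStore struct {
	stores map[*storeKey]map[string][]byte
}

// Mount registers an empty substore under the given key pointer.
func (ms *multiStore) Mount(key *storeKey) {
	ms.stores[key] = map[string][]byte{}
}

// GetKVStore returns the substore for key, or nil if key was never mounted.
func (ms *multiStore) GetKVStore(key *storeKey) map[string][]byte {
	return ms.stores[key]
}

func main() {
	ms := &multiStore{stores: map[*storeKey]map[string][]byte{}}

	bankKey := newStoreKey("bank")
	ms.Mount(bankKey)

	// A second key with the same name is a different pointer, so it does not
	// grant access to the mounted store.
	forged := newStoreKey("bank")

	fmt.Println(ms.GetKVStore(bankKey) != nil, ms.GetKVStore(forged) != nil) // true false
}
```

Whoever holds bankKey can pass it on explicitly, but code that was never handed the pointer cannot reconstruct it from the name alone.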

TraceKV

tracekv.Store is a wrapper KVStore that traces the operations performed on the underlying KVStore.

type Store struct {
    parent types.KVStore
    writer io.Writer
    context types.TraceContext
}

Whenever a KVStore method is called, tracekv.Store automatically logs a traceOperation to Store.writer.

type traceOperation struct {
    Operation operation
    Key string
    Value string
    Metadata map[string]interface{}
}

traceOperation.Metadata is filled with Store.context when it is not nil. TraceContext is a map[string]interface{}.
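
A minimal sketch of the tracing wrapper, assuming one JSON object is written per operation. The JSON field tags, the operation names "read" and "write", the string-typed Operation field, the plain-text key and value encoding, and the memWriter helper are all assumptions made for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
)

// traceOperation follows the struct shown above; the JSON tags are assumed.
type traceOperation struct {
	Operation string                 `json:"operation"`
	Key       string                 `json:"key"`
	Value     string                 `json:"value"`
	Metadata  map[string]interface{} `json:"metadata"`
}

type traceStore struct {
	parent  map[string][]byte
	writer  io.Writer
	context map[string]interface{}
}

// trace writes one JSON line describing the operation to the store's writer.
func (s *traceStore) trace(op string, key, value []byte) {
	rec := traceOperation{Operation: op, Key: string(key), Value: string(value)}
	if s.context != nil {
		rec.Metadata = s.context
	}
	line, err := json.Marshal(rec)
	if err != nil {
		panic(err)
	}
	fmt.Fprintf(s.writer, "%s\n", line)
}

// Set logs the write and forwards it to the parent.
func (s *traceStore) Set(key, value []byte) {
	s.trace("write", key, value)
	s.parent[string(key)] = value
}

// Get reads from the parent and logs the key together with the value found.
func (s *traceStore) Get(key []byte) []byte {
	v := s.parent[string(key)]
	s.trace("read", key, v)
	return v
}

// memWriter is a tiny io.Writer that keeps everything written to it.
type memWriter struct{ data []byte }

func (w *memWriter) Write(p []byte) (int, error) {
	w.data = append(w.data, p...)
	return len(p), nil
}

func main() {
	w := &memWriter{}
	s := &traceStore{
		parent:  map[string][]byte{},
		writer:  w,
		context: map[string]interface{}{"blockHeight": 64},
	}
	s.Set([]byte("k"), []byte("v"))
	s.Get([]byte("k"))
	fmt.Print(string(w.data)) // two JSON lines, one per operation
}
```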

Transient

transient.Store is a base-layer KVStore which is automatically discarded at the end of the block.

type Store struct {
    dbadapter.Store
}

Store.Store is a dbadapter.Store backed by a dbm.NewMemDB(). All KVStore methods are reused. When Store.Commit() is called, a new dbadapter.Store is assigned, discarding the previous reference so that it can be garbage collected.
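
The reset-on-commit behaviour can be sketched as follows. The types memDB and transientStore are hypothetical stand-ins, with a plain map in place of the in-memory database.

```go
package main

import "fmt"

// memDB is a toy in-memory database for this sketch.
type memDB map[string][]byte

type transientStore struct {
	db memDB
}

func newTransientStore() *transientStore {
	return &transientStore{db: memDB{}}
}

// Set stores a value in the current in-memory database.
func (s *transientStore) Set(key string, value []byte) {
	s.db[key] = value
}

// Has reports whether key is present in the current in-memory database.
func (s *transientStore) Has(key string) bool {
	_, ok := s.db[key]
	return ok
}

// Commit replaces the database with a fresh one. The old map is no longer
// referenced by the store, so the garbage collector can reclaim it.
func (s *transientStore) Commit() {
	s.db = memDB{}
}

func main() {
	s := newTransientStore()
	s.Set("k", []byte("v"))
	before := s.Has("k")
	s.Commit()
	fmt.Println(before, s.Has("k")) // true false
}
```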