cosmos-sdk/store/cachekv
mergify[bot] 14cd943a2e
perf: store/cachekv: avoid a map lookup if unnecessary, clear maps fast, avoid keys sort (backport #10486) (#10848)
* perf: store/cachekv: avoid a map lookup if unnecessary, clear maps fast (#10486)

We can shave off some milliseconds, but also cut down some Megabytes of
RAM consumed by only requesting from the cache if needed, but also using
the map clearing idiom which is recognized by the compiler to make fast
code.

Noticed in profiles from Tharsis' Ethermint per https://github.com/tharsis/ethermint/issues/710

- Before
* Memory profiles
```shell
   19.50MB    19.50MB    134:	store.cache = make(map[string]*cValue)
   18.50MB    18.50MB    135:	store.deleted = make(map[string]struct{})
   15.50MB    15.50MB    136:	store.unsortedCache = make(map[string]struct{})
```

* CPU profiles
```go
         .          .    118:	// TODO: Consider allowing usage of Batch, which would allow the write to
         .          .    119:	// at least happen atomically.
     150ms      150ms    120:	for _, key := range keys {
     220ms      3.64s    121:		cacheValue := store.cache[key]
         .          .    122:
         .          .    123:		switch {
         .      250ms    124:		case store.isDeleted(key):
         .          .    125:			store.parent.Delete([]byte(key))
     210ms      210ms    126:		case cacheValue.value == nil:
         .          .    127:			// Skip, it already doesn't exist in parent.
         .          .    128:		default:
     240ms     27.94s    129:			store.parent.Set([]byte(key), cacheValue.value)
         .          .    130:		}
         .          .    131:	}

...

      10ms       60ms    134:	store.cache = make(map[string]*cValue)
         .       40ms    135:	store.deleted = make(map[string]struct{})
         .       50ms    136:	store.unsortedCache = make(map[string]struct{})
         .      110ms    137:	store.sortedCache = dbm.NewMemDB()
```

- After
* Memory profiles
```shell
         .          .    130:	// Clear the cache using the map clearing idiom
         .          .    131:	// and not allocating fresh objects.
         .          .    132:	// Please see https://bencher.orijtech.com/perfclinic/mapclearing/
         .          .    133:	for key := range store.cache {
         .          .    134:		delete(store.cache, key)
         .          .    135:	}
         .          .    136:	for key := range store.deleted {
         .          .    137:		delete(store.deleted, key)
         .          .    138:	}
         .          .    139:	for key := range store.unsortedCache {
         .          .    140:		delete(store.unsortedCache, key)
         .          .    141:	}
```

* CPU profiles
```shell
         .          .    111:	// TODO: Consider allowing usage of Batch, which would allow the write to
         .          .    112:	// at least happen atomically.
     110ms      110ms    113:	for _, key := range keys {
         .      210ms    114:		if store.isDeleted(key) {
         .          .    115:			// We use []byte(key) instead of conv.UnsafeStrToBytes because we cannot
         .          .    116:			// be sure if the underlying store might do a save with the byteslice or
         .          .    117:			// not. Once we get confirmation that .Delete is guaranteed not to
         .          .    118:			// save the byteslice, then we can assume only a read-only copy is sufficient.
         .          .    119:			store.parent.Delete([]byte(key))
         .          .    120:			continue
         .          .    121:		}
         .          .    122:
      50ms      2.45s    123:		cacheValue := store.cache[key]
     910ms      920ms    124:		if cacheValue.value != nil {
         .          .    125:			// It already exists in the parent, hence delete it.
     120ms     29.56s    126:			store.parent.Set([]byte(key), cacheValue.value)
         .          .    127:		}
         .          .    128:	}
         .          .    129:
         .          .    130:	// Clear the cache using the map clearing idiom
         .          .    131:	// and not allocating fresh objects.
         .          .    132:	// Please see https://bencher.orijtech.com/perfclinic/mapclearing/
         .      210ms    133:	for key := range store.cache {
         .          .    134:		delete(store.cache, key)
         .          .    135:	}
         .       10ms    136:	for key := range store.deleted {
         .          .    137:		delete(store.deleted, key)
         .          .    138:	}
         .      170ms    139:	for key := range store.unsortedCache {
         .          .    140:		delete(store.unsortedCache, key)
         .          .    141:	}
         .      260ms    142:	store.sortedCache = dbm.NewMemDB()
         .       10ms    143:}

```

Fixes #10487
Updates https://github.com/tharsis/ethermint/issues/710

(cherry picked from commit 5399e72f32)

# Conflicts:
#	CHANGELOG.md

* fix conflicts

Co-authored-by: Emmanuel T Odeke <emmanuel@orijtech.com>
Co-authored-by: Aleksandr Bezobchuk <aleks.bezobchuk@gmail.com>
2022-01-03 14:40:15 +01:00
..
bench_helper_test.go perf: Speedup cachekv iterator on large deletions & IBC v2 upgrade logic (backport #10741) (#10744) 2021-12-13 11:57:25 +01:00
memiterator.go perf: Speedup cachekv iterator on large deletions & IBC v2 upgrade logic (backport #10741) (#10744) 2021-12-13 11:57:25 +01:00
mergeiterator.go tendermint: update to rc3 (#6892) 2020-08-14 13:58:53 -04:00
search_test.go fix!: store/cachekv: reduce growth factor for iterator ranging using binary searches (backport #10024) (#10370) 2021-10-15 19:46:41 +02:00
store.go perf: store/cachekv: avoid a map lookup if unnecessary, clear maps fast, avoid keys sort (backport #10486) (#10848) 2022-01-03 14:40:15 +01:00
store_bench_test.go perf: Speedup cachekv iterator on large deletions & IBC v2 upgrade logic (backport #10741) (#10744) 2021-12-13 11:57:25 +01:00
store_test.go all: ensure b.ReportAllocs() in all the benchmarks (#8460) 2021-01-27 23:52:08 -08:00