* Node: Processor db write separation
* Handle additional update while writing to db
* Move the broadcasting of signed VAA to the worker
* Tweak signaturesToVaaFormat
* Eliminate map look up in HandleQuorum
* Remove unnecessary check for already submitted
* Use BadgerDB batch API to store VAAs
* Don't move broadcasting to worker
* Speed up processing our own observation
* Simplify handleMessage and broadcastSignature
* Code review rework
* Node: Reduce info logging
Change-Id: I1ad80304a59ccd50e675765ef1f648be02e0d7ce
* Node: Remove a couple more info logs
Change-Id: I7944446b73b140f4a8fbae21dee5baa9e9c5d9d0
We currently run the cleanup loop every 30 seconds, which means that
once 5 minutes have passed for an observation without quorum we will
send out re-observation requests to the p2p network every 30 seconds.
This is excessive, so limit these requests to once every 5 minutes.
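
A minimal sketch of the rate limit described above. The type, field, and
function names here are hypothetical and are not taken from the processor;
only the 30 second and 5 minute intervals come from this message.

```go
package main

import (
	"fmt"
	"time"
)

const (
	cleanupInterval = 30 * time.Second // how often the cleanup loop runs
	retryInterval   = 5 * time.Minute  // minimum gap between re-observation requests
)

// pendingObservation is a hypothetical stand-in for the per-message state.
type pendingObservation struct {
	firstObserved time.Time
	lastRetry     time.Time // zero value means no request has been sent yet
}

// shouldRequestReobservation reports whether a request should go out on this
// cleanup tick, and records the send time when it does.
func shouldRequestReobservation(p *pendingObservation, now time.Time) bool {
	if now.Sub(p.firstObserved) < retryInterval {
		return false // still inside the initial 5 minute window
	}
	if !p.lastRetry.IsZero() && now.Sub(p.lastRetry) < retryInterval {
		return false // a request already went out less than 5 minutes ago
	}
	p.lastRetry = now
	return true
}

// countRequests simulates cleanup ticks over the given span and counts how
// many re-observation requests would be sent.
func countRequests(total time.Duration) int {
	start := time.Unix(0, 0)
	p := &pendingObservation{firstObserved: start}
	sent := 0
	for tick := time.Duration(0); tick <= total; tick += cleanupInterval {
		if shouldRequestReobservation(p, start.Add(tick)) {
			sent++
		}
	}
	return sent
}

func main() {
	// Over 15 minutes this sends at the 5, 10, and 15 minute marks.
	fmt.Println(countRequests(15 * time.Minute))
}
```

Keeping the timestamp on the observation itself means the cleanup loop can
keep its 30 second cadence for the other checks.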
When processing pending observations, the `handleCleanup` function
checks if we already have a stored quorum VAA and deletes the in-memory
observation if one is found. If a stored quorum VAA is not found, then
we're supposed to continue evaluating the other conditions and take
the appropriate action. This was implemented using a `fallthrough`
statement.
Unfortunately, this is not how `fallthrough` works. `fallthrough`
simply tells the compiler to execute the body of the next branch in
the switch block, *without evaluating the condition*. It also doesn't
evaluate the conditions for any of the other branches in the switch.
In practice what this meant is that for local observations that didn't
have quorum we would always take the first branch, fall through to the
second branch, and then exit the switch. Only once we had a quorum
(`s.submitted == true`) would we actually consider any of the other
branches in the switch. It also meant that there was no case where we
would take the branch for re-observing messages that hadn't reached
quorum.
Fix this by moving the stored quorum VAA check into an if statement and
then proceeding to the switch statement if one is not found.
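
To illustrate the difference, here is a small standalone sketch. The branch
labels and function names are made up for the example and do not match the
real `handleCleanup` code; only the `fallthrough` semantics are the point.

```go
package main

import "fmt"

// withFallthrough mirrors the shape of the buggy switch. The returned labels
// record which branch bodies ran.
func withFallthrough(submitted, timedOut bool) []string {
	var taken []string
	switch {
	case !submitted:
		taken = append(taken, "stored-vaa-check")
		fallthrough // runs the next body WITHOUT evaluating its condition
	case submitted && timedOut:
		taken = append(taken, "expire-submitted")
	case !submitted && timedOut:
		taken = append(taken, "reobserve") // unreachable when !submitted
	}
	return taken
}

// withIfThenSwitch does the stored VAA check first and only then evaluates
// the switch, so every condition is actually tested.
func withIfThenSwitch(submitted, hasStoredVAA, timedOut bool) []string {
	var taken []string
	if !submitted {
		taken = append(taken, "stored-vaa-check")
		if hasStoredVAA {
			return append(taken, "delete-in-memory")
		}
	}
	switch {
	case submitted && timedOut:
		taken = append(taken, "expire-submitted")
	case !submitted && timedOut:
		taken = append(taken, "reobserve")
	}
	return taken
}

func main() {
	fmt.Println(withFallthrough(false, true))         // [stored-vaa-check expire-submitted]
	fmt.Println(withIfThenSwitch(false, false, true)) // [stored-vaa-check reobserve]
}
```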
The Wormhole SDK is a new Go module in the sdk/ directory. This
initially contains the *_consts.go files from the common package in the
top-level sdk package and the entire vaa package as a sub-package.
For Go tooling reasons this needs to be in the sdk directory itself
(rather than an sdk/go subdir). To prevent the Go tooling from looking
into the other non-Go subdirs, add an empty go.mod file in each one. See
golang issue 42965 for more details on why we can't have nice
things (I'm deliberately not linking to stop github from spamming that
issue).
Currently if an observation hasn't reached quorum within 5 minutes, the
processor will re-broadcast the signed local observation to the other
guardians in the network. However, if not enough guardians actually
observed the original tx, then no amount of re-broadcasting will help
the network reach quorum.
Fix this issue by sending a re-observation request whenever we
re-broadcast a signed local observation. This ensures that any
guardians that missed the tx the first time it happened have a chance to
re-observe it and help the network reach quorum.
We need to reuse almost all of the gossip infrastructure for accounting
transactions, with the only difference being that accounting will use a
`Transfer` message rather than a `VAA`.
Make the observation stored in the processor state generic so that it
can be either a VAA or a Transfer. The rest of the code is shared.
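
A rough sketch of what such a generic observation could look like. The
interface, its single method, and the struct fields below are assumptions
made for illustration; the actual interface in the processor may differ.

```go
package main

import "fmt"

// Observation is a hypothetical interface covering what the shared gossip
// and aggregation code needs from a message.
type Observation interface {
	MessageID() string
}

// VAA and Transfer are simplified stand-ins with illustrative fields only.
type VAA struct {
	EmitterChain   uint16
	EmitterAddress string
	Sequence       uint64
}

func (v *VAA) MessageID() string {
	return fmt.Sprintf("vaa/%d/%s/%d", v.EmitterChain, v.EmitterAddress, v.Sequence)
}

type Transfer struct {
	EmitterChain   uint16
	EmitterAddress string
	Sequence       uint64
}

func (t *Transfer) MessageID() string {
	return fmt.Sprintf("transfer/%d/%s/%d", t.EmitterChain, t.EmitterAddress, t.Sequence)
}

// state holds one pending observation, whichever concrete type it is.
type state struct {
	ourObservation Observation
	signatures     map[string][]byte
	submitted      bool
}

// aggregationState maps message IDs to their pending state.
type aggregationState map[string]*state

// track registers an observation without caring whether it is a VAA or a
// Transfer; the rest of the bookkeeping is shared.
func (a aggregationState) track(o Observation) *state {
	id := o.MessageID()
	s, ok := a[id]
	if !ok {
		s = &state{signatures: map[string][]byte{}}
		a[id] = s
	}
	s.ourObservation = o
	return s
}

func main() {
	agg := aggregationState{}
	agg.track(&VAA{EmitterChain: 1, EmitterAddress: "abc", Sequence: 7})
	agg.track(&Transfer{EmitterChain: 1, EmitterAddress: "abc", Sequence: 7})
	fmt.Println(len(agg)) // 2
}
```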
Fixes https://github.com/certusone/wormhole/issues/685.
Example occurrence this fixes: https://i.imgur.com/gZWKf1n.png
Possible future optimizations include:
- Ignoring late messages in the processor (but we can only ignore
  them post settlement time, so we need the cleanup logic regardless).
- Ignoring late observations from other nodes.
- Using the stored VAA to calculate misses.
- Dropping incomplete local observations. However, this is not trivial
  since we do not know the message ID for those.
commit-id:47e1e59f
This avoids gossip spam and false positive Discord notifications
when a connected node catches up and late observations are made.
Change-Id: If9562661487d3d3d5138d27298b005f278f9e9ce
In cases where we observed a VAA, there is no possibility of gossip DoS.
Increase the timeout to 24 hours to facilitate manual interventions
(like submission of governance VAAs or node restarts/catchup).
Keep the existing five minute timeout for observation-less VAAs.
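
The two timeouts can be sketched as follows. The constant and function
names are hypothetical; only the 24 hour and five minute values come from
this message.

```go
package main

import (
	"fmt"
	"time"
)

const (
	// Applies when we made our own observation, so gossip DoS is not a concern.
	timeoutWithObservation = 24 * time.Hour
	// Applies to state that only exists because of other nodes' gossip.
	timeoutWithoutObservation = 5 * time.Minute
)

// expired reports whether pending state of the given age should be dropped.
func expired(hasLocalObservation bool, age time.Duration) bool {
	if hasLocalObservation {
		return age > timeoutWithObservation
	}
	return age > timeoutWithoutObservation
}

func main() {
	fmt.Println(expired(false, 6*time.Minute)) // true
	fmt.Println(expired(true, 6*time.Minute))  // false
	fmt.Println(expired(true, 25*time.Hour))   // true
}
```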
Change-Id: Ic626108190bd60cf812daadbe191b31cc48c7296
rustfmt appears to be a little more complicated since it wants to
download dependencies and needs nightly Rust.
Change-Id: Ia348def30a6459ae2ab6c29a8c3a413216f5eb4b