* switch to new llvm source based coverage
* upload artifact and simplified
* filter out irrelevant dependency coverage
* enable the correct filters on coverage
* correctly specify all binaries
* remove sparse flag from coverage
* update the coverage script organization
* fix typo in coverage script
* WIP: First draft of release criteria for review
* Update release-criteria.md
Fix formatting
* Added more details to release criteria
Co-authored-by: teor <teor@riseup.net>
* Add "Future Releases" section
* Remove Alpha Release criteria items
These should be included and expanded upon in future releases
Co-authored-by: teor <teor@riseup.net>
* Formatting fixes
* Remove support and troubleshooting criteria from first alpha release
* Switching functionality criteria between future and alpha release
* Remove redundant statement from "Network Readiness" section for "Future Releases"
* Go/No-Go checklist should be a living document
Let's make this a living document by making it clear that this reflects the latest status as of the "Last updated" date
* Update release-criteria.md
Update status after Go/No-Go meeting
* Make RAG status symbols more accessible
* Update release-criteria.md
- Remove "Future Releases" section
- Clean up formatting
* Update book/src/dev/release-criteria.md
change "`zebrad` can validate proof of work" from green to amber
Co-authored-by: teor <teor@riseup.net>
* Update book/src/dev/release-criteria.md
Change "Build completes within 30 minutes in Zebra's CI" from green to amber
Co-authored-by: teor <teor@riseup.net>
* Update book/src/dev/release-criteria.md
Change "known panics, errors and warnings have open tickets" and "`zebrad` executes normally" from amber to green
Co-authored-by: teor <teor@riseup.net>
* Rename release-criteria.md to alpha-release-criteria.md
We will have new release criteria for future releases
Co-authored-by: teor <teor@riseup.net>
Zcashd will blindly request more block headers as long as it got 160
block headers in response to a previous query, EVEN IF THOSE HEADERS ARE
ALREADY KNOWN. To avoid triggering this behavior, return slightly fewer
than the maximum, so that zcashd stops requesting more headers.
0ccc885371/src/main.cpp (L6274-L6280)
Without this change, communication between a partially-synced `zebrad`
and fully-synced `zcashd` looked like this:
1. `zebrad` connects to `zcashd`, which sends an initial `getheaders`
request;
2. `zebrad` correctly computes the intersection of the provided block
locator with the node's current chain and returns 160 following
headers;
3. `zcashd` does not check whether it already has those headers and
assumes that any provided headers are new and re-validates them;
4. `zcashd` assumes that because `zebrad` responded with 160 headers,
the `zebrad` node is ahead of it, and requests the next 160 headers;
5. Because block locators are sparse, the intersection between the
`zcashd` and `zebrad` chains is likely well behind the `zebrad` tip,
so this process continues for thousands of blocks.
To avoid this problem, we return slightly fewer than the protocol
maximum (158 rather than 160, to guard against off-by-one errors in
zcashd). This does not interfere with use of the returned headers by
peers that check the headers, but does prevent `zcashd` from trying to
download thousands of block headers it already has.
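A minimal sketch of the capping logic described above, not the actual
Zebra code. The constant and function names are hypothetical, and the
chain is modeled as a plain slice so the example stays self-contained:

```rust
/// Protocol maximum number of headers in one `headers` message.
const MAX_HEADERS_PER_MESSAGE: usize = 160;

/// Hypothetical name for the reduced cap: 158 rather than 160, to guard
/// against off-by-one errors in zcashd.
const MAX_HEADERS_RESPONSE: usize = MAX_HEADERS_PER_MESSAGE - 2;

/// Returns the headers that follow the intersection point, capped below
/// the protocol maximum.
///
/// `chain` stands in for the node's current chain, and `intersection` is
/// the index of the last header that the peer's block locator matched.
fn headers_after_intersection<H: Clone>(chain: &[H], intersection: usize) -> Vec<H> {
    chain
        .iter()
        // Skip everything up to and including the intersection.
        .skip(intersection.saturating_add(1))
        // Never return a full-size response.
        .take(MAX_HEADERS_RESPONSE)
        .cloned()
        .collect()
}
```

Because the response never reaches 160 entries, a peer that uses
"got a full response" as its signal to keep asking will stop after one
round trip.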
This problem does not occur in the `zcashd<->zcashd` case only because
`zcashd` does not respond to `getheaders` messages while it is syncing.
However, implementing this behavior in Zebra would be more complicated,
because we don't have a distinct "initial block sync" state (we do
poll-based syncing continuously) and we don't have shared global
variables to modify to set that state.
Relevant links (thanks @str4d):
- The PR that introduced this behavior: https://github.com/bitcoin/bitcoin/pull/4468/files#r17026905
- https://github.com/bitcoin/bitcoin/issues/6861
- https://github.com/bitcoin/bitcoin/issues/6755
- https://github.com/bitcoin/bitcoin/pull/8306#issuecomment-614916454
We modeled a Bitcoin `headers` message as being a list of block headers.
However, the actual data structure is slightly different: it's a list of (block
header, transaction count) pairs. This caused zcashd to reject our headers
messages.
To fix this, introduce a new `CountedHeader` struct with a `block::Header` and
transaction count `usize`, then thread it through the inbound service and the
state.
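A rough sketch of the data structure, under stated assumptions: `Header`
here is a placeholder for `block::Header` that only holds pre-serialized
bytes, `serialize_headers` and `write_compact_size` are hypothetical
helper names, and the Bitcoin-style CompactSize encoding of the counts
is an assumption about the wire format rather than something this
change states:

```rust
/// Placeholder for `block::Header`: just the already-serialized bytes.
#[derive(Clone, Debug, PartialEq, Eq)]
struct Header {
    bytes: Vec<u8>,
}

/// A block header paired with the block's transaction count.
#[derive(Clone, Debug, PartialEq, Eq)]
struct CountedHeader {
    header: Header,
    transaction_count: usize,
}

/// Writes `n` as a Bitcoin-style CompactSize integer (assumed encoding).
fn write_compact_size(n: u64, out: &mut Vec<u8>) {
    if n < 0xfd {
        out.push(n as u8);
    } else if n <= 0xffff {
        out.push(0xfd);
        out.extend_from_slice(&(n as u16).to_le_bytes());
    } else if n <= 0xffff_ffff {
        out.push(0xfe);
        out.extend_from_slice(&(n as u32).to_le_bytes());
    } else {
        out.push(0xff);
        out.extend_from_slice(&n.to_le_bytes());
    }
}

/// Serializes a list of (header, transaction count) pairs: a length
/// prefix, then each header immediately followed by its count.
fn serialize_headers(headers: &[CountedHeader]) -> Vec<u8> {
    let mut out = Vec::new();
    write_compact_size(headers.len() as u64, &mut out);
    for counted in headers {
        out.extend_from_slice(&counted.header.bytes);
        write_compact_size(counted.transaction_count as u64, &mut out);
    }
    out
}
```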
I tested this locally by running Zebra with these changes and inspecting a
trace-level log of the span of a peer connection that requested a nontrivial
headers packet from us, and verified that it did not reject our message.
Because the new version of the prometheus exporter launches its own
single-threaded runtime on a dedicated worker thread, there's no need
for the tokio and hyper versions it uses internally to align with the
versions used in other crates. So we don't need to use our fork with
tokio 0.3, and can just use the published alpha. Advancing to a later
alpha may fix the missing-metrics issues.
The cancellation implementation changes made to the connection state machine
mean that if a response oneshot is dropped, the connection will avoid
cancelling the request. So the heartbeat task does have to wait on the response.
Not all reject messages include a data field. This change partially addresses
a problem that could lead to a depleted peer set:
1. We send a response to a `getheaders` message;
2. The remote peer `reject`s our `headers` message for some reason;
3. We fail to parse their `reject` message and close the connection;
4. After this process repeats with each peer, we have no more peers.
This commit fixes (3) but does not address (2).
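The fix for (3) can be sketched as treating the trailing data field as
optional. This is an illustration, not the actual parser:
`parse_reject_data` is a hypothetical name, and the 32-byte length
assumes the field carries a transaction or block hash when present:

```rust
/// Parses whatever bytes remain after the `reject` message's reason
/// string.
///
/// An empty remainder means the peer sent no data field, which is valid.
/// The 32-byte case assumes the data is a hash (an assumption here).
fn parse_reject_data(rest: &[u8]) -> Result<Option<[u8; 32]>, String> {
    match rest.len() {
        // No data field: accept the message instead of failing.
        0 => Ok(None),
        32 => {
            let mut data = [0u8; 32];
            data.copy_from_slice(rest);
            Ok(Some(data))
        }
        n => Err(format!("unexpected reject data length: {}", n)),
    }
}
```

Returning `Ok(None)` for an empty remainder is what keeps the
connection open when a peer rejects a message without attaching data.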
If the limit is less than the ideal, try to increase it to the ideal.
If that doesn't work, try to increase the limit as high as possible.
If the limit is still less than the minimum, panic.
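The three steps above can be sketched as a small function. Everything
named here is hypothetical: the constant values are placeholders, not
the real limits, and `try_set` stands in for the operating system call
that changes the limit, returning `true` on success:

```rust
/// Placeholder values for illustration only.
const IDEAL_LIMIT: u64 = 1024;
const MIN_LIMIT: u64 = 512;

/// Applies the policy to a current (`soft`) limit and the highest limit
/// the process may request (`hard`), returning the limit in effect.
fn choose_limit(soft: u64, hard: u64, mut try_set: impl FnMut(u64) -> bool) -> u64 {
    let mut current = soft;

    if current < IDEAL_LIMIT {
        if try_set(IDEAL_LIMIT) {
            // Step 1: the ideal limit was accepted.
            current = IDEAL_LIMIT;
        } else if hard > current && try_set(hard) {
            // Step 2: fall back to the highest limit we are allowed.
            current = hard;
        }
    }

    // Step 3: refuse to run with too low a limit.
    if current < MIN_LIMIT {
        panic!("limit {} is below the required minimum {}", current, MIN_LIMIT);
    }

    current
}
```

Taking the setter as a closure keeps the policy testable without
touching the real process limits.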
This makes the span data more compact (e.g., `msg_as_req{msg=block}`) and
restores the Debug impl for Message to show all of the data contained in the
message. The full message is added as a single event at trace level in the
span to preserve the previous full-inspectability.
As we approach our alpha release we've decided we want to plan ahead for the user bug reports we will eventually receive. One of the bigger issues we foresee is determining exactly what version of the software users are running, and particularly how easy it may or may not be for users to accidentally discard this information when reporting bugs.
To defend against this, we've decided to include the exact git SHA for any given build in the compiled artifact. This information will then be re-exported as a span early in the application startup process, so that all logs and error messages should include the SHA as their very first span. We've also added this SHA as issue metadata for `color-eyre`'s GitHub issue URL auto-generation feature, which should make sure that the SHA is easily available in bug reports we receive, even in the absence of logs.
Co-authored-by: teor <teor@riseup.net>
* warn: if there are no peers at all
* info: if there are no ready peers
* trace: the number of ready and unready peers for every request
Log at most one warn or info log per minute, to avoid flooding the
terminal with log lines. Suppress warn and info logs for the first
minute, while the peer set is starting up.
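A minimal sketch of the rate limiting described above, using only the
standard library. The type and method names are hypothetical, and the
function returns the level to log at instead of calling a logging
macro. The trace-level counts are not rate limited, so they are left
out:

```rust
use std::time::{Duration, Instant};

/// Minimum time between warn or info logs.
const MIN_LOG_INTERVAL: Duration = Duration::from_secs(60);

#[derive(Debug, PartialEq)]
enum Level {
    Warn,
    Info,
}

struct PeerLogLimiter {
    /// Time of the last warn or info log. Starts at the peer set's start
    /// time, which also suppresses logs during the first minute.
    last_log: Instant,
}

impl PeerLogLimiter {
    fn new(start: Instant) -> Self {
        Self { last_log: start }
    }

    /// Returns the level to log at, or `None` if nothing should be logged.
    fn check(&mut self, now: Instant, ready: usize, unready: usize) -> Option<Level> {
        let level = if ready + unready == 0 {
            Level::Warn // no peers at all
        } else if ready == 0 {
            Level::Info // peers exist, but none are ready
        } else {
            return None;
        };

        if now.saturating_duration_since(self.last_log) < MIN_LOG_INTERVAL {
            return None;
        }
        self.last_log = now;
        Some(level)
    }
}
```

Sharing one timestamp between the warn and info cases gives at most one
log line per minute overall, not one per level.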