doc(ci): Explain how to fix Mergify and Zcash parameter download failures (#5240)

* Explain how to find Mergify failures

* Explain how to fix cache errors

* Fix instructions - clear all caches

* Fix which errors need which actions

* Add a newline to appease GitHub markdown renderer

Co-authored-by: Arya <aryasolhi@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
teor 2022-09-27 11:32:55 +10:00 committed by GitHub
parent c2b00c2fe1
commit 9a2814a1b2
1 changed files with 23 additions and 10 deletions

@ -72,7 +72,10 @@ This means that the entire workflow must be re-run when a single test fails.
### Finding Errors
Look for the earliest job that failed, and find the earliest failure.
0. Check if the same failure is happening on the `main` branch or multiple PRs.
If it is, open a ticket and tell the Zebra team lead.
1. Look for the earliest job that failed, and find the earliest failure.
For example, this failure doesn't tell us what actually went wrong:
> Error: The template is not valid. ZcashFoundation/zebra/.github/workflows/build-docker-image.yml@8bbc5b21c97fafc83b70fbe7f3b5e9d0ffa19593 (Line: 52, Col: 19): Error reading JToken from JsonReader. Path '', line 0, position 0.
@ -85,8 +88,11 @@ But the specific failure is a few steps earlier:
https://github.com/ZcashFoundation/zebra/runs/8181760421?check_suite_focus=true#step:8:2112
The earlier failure can also be in another job, check out the whole workflow run for details.
(Use the "Summary" button on the top left of the job details, and zoom in.)
2. The earliest failure can also be in another job or pull request:
a. check the whole workflow run (use the "Summary" button on the top left of the job details, and zoom in)
b. if Mergify failed with "The pull request embarked with main cannot be merged", look at the PR "Conversation" tab, and find the latest Mergify PR that tried to merge this PR. Then start again from step 1.
3. If that doesn't help, try looking for the latest failure. In Rust tests, the "failures:" notice contains the failed test names.
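
As a rough illustration of step 3, the sketch below pulls the failed test names out of a saved log. It assumes the log is plain `cargo test` output with no timestamp prefixes, where the test harness lists failed tests as indented lines under a `failures:` line. The function name `failed_test_names` is hypothetical.

```sh
# Prints the failed test names from a Rust test log read on stdin.
# Assumption: plain `cargo test` output, without CI timestamp prefixes.
failed_test_names() {
    awk '
        # A "failures:" line starts a block.
        /^failures:$/ { in_list = 1; next }
        # Inside a block, an indented line is a failed test name.
        in_list && /^    [^ ]/ { sub(/^ +/, ""); print; next }
        # Any other non-blank line ends the block.
        in_list && NF > 0 { in_list = 0 }
    '
}
```

For example, `failed_test_names < job.log` prints one test name per line, where `job.log` is a hypothetical file holding the downloaded job log.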
### Fixing CI Sync Timeouts
@ -146,19 +152,26 @@ To fix duplicate dependencies, follow these steps until the duplicate dependenci
4. Repeat step 3 until the dependency warnings are fixed. Adding a single `skip-tree` exception can resolve multiple warnings.
### Fixing Disk Full Errors
### Fixing Disk Full Errors and Zcash Parameter Errors
If the Docker cached state disks are full, increase the disk sizes in:
- [deploy-gcp-tests.yml](https://github.com/ZcashFoundation/zebra/blob/main/.github/workflows/deploy-gcp-tests.yml)
- [continous-delivery.yml](https://github.com/ZcashFoundation/zebra/blob/main/.github/workflows/continous-delivery.yml)
If the GitHub Actions disks are full, follow these steps until the errors are fixed:
1. Update your branch to the latest `main` branch, this builds with all the latest dependencies in the `main` branch cache
2. Clear the GitHub Actions cache for the failing branch
3. Clear the GitHub Actions caches for all the branches and the `main` branch
If the GitHub Actions disks are full, or the Zcash parameter downloads time out without any network messages or errors,
follow these steps until the errors are fixed:
If the `*-sprout-and-sapling-params` caches are around 765 MB, they are the correct size.
There is no need to clear them, the replacement cache will be the same size.
0. Check if the error is also happening on the `main` branch. If it is, skip the next step.
1. Update your branch to the latest `main` branch. This builds with all the latest dependencies in the `main` branch cache.
2. Clear the GitHub Actions code cache for the failing branch. Code caches are named after the compiler version.
3. Clear the GitHub Actions code caches for all the branches and the `main` branch.
These errors often happen after a new compiler version is released, because the caches can end up with files from both compiler versions.
If the Zcash parameter downloads have an error loading the parameters:
1. Clear the Zcash parameter caches for all branches, including `main`.
The correct `*-sprout-and-sapling-params` caches should be around 765 MB.
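
The size check above can be sketched in shell. This is a minimal illustration, not part of the Zebra tooling: the function names and the 50 MB tolerance are assumptions, and the input is expected as `<key> <size_in_bytes>` lines.

```sh
# Expected size of a *-sprout-and-sapling-params cache, from the text above.
EXPECTED_MB=765
# Hypothetical tolerance: "around 765 MB" is read here as within 50 MB,
# which also covers the difference between MB and MiB.
TOLERANCE_MB=50

# Succeeds if a size in bytes is within the tolerance of the expected size.
params_cache_size_ok() {
    size_mb=$(( $1 / 1000000 ))
    diff_mb=$(( size_mb - EXPECTED_MB ))
    if [ "$diff_mb" -lt 0 ]; then
        diff_mb=$(( 0 - diff_mb ))
    fi
    [ "$diff_mb" -le "$TOLERANCE_MB" ]
}

# Reads "<key> <size_in_bytes>" lines on stdin and prints the
# parameter caches whose size is not close to the expected size.
report_bad_params_caches() {
    while read -r key size; do
        case "$key" in
            *-sprout-and-sapling-params)
                params_cache_size_ok "$size" || echo "$key: unexpected size $size bytes"
                ;;
        esac
    done
}

# Possible usage, assuming an authenticated GitHub CLI (`gh`):
# gh api --paginate repos/ZcashFoundation/zebra/actions/caches \
#     --jq '.actions_caches[] | "\(.key) \(.size_in_bytes)"' | report_bad_params_caches
```

Any cache this reports is a candidate for clearing. Caches with other key names are ignored.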
You can find a list of caches using:
```sh