* Initialize the rayon threadpool with a new config for CPU-bound threads
* Verify proofs and signatures on the rayon thread pool
* Only spawn one concurrent batch per verifier, for now
* Allow tower-batch to queue multiple batches
* Fix up a potentially incorrect comment
* Rename some variables for concurrent batches
* Spawn multiple batches concurrently, without any limits
* Simplify batch worker loop using OptionFuture
* Clear pending batches once they finish
* Stop accepting new items when we're at the concurrent batch limit
* Fail queued requests on drop
* Move pending_items and the batch timer into the worker struct
* Add worker fields to batch trace logs
* Run docker tests on PR series
* During full verification, process 20 blocks concurrently
* Remove an outdated comment about yielding to other tasks
* Make the release checklist shorter and hide some details
* Ignore any `fastmod` updates to previous release notes in `CHANGELOG.md`
* Use recent versions in examples
* Fix markdown that doesn't render correctly
* Fix some weird line breaks
* Use capital letters to start list items
* Clarify `fastmod` and `CHANGELOG.md`
* Clarify version format by changing highlighting
* Checkout zebra in each job to avoid warnings
But put TODOs where we might be able to skip checkouts
* Split log following into sprout checkpoints, sapling/orchard checkpoints, and full validation
* Make job IDs shorter
* Use /dev/stderr because docker doesn't have a tty
* remove pipefail
* Revert "remove pipefail"
This reverts commit a7ee37bebdc107a4215e7dd307b189d925969234.
* Make tee ignore errors writing to a grep pipe
* Avoid launching multiple docker instances for duplicate jobs
* Ignore broken pipe error messages and statuses
* fix(ci): docker wait not finding container
We had this issue before. I can't recall if it was a parsing error between GitHub Actions and gcloud `--command`, but we had to split this into two pieces.
This implementation keeps it how we did it before 9b9578c999/.github/workflows/test.yml (L235-L243)
* docs: remove pending TODO
We can't remove `actions/checkout` nor set `create_credentials_file` to `false` as next steps won't be able to authenticate to GCP.
We can surely remove `actions/checkout` and leave `create_credentials_file` as `true`, but this will raise a warning on each step, and there's no benefit of doing so.
* Show `docker wait` and `gcloud ssh` output
* If `docker wait` fails, get the exit code using `docker inspect`
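The fallback in the item above can be sketched in shell. This is a minimal sketch, not the workflow's actual step: the function name `container_exit_code` is hypothetical, and it assumes the container name or ID is passed as the first argument. It uses `docker wait`, which prints the exit code once the container stops, and `docker inspect --format '{{.State.ExitCode}}'`, which reads the recorded exit code.

```sh
#!/bin/sh
# Hypothetical helper: print the exit code of the container named in "$1".
container_exit_code() {
    container="$1"
    # `docker wait` blocks until the container stops, then prints its exit code.
    if code=$(docker wait "$container") && [ -n "$code" ]; then
        printf '%s\n' "$code"
    else
        # If `docker wait` fails, read the exit code from the container's state.
        docker inspect --format '{{.State.ExitCode}}' "$container"
    fi
}
```

A caller could then finish with `exit "$(container_exit_code my-container)"`, where `my-container` is a placeholder name.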
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
Co-authored-by: Gustavo Valverde <gustavo@iterativo.do>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
* Put arguments to "docker run" on different lines
And update some comments.
* Split docker run into launch, logs, and wait
* Remove mistaken "needs state" condition on log and results job
* Exit the ssh and the job with the container test's exit status
* Split full sync into checkpoint and full validation
* Sort workflow variables into categories and add descriptions
* Split Create instance/volume and Run test into separate jobs
* Copy initial conditions to all jobs in the series
* Actually create a cached state image
* fix(state): use same disk naming convention for all test instances
Co-authored-by: Gustavo Valverde <gustavo@iterativo.do>
* feat(ci): build each crate individually
* fix(ci): use valid names for each job
* feat(ci): builds and checks with and without all features
* refactor(ci): build job matrix dynamically
* fix: use a JSON_CRATES variable with resulting values
* test: check-matrix
* fix(ci): use "crate" in singular for reference
* imp(ci): use a matrix for feature build arguments
* fix(ci): use correct naming and includes
* fix(ci): implement most recommendations given in review
* fix(ci): use simpler shell script
* fix: typo
* fix: add string to file, not cmd
* fix: some shellchecks
* fix(ci): remove warnings and errors from shellcheck
* imp(ci): add patch file for `Build crates individually` workflow
* Remove unused configs in patch job
Co-authored-by: teor <teor@riseup.net>
* feat(actions): delete old GCP resources
* fix(ci): delete old instances templates
* fix(actions): use correct date arguments and conversion
* fix(actions): missing command in gcloud
* fix(gcp): if an instance can't be deleted, continue
* refactor(action): cleanup and execute monthly
* increase lightwalletd timeout
* switch back to aditya's fork
* manually point to new aditya's lightwalletd image
* disable sync_one_checkpoint_testnet test
* disable restart_stop_at_height in testnet
* revert to 'latest' lightwalletd image
* Remove a duplicate lightwalletd error message
* Reactivate some error messages that have been fixed
* Fix confusing lightwalletd cached state path logs
* Add the gRPC tests to the lightwalletd test suite function
* Make test regexes compatible with zcash/lightwalletd
* Add logging to gRPC tests
* Switch to zcash/lightwalletd for testing
* Upgrade tracing and related dependencies
```sh
cargo upgrade --workspace
tracing-error
tracing-subscriber
color-eyre
tracing-flame
tracing-journald
sentry
sentry-tracing
metrics
metrics-exporter-prometheus
reqwest
```
* Update duplicate dependency checks
* Enable the tracing/env-filter feature
* Fix type inference for metrics
Manual changes, plus:
```sh
fastmod "as _" "as f64"
```
* Tidy up some unrelated test code
* Update metrics-exporter-prometheus API
And make unused dependencies optional.
* Adjust test regexes to new tracing format
Also fix some regex bugs, and refactor to simplify.
* Disable color-eyre span traces and track caller in release builds
* Add a feature that enables extra debugging in release builds
* Clean up some redundant features
* Increase a test timeout
* refactor(ci): keep tests jobs under the 6 hour timeout
When a full sync or another test takes almost 5 hours, running it in the same job as other actions that take several minutes reduces the time left before that job hits the timeout.
We use a separate job for image creation and deletion to handle these cases.
* fix(ci): instance deletion can't run on non finished tests
* fix(ci): tests without a cached state might save to disk
* fix(ci): ignore failures when deleting an instance
* fix(ci): remove delete step `needs` redundancy
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
* feat(ci): add a codespell linting action
* fix(ci): run this job if the lint workflow is changed
* ci(codespell): add configuration file
* ci(codespell): exclude mermaid.min.js
* fix: wrong mermaid.min.js location
* ci(codespell): Sur from "Big Sur" is being considered as misspelled
* ci(codespell): make warning the max level
This won't restrict PRs from merging
* ci(codespell): lint on every push
* test: create a misspelling
* Revert "test: create a misspelling"
This reverts commit a2c91cda1e.
* fix(ci): allow for the lightwalletd-full-sync to mount the lwd-cache dir
* fix(ci): compare with a string
* imp(ci): run a lightwalletd tip if there's no lwd tip disk available
* docs(ci): add TODO explaining this is a temporary condition
* change(doc): add item to release checklist to update dependencies in the README
* Update .github/PULL_REQUEST_TEMPLATE/release-checklist.md
Co-authored-by: Marek <mail@marek.onl>
* Require a cached state rebuild if the state version changes
* Find cached state disks with the same state version
And prefer `main` to other branches.
* Tweak filters to make them more specific
* Try adding inner quotes
* Try brackets instead
* Try two filters, rather than three
* Use Mainnet as the default network, remove duplicate env var
* Match the exact disk name format in one regular expression
* Log the exact expected disk name, including the network
* Consistently use CACHED_DISK_NAME as the env var name
* Temporary allow missing $NETWORK in disk names
* Print the exact search string
* Debug log the search string
* Use a generic alphabetical pattern rather than a regex group
Google Cloud doesn't seem to support regex groups.
* Add network name to disk match docs
* Fix the logged network name
* Make jobs that use cached state wait for state rebuilds
* Run jobs that need cached state even if the rebuild was skipped
* Fix missing dependencies
And update a TODO
* Revert "Use a generic alphabetical pattern rather than a regex group"
This reverts commit 970afe7b17.
* Revert "Temporary allow missing $NETWORK in disk names"
This reverts commit f1f66500c3.
* refactor(ci): look for available disks instead of files changed
This ensures that if the constants.rs file was changed, we search for disks available in the whole repository with the same state.
If there's no disk available, a rebuild is triggered depending on the missing disk. If there's a disk available, tests are run with it.
* fix(ci): lwd syncs need to wait for zebra disk rebuild
* docs(ci): use better comments on integration tests
* fix(ci): we must authenticate to GCP to find disks
* fix(ci): add needed permissions for google auth
* fix(ci): the output needs to be echoed
* imp(ci): reduce diff with main
* fix(ci): remove redundant dependency
Co-authored-by: teor <teor@riseup.net>
* fix(ci): also add `false` to the JSON object output
* fix(ci): hasty copy/paste
* fix(ci): standardize comments
* fix(ci): run disk rebuilds if no disk was found
* fix(ci): build on any event if a cached disk is not found
* fix(ci): reduce diff with main
* docs(ci): reduce main diff
* fix(ci): sync .patch file with changes on the workflow
* fix(ci): consider network changes in new get-available-disks
* force GHA trigger
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
Co-authored-by: Gustavo Valverde <gustavo@iterativo.do>
* Revert "Temporarily stop requiring cached lightwalletd state for the send transaction tests"
This reverts commit f6b29b151e.
* fix(ci): add a lightwalletd cached state to the test
Co-authored-by: Gustavo Valverde <gustavo@iterativo.do>
* fix(ci): lwd state condition
* fix(ci): differentiate tests that need a lwd cached state
* fix(ci): use the right state and save name for each test
* docs(ci): minor comment fixes
* docs(ci): better input description
* fix(ci): end `if` condition correctly
* fix(images): pass the state version to following steps
* fix(ci): $needs_lwd_state condition was inverted
* fix(ci): reduce disk selection code
* docs(ci): better disk search conditional explanation
* fix(ci): end if condition correctly
* fix(ci): evaluate $needs_zebra_state correctly
* fix(ci): use nested condition for readability
* fix(ci): disk search was using the wrong variable
* Temporarily use an earlier lightwalletd version
This checks if commit e146dbf5c2 contains a mempool refresh deadlock bug.
* Actually rebuild the lightwalletd image
* Delete an unfinished comment
* Remove duplicate test in entrypoint.sh
* Keep a recent change to make tests consistent
* fix(ci): remove not used variable `lwd_state_dir`
* fix(ci): state was not being added to the image name
* fix(ci): mount a docker volume with lightwalletd dir
If the volume doesn't mount this lwd cached state dir, the content won't be saved to the mounted disk in the VM
* fix(ci): lwd state condition
* docs(ci): explain disk mounting logic
* docs(ci): explain disk mounting decision better
* docs(ci): add a description for confusing input names
Co-authored-by: teor <teor@riseup.net>
* imp(ci): remove gcp verbose log
Co-authored-by: Gustavo Valverde <gustavo@iterativo.do>
* fix(ci): sentry is no longer being activated in test builds
This removes sentry from all test executions: some tests might fail because sentry wasn't initially built, or builds might take longer because they have to build with sentry.
* fix(build): workaround the failed to fetch oauth token error
* Drop sentry dependencies when enable-sentry feature is disabled
* Make lightwalletd gRPC tests depend on a new lightwalletd-grpc-tests feature
* fix(ci): remove enable-sentry feature from tests
* Add lightwalletd-grpc-tests feature for functionality or efficiency
And document where it is just used to stop re-compilations.
* Remove redundant `cmake` and `protobuf-compiler` dependencies
* Document Zebra's optional production and test feature flags
* Minimise dependencies in zcash-params/Dockerfile
* Minimise dependencies in docker/Dockerfile
* Add a workflow TODO
* Catch more errors in entrypoint.sh
Also makes entrypoint.sh compatible with more distributions
* Remove unnecessary quoting in entrypoint.sh
* Use exactly the same arguments to call CI tests
* Remove a redundant CI build
* Rename Cargo.lock check job
Co-authored-by: Gustavo Valverde <gustavo@iterativo.do>
* refactor(build): use better stage naming and document it
* refactor(build): use multiple cache sources
* docs(build): add a comment for cache
* fix(build): remove gcr.io as it does not supports OCI images
* feat(ci): add lightwalletd_*_sync tests to CI
* feat(ci): add lightwalletd RPC call test
* feat(ci): add send transactions test with lwd to CI
* fix(ci): create a variable to run transactions test
* refactor(ci): use docker in docker
This is a workaround for an issue related to disk partitioning, caused by a GCP service called Konlet, while mounting the cached disks to the VM and then to the container
* fix(build): persist docker login credentials
* fix(ci): get sync height from docker logs instead of gcp
* try: use gha cache for faster building
* fix(ci): mount disk in container to make it available in vm
* fix(build): do not invalidate cache between images
* try(docker): invalidate cache as little as possible
* fix(ci): GHA terminal is not a TTY
* fix(build): do not ignore entrypoint.sh
* fix
* fix(ci): mount using root privileges
* fix(ci): use existing disk as cached state
* fix(ci): wait for disks to get mounted
* force rebuild
* fix failed force
* fix failed commit
* WIP
* fix(ci): some tests do not use a cached state
* wip
* refactor(ci): disk names and job segregation
* fix(ci): do not name boot and attached disk the same
* fix(ci): attach a disk to full sync, to snapshot the state
* fix(ci): use correct disk implementations
* fix(ci): use different disk name to allow test concurrency
* feat(ci): add lightwalletd send transaction test
* cleanup(ci): remove extra tests
* fix(ci): allow disk concurrency with tests
* fix(ci): add considerations for different tests
* fix(reusable): last fixes
* feat(ci): use reusable workflow for tests
* fix(rw): remove nested workflow
* fix(rw): minor fixes
* force rebuild
* fix(rw): do not use an input as job name
* fix(rw): remove variable id
* fix(ci): remove explicit conditions and id
* fix(ci): docker does not need the variable sign ($) to work
* fix(ci): mount typo
* fix(ci): if a sync fails, always delete the instance
This also reduces the amount of jobs needed.
* refactor(ci): make all test depend on the same build
* fix(ci): some tests require multiple variables
* fix(docker): variable substitution
* fix(ci): allow to run multiple commits from a PR at once
* fix(docker): lower the NETWORK env var for test names
* reduce unneeded diff
* imp(keys): use better naming for builds_disks
* imp(ci): use input defaults
* imp(ci): remove test_name in favor of test_id
* fix(ci): better key naming
* fix(ci): long disk names break GCP naming convention
* feat(ci): validate local state version with cached state
* fix(ci): add condition to run tests
* fix: typo
* fix: app_name should not be required
* fix: zebra_state_path shouldn't be required
* fix: reduce diff
* fix(ci): checkout to grep local state version
* Update .github/workflows/test.yml
Co-authored-by: teor <teor@riseup.net>
* revert: merge all tests into a single workflow
* Remove unused STATE_VERSION env var
* fix: minor fixes
* fix(ci): make test.patch the same as test
* fix(ci): negate the input value
* imp(ci): better cached state conditional handling
* imp(ci): exit code is captured by `docker run`
* fix(deploy): mount disks with better write performance
* fix(ci): change sync id to a broader id name
* fix(ci): use correct input validation
* fix(ci): do not make test with cached state dependent on other
* imp(ci): organize keys better
* fix(ci): use appropriate naming
* fix(ci): create docker volume before mounting
* fix(lint): do not fail on all new changes
* imp(ci): do not report in pr review
* fix(ci): partition clean disks
* fix: typo
* fix: test called the wrong way
* fix(build): stop using gha cache
* ref(ci): validate run condition before calling reusable workflow
* fix(ci): use a better filesystem dir and fix other values
* fix: linting errors
* fix(ci): typo
* Revert "fix(build): stop using gha cache"
This reverts commit a8fbc5f416.
Cache expiration is a lesser evil than not using caching at all and then failing with a 401
* imp(ci): do not set a default for needs_zebra_state
* Update .github/workflows/test.yml
Co-authored-by: teor <teor@riseup.net>
* fix(deps): remove dependencies
* force build
* Update .github/workflows/test.yml
Co-authored-by: teor <teor@riseup.net>
* fix(docker): add RUST_LOG as an ARG and ENV
* fix(test): add `#[ignore]` to send transactions test
This test needs state, so it should be marked as `#[ignore]`
* fix(ci): differentiate between root cache path and its dir
* Remove extra `state` directory
That was a workaround for an issue that has been fixed.
* imp(docs): use better test descriptions
Co-authored-by: teor <teor@riseup.net>
* fix: reduce unwanted diff with main
* fix(ci): make lwd conditions consistent
* Remove another extra `state` directory
Was also part of a workaround for an issue that has been fixed.
* fix(ci): use better conditionals to run test jobs
Co-authored-by: teor <teor@riseup.net>
* Tweak to support different lightwalletd versions
Some versions print `Waiting for block`, and some versions print
`Ingestor waiting for block`.
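A single pattern can accept both wordings. The sketch below is illustrative only: the helper name `is_waiting_for_block_line` is hypothetical, and the pattern actually used in the tests may differ.

```sh
#!/bin/sh
# Hypothetical helper: succeed if the log line in "$1" uses either
# lightwalletd wording, "Waiting for block" or "Ingestor waiting for block".
is_waiting_for_block_line() {
    # [Ww] accepts the capitalised form and the form that follows "Ingestor ".
    printf '%s\n' "$1" | grep -qE '[Ww]aiting for block'
}
```

Because the pattern is not anchored, it matches the phrase anywhere in the line, so timestamps or log-level prefixes do not affect the result.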
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* refactor(lint): check specific files for each job
* refactor(lint): use an approach which requires less code
* fix(lint): validate against true string not boolean
* imp(build): reduce docker cache invalidation
Use scoped caching and more file ignores to reduce cache invalidation
* fix(build): add entrypoint.sh as a required file
* fix(build): do not logout if the build takes too long
* Add 'doc comment' about .dockerignore
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
* fix(ci): use appropriate grep text depending on the test
* reduce diff
* fix(ci): use correct GCP disk source attribute
* imp(ci): reduce diff
* fix(ci): revert wrong deletion
* fix: revert unneeded changes
* fix: reduce main diff
* fix
* fix(ci): reduce diff
* fix(ci): garbage collect instances no matter the status
As we're not going to reuse test instances, the safest approach is to always delete these instances, whether the workflow fails, is skipped, or succeeds
* Apply suggestions from code review
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
* docs(ci): improve comment
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
* refactor(ci): test building in a separate workflow
* force a change
* force a change
* fix(ci): send the correct variables to the reusable build
* fix(ci): variables are not allowed
* fix(ci): conditions are not allowed as input
* fix(ci): use expected value
* refactor(build): simplify the use of other dockerfiles
* fix(cd): depend on docker build yml
* fix(cd): use main branch as image name
* imp(actions): remove unneeded variable repetition
* imp(build): remove unused variables
* imp(actions): rename the image building workflow
Not all images are for zebra execution as we also have one for zcash-params
* fix(ci): add dependable workflow in paths filters
* docs(ci): remove TODO as this won't be needed unless an issue arises
* docs(ci): CARGO_INCREMENTAL can decrease build time when running from a cache
* fix: revert forced changes
* fix(build): remove unused build inputs in zcash-params
* imp(cd): as this is the production image, use the executable name
* imp(ci): reduce log level to improve speed
Co-authored-by: teor <teor@riseup.net>
* imp(ci): use the correct name for the workflow
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: teor <teor@riseup.net>
* fix(ci): do not delete instances from `main` branch on merge
* fix(ci): do not delete instances on merge
This was causing unintended behavior, and so far instances are being cleaned up in their corresponding workflows.
* feat(ci): run cached state rebuilds in main branch
* fix(ci): allow the PR/branch name in the disk name
Move the height information to the disk description, to reduce the name length
* fix(ci): add missing SHA
* fix(ci): regenerate checkpoint cached state on main
This will automatically regenerate the disk when a merge is completed on main
* tmp(ci): do not duplicate sync test at/after merge
This temporarily ensures the test only runs in the main branch, so we can track it more easily
* fix(ci): correctly use lowered network caps
In the Test workflow we were using a different approach than the one being used in the Full sync test.
Also, in the Full sync test the variable was LOWER_NET_NAME, but NETWORK was being used in the disk name, with caps.
* imp(ci): get state version from local constants.rs
* imp(ci): use the same get name approach
* fix(ci): use the correct name for state version variable
* imp(ci)!: use different disk names for cached states
Disk states synced to Canopy and synced to the chain tip should have different names, so current and upcoming tests can reference the disk they need.
* imp(ci): test-stateful-sync no longer depends on regenerate-stateful-disks
* Apply suggestions from code review
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
* fix(ci): use a better name for network string conversion
* Revert "Apply suggestions from code review"
This reverts commit cbbfaf4e9c.
* fix: do not get log information if sync was skipped
* fix(ci): do not lower the variable name
* fix(ci): use the same lowering case for network everywhere
* test: more .dockerignore conditions
* fix: use the right approach to lower caps
* remove extra .dockerignore
* trigger a change for stateful disk regeneration
* imp(ci): use `checkpoint` as the disk reference
* revert wrong delete
* fix(ci): add INSTANCE_ID and correct logging message
* imp(ci): add `v` prefix to state version number
* fix(ci): remove typo from logging message to get the height
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
* style(ci): comply with https://json.schemastore.org/github-workflow.json
Some substitutions were harder to make as files were not standardized
* fix(mergify): use correct name for macos
* style(actions): revert to single quotes
* style: lint dependabot and mergify conf files
* style: remove conditions with missing context
* imp(lint): automate GH Actions linting
* fix(lint): some actions need to be triggered by PR event
* fix(lint): consider all workflow YAMLs
* Use the same paths in the patch file
* revert: keep condition as is
* add TODO
* fix: add missing checkpoint_sync input
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
* Change OutputLocation to contain a TransactionLocation
* Change OutputLocation reads from the database
* Update some doc comments
* Update some TODOs
* Change deleting spent UTXOs and updating spent balances
* Change adding new UTXOs and adding their values to balances
* Disable dead code warnings
* Update snapshot test code
* Update round-trip tests for OutputLocations
* Update snapshot test data
* Increment the database format version
* Remove a redundant try_into()
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Refactor redundant code
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* ci: attempt at fixing 'Regenerate stateful disks'
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* fix(ci): validate exit code
* fix(ci): use a single step
* fix: run from gcp command
* fix: remove variable substitution
* fix: escaping characters
* fix: handle bash issues with variable expansion
* fix: segregate execution into multiple steps
* fix: the command is a string
* fix: move docker ignore to use caching
* revert panic
* fix: add panic and fix exit code command
* revert: remove panic
* fix: apply the exit code to all gcp tests
* imp(ci): use single line exit
* Clean up the CODEOWNERS file
We moved the dockerignore file into the docker directory,
so it doesn't need a separate entry any more.
Co-authored-by: teor <teor@riseup.net>
* Create disk image after a successful full sync test
* Extract full sync height and name zebrad cached state with it
* Read 500 lines to extract sync height
* Restrict log query to just the container output and fix regex syntax for ubuntu
* Explicitly search logs in descending time
Co-authored-by: Gustavo Valverde <gustavo@iterativo.do>
* Update changelog for v1.0.0-beta.7
* Increment all crate versions
* Remove redundant release test that is now covered by CI
* Remove completed NU5 README check task from the release template
* Add Merge Freeze tool to the release checklist
* Simplify release checklist by removing unused steps
Only run the state rebuild job if the database format version has (likely) changed.
If we have accidentally changed the format, but not changed the version,
we want to run with the old cached state, so this job fails.
If we change the state path without changing the version,
this job will take a few hours, because it will do a full rebuild.
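The version check behind this condition can be sketched in shell. This is a sketch under stated assumptions, not the workflow's actual step: the helper names are hypothetical, and it assumes the version is declared on a line of the form `pub const DATABASE_FORMAT_VERSION: u32 = <number>;` in the constants file passed as the first argument.

```sh
#!/bin/sh
# Hypothetical helper: print the state version declared in the file "$1".
# Assumes a line like: pub const DATABASE_FORMAT_VERSION: u32 = 25;
local_state_version() {
    sed -n 's/.*DATABASE_FORMAT_VERSION: u32 = \([0-9][0-9]*\);.*/\1/p' "$1"
}

# Hypothetical helper: succeed when the local version "$1" differs from the
# cached state's version "$2", meaning a rebuild is needed.
needs_state_rebuild() {
    [ "$1" != "$2" ]
}
```

Comparing version numbers alone matches the caveat above: a format change that forgets to bump the version is not detected, so the job runs against the old cached state and fails.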
When a PR is created and an image is built in a branch, the cache is also pushed to a `buildcache` tag.
As all PRs are using the same tag, sometimes the cache for a specific branch gets invalidated and makes it take longer on further pushes.
This fix might make the first commit take longer, but further ones will be faster if no changes in the code are applied.
* Update test.patch.yml with lightwalletd job
* Remove a workflow condition that will always be false
In general, patch workflows need the
opposite conditions to the original workflow.
But in this case, we know the result of the
condition will always be true, so we can just delete it.
Co-authored-by: teor <teor@riseup.net>
* fix(actions): use a specific shortening length for SHAs
The rlespinasse/github-slug-action now works without checking out the code, reducing time and improving security for the following actions.
This requires specifying the GITHUB_SHA_SHORT variable length, as git uses 8 by default, but docker uses 7 by default.
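The effect of the shortening length can be illustrated with plain shell. This is only an illustration: `short_sha` is a hypothetical helper, not part of the action, and the full hash is the one quoted in the pipefail revert earlier in this log.

```sh
#!/bin/sh
# Hypothetical helper: print the first "$2" characters of the commit hash "$1".
short_sha() {
    printf '%s\n' "$1" | cut -c "1-$2"
}

full_sha=a7ee37bebdc107a4215e7dd307b189d925969234
short_sha "$full_sha" 7   # 7-character form
short_sha "$full_sha" 8   # 8-character form
```

Two tools that shorten the same commit to different lengths produce different strings, so names built from the short SHA only match when the length is fixed explicitly.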
* fix(actions): target correct rlespinasse/github-slug-action version
* fix(actions): just use major version
* fix(actions): github-slug-action is not being correctly referenced
* refactor(ci): use improved OIDC authentication
* fix(ci): standardize OIDC on all required jobs
* fix: wrong indentation
* fix(ci): remove non-existent dependency in clean job
GitHub's GHA cache gets invalidated at 10 GB, which is very easy to hit when we're building multiple times a day with several commits.
Instead use the registry, which won't get invalidated until a change is identified in the build process.
* refactor(test): reuse same GCP instance on a single PR
This also ensures the deployments are faster, and we only delete the instance when merging or closing the PR, instead of doing it on each push to the PR
* fix(deploy): add zone to updates
* fix: typo
* fix(ci): improve conditions for updates
* fix(deploy): delete old deployments instead of reusing them
* fix(deploy): keep delete command after run
* fix(deploy): always create an instance
* fix(deploy): delete disks on every delete command.
* imp(ci): use better id name
* Update .github/workflows/test.yml
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
* imp: handle errors correctly on deletion
* fix: do not hide valid errors
* fix: edge case where the container is not ready yet
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
* doc(db): fix some comments
* refactor(db): split disk serialization types into their own module
* refactor(db): split the disk format into modules
* doc(db/test): explain the RON serialization format
* fix(ci): only run the full sync test on mergify queue PRs
* fix inconsistent indenting and syntax
* Update .github/workflows/test-full-sync.yml
Co-authored-by: Gustavo Valverde <gustavo@iterativo.do>
* fix: check TEST_FAKE_ACTIVATION_HEIGHTS at runtime
* fix(tests): add TEST_FAKE_ACTIVATION_HEIGHTS variable
This variable ensures the test is activated in the `test-fake-activation-heights` step
* fix(docker): do not run specific tests by default in entrypoint.sh
* fix(test): remove extra TEST_FULL_SYNC argument
* imp(timeout): wait for an average build time
Co-authored-by: Gustavo Valverde <gustavo@iterativo.do>
* fix: add missing job key
* fix(arm64): bump timeout to build without cache
* fix(deployment): apply changes made in 5004c4d3a1
* fix: remove unneeded condition
* refactor(tests): make vm names refer to the test name
This also adds a build step for full sync, as there won't be a reference image when using workflow_dispatch
* fix(deployment): testing depends on the built image
* refactor(test): decouple full sync from other tests
As the full sync only needs to run once, in isolation, we're running this test in a separate workflow, after a PR has been approved.
* fix: revert to previous conditions in job regenerate-stateful-disks
* fix(condition): get disk sha if regeneration is not executed
* fix: typo
* Update .github/workflows/test-full-sync.yml
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
* fix(build): bump build time for arm64
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
* refactor(build): use OCI Image Format Specification for labels
This should also fix when an image gets built multiple times using the cache, as each image differs in labels
* refactor(tags): use PR context sha and ref
Remove the need for the PR Head SHA and Ref, as those can cause conflicts depending on how the branch name has been established
* fix(ci): remove an unused trigger path
* doc(ci): explain lightwalletd trigger paths
* fix(test): check for adityapk00/lightwalletd behaviour in test harness
* fix(ci): work around buildx command error
* fix(ci): revert the workaround
* add(actions): lightwalletd continuous integration
* refactor(actions): build lightwalletd and reuse it in zebra
- Download lightwalletd source code
- Create a new Dockerfile for lightwalletd
- Use lightwalletd binary in Zebra's image
- Create a specific step to build/update lightwalletd
- Add lightwalletd integration test to the test suite
- Remove lightwalletd.yml, as it was harder to control
* refactor(docker): organize Dockerfiles and remove unused
Fixes: #3344
* feat(build): add arm64 support
* fix(build): do not install google-compute-engine in arm64
This package is not available for this platform
* fix(build): do not build arm64 for tests
* fix(condition): indent for better visibility
* fix(condition): wrong use of operators
* ci(test): re-run tests when snapshot data changes
* fix(ci): rebuild state when disk format changes
* fix(ci): rebuild rust docs when code or dependencies change
* doc(ci): explain why we run jobs when files change
Co-authored-by: Gustavo Valverde <gustavo@iterativo.do>
* add(tests): full sync test
* fix(test): add build
* fix(deploy): escape double dashes '--' correctly
* fix(test): remove unexpected --nocapture arg
error: Found argument '--nocapture' which wasn't expected, or isn't valid in this context
* refactor(docker): use default executable as entrypoint
* refactor(startup): add a custom entrypoint
* fix(test): add missing TEST_FULL_SYNC variable
* test(timeout): use the biggest machine
* fix
* fix(deploy): use latest successful image
* typo
* refactor(docker): generate config file at startup
* revert(build): changes were made to docker
* fix(docker): send variables correctly to the entrypoint
* test different conf file approach
* fix(env): add RUN_TEST env variable
* ref: use previous approach
* fix(color): use environment variable
* fix(resources): use our normal machine size
* fix(ci): double CPU and RAM for full sync test
* fix(test): check for zebrad test output in the correct order
The mempool is only activated once, so we must check for that log first.
After mempool activation, the stop regex is logged at least once.
(It might be logged before as well, but we can't rely on that.)
When checking that the mempool didn't activate,
wait for the `zebrad` command to exit,
then check the entire log.
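The ordering rule described above can be sketched as follows. This is a minimal illustration, not the actual test harness: the function names and the log messages in the usage note are hypothetical, and plain substring matching stands in for the regexes the real tests use.

```rust
/// Returns true if `activation` appears in `log`, and `stop` appears
/// at least once after it. Any `stop` match before activation is ignored,
/// because we can't rely on it being logged that early.
/// Both markers are plain substrings here (an assumption of this sketch).
fn activated_then_stopped(log: &str, activation: &str, stop: &str) -> bool {
    match log.find(activation) {
        // Only search the part of the log after the activation message.
        Some(start) => log[start + activation.len()..].contains(stop),
        None => false,
    }
}

/// For the "mempool didn't activate" case: check the entire log,
/// captured after the process has exited.
fn never_activated(full_log: &str, activation: &str) -> bool {
    !full_log.contains(activation)
}
```

For example, `activated_then_stopped(log, "mempool activated", "stop marker")` only succeeds when a stop marker follows the activation message, even if another stop marker was logged earlier.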
* fix(ci): run full sync test with full compiler optimisations
* fix(tests): reintroduce tests and run full sync on approval
* fix(tests): reduce the changelog
Co-authored-by: teor <teor@riseup.net>
* add(actions): lightwalletd continuous integration
* refactor(actions): build lightwalletd and reuse it in zebra
- Download lightwalletd source code
- Create a new Dockerfile for lightwalletd
- Use lightwalletd binary in Zebra's image
- Create a specific step to build/update lightwalletd
- Add lightwalletd integration test to the test suite
- Remove lightwalletd.yml, as it was harder to control
* fix(build): remove extra port being exposed
* fix(lightwalletd): test should be after `--` in cargo test
* revert(lint): do not lint external code as it can be confusing
* fix(test): lightwalletd_integration test is not ignored
* docs(docker): clarify the addition of unused args
* refactor(docker): organize Dockerfiles and remove unused
Fixes: #3344
* fix(actions): activate workflows on correct path changes
* test
* revert previous commit
* feat(build): add arm64 support with cross-compilation (#3659)
* feat(build): add arm64 support
* fix(build): do not install google-compute-engine in arm64
This package is not available for this platform
* fix(build): do not build arm64 for tests
* fix(changes): reduce changelog
* Revert "feat(build): add arm64 support with cross-compilation (#3659)"
This reverts commit 291e00c405.
* feat(codeowners): add code owners in repository
* fix(path): recently split out crate
Co-authored-by: teor <teor@riseup.net>
* fix(teams): use reviewers instead of owners name
* fix(teams): wrong team name
* docs: use correct default explanation
* fix(path): add extra paths to devops team
Co-authored-by: teor <teor@riseup.net>
* refactor(state): move disk_db reads to a new zebra_db module
* refactor(state): make finalized value pool method names consistent
* refactor(state): split database writes into the zebra_db module
* refactor(state): move the block batch method to DiskWriteBatch
* refactor(state): actually add the zebra_db module
Unfortunately, I've lost the interim changes to this file,
so this commit might be the only one that compiles.
* refactor(state): add a newly created file to the cached state CI job
* Run Coverage collection on main
Resolves #3533
* fix(coverage): just run coverage on specific file changes to main
Co-authored-by: Gustavo Valverde <gustavo@iterativo.do>
* fix(dependencies): update an unused duplicate dependency exception
This duplicate was removed by PR #3572, but other duplicates still exist.
* feat(ci): check for duplicate dependencies with optional features off
* fix(mergify, actions): use better names and require tests
* feat(queue): do not update the actual PR, create a draft
Do not allow updating or rebasing the original pull request to check its mergeability. Create a draft pull request instead.
This doesn't add Mergify as a co-author
* feat(queue): do not interrupt already running queues
Our queues might take more than 5 hours even if the priority is low.
Do not allow interrupting the ongoing speculative checks when a pull request with higher priority enters the queue.
* fix(mergify): move 'allow' attributes to queue_rules
* fix(mergify): attributes are not conditions
* refactor(state): split the disk_format module
* refactor(ci): add the new disk_db file to the state CI list
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
* fix(ci): clarify ignored test name
`--include-ignored` runs all tests, including tests
that would normally be ignored.
`-Zunstable-options` enables all unstable options,
but it doesn't do anything by itself.
There is a lot of overlap with "test-all" in this job,
which we might want to fix in a future PR.
* fix(ci): remove unused -Zunstable-options
`--include-ignored` is now stable, so `unstable-options` is not needed.
* fix(test): delete a redundant test
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
* fix(test): use the short SHA from actual run if valid
* fix(test): if condition must evaluate to a single false
* fix(test): do not run logs and upload if not needed
* imp(test): allow test stateful sync after disk regeneration
This is fast enough that it shouldn't do any harm if run just after a ~3 hour test
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Dependabot creates branches with versions using a dot notation, and some tests fail because of this
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
* refactor (cd): overall pipeline improvement
- Use a more ENV configurable Dockerfile
- Remove cloudbuild dependency
- Use compute optimized machine types
- Use SSD instead of normal hard drives
- Move Sentry endpoint to secrets
- Use a single yml for auto & manual deploy
- Migrate to Google Artifact Registry
* refactor (cd): use newer google auth action
* fix (cd): use newer secret as gcp credential
* fix (docker): do not create extra directories
* fix (docker): ignore .github for caching purposes
* fix (docker): use latest rust
* fix (cd): bump build timeout
* fix: use a better name for manual deployment
* refactor (docker): use standard directories for executable
* fix (cd): most systems expect a "latest" tag
Caching from the latest image is one of the main reasons to add this extra tag. Before this commit, the inline cache was not being used.
* fix (cd): push the build image and the cache separately
The inline cache exporter only supports `min` cache mode. To enable `max` cache mode, push the image and the cache separately by using the registry cache exporter.
This also allows for smaller release images.
* fix (cd): remove unused GHA cache
We're leveraging the registry to cache the actions, instead of using the 10GB limits from Github Actions cache storage
* refactor (cd): use cargo-chef for caching rust deps
* fix: move build system deps before cargo chef cook
* fix (release): use newer debian to reduce vulnerabilities
* fix (cd): use same zone, region and service accounts
* fix (cd): use same disk size and type for all deployments
* refactor (cd): activate interactive shells
Use interactive shells for manual and test deployments. This allows greater flexibility if troubleshooting is needed inside the machines
* refactor (test): use docker artifact from registry
Instead of using SSH to build and test inside a VM, build in GHA (to have the logs available), run the workspace tests in GHA, and just run the sync tests in GCP.
Use a container VM with zebra's image directly on it, and pass the parameters needed to run the sync past the mandatory checkpoint.
* tmp (cd): bump timeout for building from scratch
* tmp (test): bump build time
* fix (cd, test): bump build time-out to 210 minutes
* fix (docker): do not build with different settings
Compiling might be slow because different steps are compiling the same code 2-4 times because of the variations
* revert (docker): do not fix the rust version
* fix (docker): build on the root directory
* refactor(docker): Use base image commands and tools
* fix (cd): use correct variables & values, add build concurrency
* fix(cd): use Mainnet instead of mainnet
* imp: remove checkout as Buildkit uses the git context
* fix (docker): only Buildkit uses a .dockerignore in a path
* imp (cd): just use needed variables in the right place
* imp (cd): do not checkout if not needed
* test: run on push
* refactor(docker): reduce build changes
* fix(cd): not checking out was limiting some variables
* refactor(test): add a multistage build stage exclusively for testing
* fix(cd): remove tests as a runtime dependency
* fix(cd): use default service account with cloud-platform scope
* fix(cd): revert checkout actions
* fix: use GA c2 instead of Preview c2d machine types
* fix(actions): remove workflow_dispatch from patched actions
This confuses GitHub, as it can't determine which of the actions using workflow_dispatch is the right one
* fix(actions): remove patches from push actions
* test: validate changes on each push
* fix(test): wrong file syntax on test job
* fix(test): add missing env parameters
* fix(docker): Do not rebuild to download params and run tests
* fix(test): set up gcloud and log in to artifact registry only when needed
Try not to rebuild the tests
* fix(test): use GCP container to sync past mandatory checkpoint
* fix(test): missing separators
* test
* fix(test): mount the available disk
* push
* refactor(test): merge disk regeneration into test.yml
* fix(cd): minor typo fixes
* fix(docker): rebuild on .github changes
* fix(cd): keep compatibility with gcr.io
To prevent conflicts between registries, and migrate when the time is right, we'll keep pushing to both registries and use github actions cache to prevent conflicts between artifacts.
* fix(cd): typo and scope
* fix(cd): typos everywhere
* refactor(test): use smarter docker wait and keep old registry
* fix(cd): do not constrain the CPUs for bigger machines
* revert(cd): reduce PR diff as there's a separate one for tests
* fix(docker): add .github as it has no impact on caching
* fix(test): run command correctly
* fix(test): wait and create image if previous step succeeded
* force rebuild
* fix(test): do not restrict interdependent steps based on event
* force push
* feat(docker): add google OS Config agent
Use a separate step to have better flexibility in case a better approach is available
* fix(test): remove all hardcoded values and increase disks
* fix(test): use correct commands on deploy
* fix(test): use args as required by google
* fix(docker): try not to invalidate zebrad download cache
* fix(test): minor typo
* refactor(test): decouple jobs for better modularity
This also allows faster tests as testing Zunstable won't be a dependency and it can't stop already started jobs if it fails.
* fix(test): Do not try to execute ss and commands in one line
* fix(test): do not show unneeded information in the terminal
* fix(test): sleep before/after machine creation/deletion
* fix(docker): do not download zcash params twice
* feat(docker): add google OS Config agent
Use a separate step to have better flexibility in case a better approach is available
* merge: docker-actions-refactor into docker-test-refactor
* test docker wait scenarios
* fix(docker): $HOME variable is not being expanded
* fix(test): allow docker wait to work correctly
* fix(docker): do not use variables while using COPY
* fix(docker): allow to use zebrad as a command
* fix(cd): use test .yml from main
* fix(cd): Do not duplicate network values
The Dockerfile has an ARG with a default value of 'Mainnet'. If this value is changed, it will be done manually on a workflow_dispatch, making the ENV option an unneeded duplicate in this workflow
* fix(test): use bigger machine type for compute intensive tasks
* refactor(test): add tests in CI file
* fix(test): remove duplicated tests
* fix(test): typo
* test: build on .github changes temporarily
* fix(test): bigger machines have no effect on sync times
* feat: add an image to inherit from with zcash params
* fix(cd): use the right image name and allow push to test
* fix(cd): use the right docker target and remove extra builds
* refactor(docker): use cached zcash params from previous build
* fix(cd): finalize for merging
* imp(cd): add double safety measure for production
* fix(cd): use specific SHA for containers
* fix(cd): use latest gcloud action version
* fix(test): use the network as Mainnet and remove the uppercase from tests
* fix(test): run disk regeneration on specific file change
Just run this regeneration when changing the following files:
https://github.com/ZcashFoundation/zebra/blob/main/zebra-state/src/service/finalized_state/disk_format.rs
https://github.com/ZcashFoundation/zebra/blob/main/zebra-state/src/service/finalized_state.rs
https://github.com/ZcashFoundation/zebra/blob/main/zebra-state/src/constants.rs
* refactor(test): segregate disk regeneration from tests
Allow regenerating disks without running tests, and running tests from a previous disk regeneration.
Disks will be regenerated only if specific files were changed, or if regeneration is triggered manually.
Tests will run only if a disk regeneration was not manually triggered.
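The rules above amount to a small decision table. A minimal sketch, assuming two boolean inputs (whether the listed state files changed, and whether the regeneration was manually triggered); the `Plan` type and `plan` function are hypothetical names for illustration, and the real logic lives in workflow conditions:

```rust
/// What the workflow should do for a given trigger (hypothetical type).
#[derive(Debug, PartialEq)]
struct Plan {
    regenerate_disks: bool,
    run_tests: bool,
}

/// Decide which jobs run, following the commit description:
/// - disks are regenerated if specific files changed, or on a manual trigger
/// - tests run unless a disk regeneration was manually triggered
fn plan(state_files_changed: bool, manually_triggered: bool) -> Plan {
    Plan {
        regenerate_disks: state_files_changed || manually_triggered,
        run_tests: !manually_triggered,
    }
}
```

With these rules, a push that doesn't touch the state files skips regeneration and runs the tests against the disk from a previous regeneration.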
* fix(test): gcp disks require lower case conventions
* fix(test): validate logs being emitted by docker
GHA is also somehow transforming the variable to lowercase, so we're changing it to adapt to that
* test
* fix(test): force tty terminal
* fix(test): use a one line command to test terminal output
* fix(test): always delete test instance
* fix(test): use short SHA from the PR head
Using the SHA from the base creates confusion, and it doesn't match the SHA being shown and used on GitHub.
We have to keep both, as manual runs with `workflow_dispatch` do not have a PR SHA
* fix(ci): do not trigger CI on docker changes
A change to the Dockerfile has no impact on this workflow
* Instead of running cargo test when the instance gets created, run these commands afterwards in a different step.
As the GHA TTY is not working as expected, and workarounds do not play nicely with `gcloud compute ssh` (actions/runner#241 (comment)), we decided to get the container name from the logs, log in directly to the container, and run the cargo command from there.
* doc(test): document reasoning for new steps
* fix(test): increase machine type and ssh timeout
* fix(test): run tests on creation and follow container logs
This allows following logs in the GitHub Actions terminal, while the GCP container is still running.
Just delete the instance when following the logs ends successfully or fails
* finalize(test): do not rebuild image when changing actions
* fix(test): run tests on creation and follow container logs
This allows following logs in the GitHub Actions terminal, while the GCP container is still running.
Just delete the instance when following the logs ends successfully or fails
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
The keyword is `paths` and the actions were using `path`
That's the reason why most actions have been running, and there's been no impact in time savings
* fix(zcash-params): Do not update parameters image on PR
We should not allow a direct dependency of our Docker image to be updated by a PR from anywhere (a local branch or a fork branch) before that change has been approved by a human and merged to #main
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
* refactor (cd): overall pipeline improvement
- Use a more ENV configurable Dockerfile
- Remove cloudbuild dependency
- Use compute optimized machine types
- Use SSD instead of normal hard drives
- Move Sentry endpoint to secrets
- Use a single yml for auto & manual deploy
- Migrate to Google Artifact Registry
* refactor (cd): use newer google auth action
* fix (cd): use newer secret as gcp credential
* fix (docker): do not create extra directories
* fix (docker): ignore .github for caching purposes
* fix (docker): use latest rust
* fix: use a better name for manual deployment
* refactor (docker): use standard directories for executable
* fix (cd): most systems expect a "latest" tag
Caching from the latest image is one of the main reasons to add this extra tag. Before this commit, the inline cache was not being used.
* fix (cd): push the build image and the cache separately
The inline cache exporter only supports `min` cache mode. To enable `max` cache mode, push the image and the cache separately by using the registry cache exporter.
This also allows for smaller release images.
* fix (cd): remove unused GHA cache
We're leveraging the registry to cache the actions, instead of using the 10GB limits from Github Actions cache storage
* refactor (cd): use cargo-chef for caching rust deps
* fix (release): use newer debian to reduce vulnerabilities
* fix (cd): use same zone, region and service accounts
* fix (cd): use same disk size and type for all deployments
* refactor (cd): activate interactive shells
Use interactive shells for manual and test deployments. This allows greater flexibility if troubleshooting is needed inside the machines
* fix (docker): do not build with different settings
Compiling might be slow because different steps are compiling the same code 2-4 times because of the variations
* fix(cd): use Mainnet instead of mainnet
* fix(docker): remove tests as a runtime dependency
* fix(cd): use default service account with cloud-platform scope
* fix(cd): keep compatibility with gcr.io
To prevent conflicts between registries, and migrate when the time is right, we'll keep pushing to both registries and use github actions cache to prevent conflicts between artifacts.
* fix(docker): do not download zcash params twice
* feat(docker): add google OS Config agent
Use a separate step to have better flexibility in case a better approach is available
* fix(docker): allow to use zebrad as a command
* feat: add an image to inherit from with zcash params
* refactor(docker): use cached zcash params from previous build
* imp(cd): add double safety measure for production
* style: use global variables and don't double print
Remove repeated instances of global environment variables. Do not print ENV variables on the terminal as GitHub Actions already shows it.
* fix (actions): Use fixed major versions for actions
As actions get recurrent fixes, using a specific version causes more maintenance on the pipelines.
On the other hand, using @master versions could make some action unreliable, as breaking changes might be included without further notice, and even change behavior on a daily basis.
* refactor: make better use of ENV variables
A whole step with regex was being used to extract different variables from GitHub's environment. This gets deprecated in favor of using `rlespinasse/github-slug-action@v4`, which has slug URL variables.
A SLUG on a variable will:
- put the variable content in lower case
- replace any character with - except 0-9, a-z, ., and _
- remove leading and trailing - characters
- limit the string size to 63 characters
This change also takes care of using the Head or Base branch for deployments. This will allow us to merge workflows, as most steps in these deployment actions are very similar, with little variation between workflows.
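The slug rules listed above can be illustrated with a short sketch. This is not the action's implementation: `slug` is a hypothetical helper, and applying the four rules in the order they are listed is an assumption.

```rust
/// Hypothetical helper that applies the four slug rules in listed order.
fn slug(input: &str) -> String {
    let replaced: String = input
        // Rule 1: put the content in lower case.
        .to_lowercase()
        .chars()
        // Rule 2: keep 0-9, a-z, '.', and '_'; everything else becomes '-'.
        .map(|c| {
            if c.is_ascii_lowercase() || c.is_ascii_digit() || c == '.' || c == '_' {
                c
            } else {
                '-'
            }
        })
        .collect();

    replaced
        // Rule 3: remove leading and trailing '-' characters.
        .trim_matches('-')
        .chars()
        // Rule 4: limit the result to 63 characters.
        .take(63)
        .collect()
}
```

For example, `slug("Feature/My Branch!")` gives `feature-my-branch`, and a Dependabot-style name such as `dependabot/cargo/tokio-1.17.0` keeps its dots.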
* fix (actions): use secrets for sensitive information
* revert: use specific versions for dependabot
Reverting commit 8c93409902
* Segregate linting jobs from CI workflow
Lint on push to all branches, except for main, as this action will be required to merge.
Just run the lint action when a Rust file is changed, as it won't make sense to run it in other scenarios.
DRY with unneeded jobs
* Make actions dependent on changed files or folders
* Fix & add missing paths
* Revert changes removing cargo.lock and deny.toml checks
Also refactor this to use a more readable and change prone cargo-deny-action. And move these actions out of the clippy-deps job, as they are more related to CI than linting.
* Fix wrong indentation
* Add new configuration file from #3386
* Do not fail on licenses as this configuration is missing
* Do not add advisories features
Add advisories checks in a different PR
* Allow tests and coverage on PR series
If we only run CI on branches that are going to merge to main, then PR series become a lot harder to test. (Because each PR is based on the previous PR, not main.)
This causes the Mergify bot to commit to each PR, being also included in the squashed merge as an author.
As the queue merges the head branch (main) to latest tip before testing with the CI, having all those feature branches constantly updating with Mergify is not needed
* fix: Use the correct conditions and merge method
Mergify's Status Checks conditions are based on the job name, not the workflow name. As our workflows have dynamic names, each variant must be considered.
Squash merges are the default being used in the Zebra repo, so mergify must comply with this configuration.
Use condition operators for labels in each pull request rule; previously it was expecting both labels to be set. And update names accordingly.
* fix: Allow mergify to merge dependabot PRs
Also adapt dependabot's configuration to use the recently adapted labels
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
* Add mergify merging queues
* Fix Mergify invalid configuration
* Improve adaptability to the actual workflow
Do not merge if the pull-request test is not green. Do not move draft PRs to the queue. And keep all open PRs updated.
* Fix a typo on check-success condition
* Move `MockedClientHandle` to `peer` module
It's more closely related to a `Client` than the `PeerSet`, and this
prepares it to be used by other tests.
* Rename `MockedClientHandle` to `ClientTestHarness`
Reduce confusion, and clarify that the client is not mocked.
Co-authored-by: teor <teor@riseup.net>
* Add clarification to `mock_peers` documentation
Explicitly say how the generated data is returned.
* Rename method to `wants_connection_heartbeats`
The `Client` service only represents one direction of a connection, so
`is_connected` is not the exact term.
Co-authored-by: teor <teor@riseup.net>
* Mock `Client` instead of `LoadTrackedClient`
Move where the conversion from mocked `Client` to mocked
`LoadTrackedClient` happens, in order to make the test helper more easily
used by other tests.
* Use `ClientTestHarness` in `initialize` tests
Replace the boilerplate code to create a fake `Client` instance with
usages of the `ClientTestHarness` constructor.
* Allow receiving requests from `Client` instance
Create a helper type to wrap the result, to make it easier to assert on
specific events after trying to receive a request.
* Allow inspecting the current error in the slot
Share the `ErrorSlot` between the `Client` and the handle, so that the
handle can be used to inspect the contents of the `ErrorSlot`.
* Allow placing an error into the `ErrorSlot`
Assuming it is initially empty. If it already has an error, the code
will panic.
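The shared-slot behavior described in the two commits above can be sketched like this. It is a simplified stand-in, not the real `ErrorSlot`: the `String` error type and the method names `current_error` and `set_error` are assumptions made for illustration.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical, simplified stand-in for the real `ErrorSlot`.
/// Cloning it shares the same underlying slot, so a test handle
/// can inspect what the client side sees.
#[derive(Clone, Default)]
struct ErrorSlot(Arc<Mutex<Option<String>>>);

impl ErrorSlot {
    /// Returns a copy of the current error, if there is one.
    fn current_error(&self) -> Option<String> {
        self.0.lock().expect("error slot mutex was poisoned").clone()
    }

    /// Places `error` into the slot, assuming it is initially empty.
    /// Panics if the slot already holds an error.
    fn set_error(&self, error: &str) {
        let mut slot = self.0.lock().expect("error slot mutex was poisoned");
        assert!(slot.is_none(), "unexpected earlier error in error slot");
        *slot = Some(error.to_string());
    }
}
```

A test would keep one clone as its handle and give the other to the code under test; an error set through either clone is visible through both.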
* Allow gracefully closing the request receiver
Close the endpoint with the appropriate call to the `close()` method.
* Allow dropping the request receiver endpoint
Forcefully closes the endpoint.
* Rename field to `client_request_receiver`
Also rename the related methods to include
`outbound_client_request_receiver` to make it more precise.
Co-authored-by: teor <teor@riseup.net>
* Allow dropping the heartbeat shutdown receiver
Allows the `Client` to detect that the channel has been closed.
* Rename fn. to `drop_heartbeat_shutdown_receiver`
Make it clear that it affects the heartbeat task.
Co-authored-by: teor <teor@riseup.net>
* Move `NowOrLater` into a new `now-or-later` crate
Make it easily accessible to other crates.
* Add `IsReady` extension trait for `Service`
Simplifies checking if a service is immediately ready to be called.
* Add extension method to check for readiness error
Checks if the `Service` isn't immediately ready because a call to
`ready` immediately returns an error.
* Rename method to `is_failed`
Avoid negated method names.
Co-authored-by: teor <teor@riseup.net>
* Add an `IsReady::is_pending` extension method
Checks if a `Service` is not ready to be called.
* Use `ClientTestHarness` in `Client` test vectors
Reduce repeated code and try to improve readability.
* Create a new `ClientTestHarnessBuilder` type
A builder to create test `Client` instances using mock data which can be
tracked and manipulated through a `ClientTestHarness`.
* Allow configuring the `Client`'s mocked version
Add a `with_version` builder method.
* Use `ClientTestHarnessBuilder` in `PeerVersions`
Use the builder to set the peer version, so that the `version` parameter
can be removed from the constructor later.
* Use a default mock version where possible
Reduce noise when setting up the harness for tests that don't really
care about the remote peer version.
* Remove `Version` parameter from the `build` method
The `with_version` builder method should be used instead.
* Fix some typos and outdated info in the release checklist
* Add extra client tests for zero and multiple readiness checks (#3273)
And document existing tests.
* Replace `NowOrLater` with `futures::poll!` (#3272)
* Replace NowOrLater with the futures::poll! macro in zebrad
* Replace NowOrLater with the futures::poll! macro in zebra-test
* Remove the now-or-later crate
* remove unused imports
* rustfmt
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
Zebra's latest beta continues implementing zero-knowledge proof and note commitment tree validation. In this release, we have finished implementing transaction header, transaction amount, and Zebra-specific NU5 validation. (NU5 mainnet validation is waiting on an `orchard` crate update, and some consensus parameter updates.)
We also fix a number of security issues that could pose a local denial of service risk, or make it easier for an attacker to make a node follow a false chain.
As of this release, Zebra will automatically download and cache the Sprout and Sapling Groth16 circuit parameters. The cache uses around 1 GB of disk space. These cached parameters are shared across all Zebra and `zcashd` instances run by the same user.
See CHANGELOG.md for the full list of changes in this release.
* Download and load Sprout parameters using zcash_proofs
Also update some librustzcash dependencies, to avoid duplicate dependencies.
* Update upstream orchard to avoid a compilation error
* Skip librustzcash batch refactor for now, to avoid compilation errors
* Change the cache ID, so we actually cache Sprout
* Move existing file checks into zcash_proofs
* Add a 1 hour timeout to parameter file downloads
* Give other tasks priority, before spawning the download task
* Update to the latest version of our modified librustzcash fork
* Change the cache key for Sprout
* Add 40 minutes to CI timeouts for occasional sprout downloads
* Update to zcash_proofs with split downloads
* Check file sizes to help debug parameter load failures in zcash_proofs
* Start the second download once the first has finished in zcash_proofs
* Document the parameter download task
* Stop hashing existing files twice
* Move dependency checks to the clippy job
* Split the fake activation heights into their own job
* Fix expected types
* Minimise proptest cases on Windows, macOS, and coverage
We don't expect proptests to fail on different platforms.
* Replace Zcash parameters crates with pre-downloaded local parameter files
* Download Zcash parameters using the `zcashd` script in CI and Docker
* Add a zcash_proofs dependency to zebra-consensus
* Download Sapling parameters using zcash_proofs, rather than fetch-params.sh
* Add a new `zebrad download` subcommand
This command isn't required for normal usage.
But it's useful when testing, or launching multiple Zebra instances.
* Use `zebrad download` in CI to pre-download parameters
* Log a helpful hint if downloading fails
* Allow some duplicate dependencies currently hidden by orchard
* Spawn a separate task to download Groth16 parameters
* Run the parameter download with code coverage
This avoids re-compiling Zebra with and without coverage.
* Update Cargo.lock after rebase
* Try to pass `download` as an argument to `zebrad` in coverage CI
* Fix copy and paste comment typos
* Add path and download examples, like zcash_proofs
* Download params in CI just like zcash_proofs does
* Delete a redundant build step
* Implement graceful shutdown for zebrad start
* Send coverage summary to /dev/null when getting the params path
* Use the correct parameters path and download commands in CI
* Explain pre-downloads
* Avoid calling params_folder twice
* Rename parameter types and methods for consistency
```sh
fastmod SaplingParams SaplingParameters zebra*
fastmod Groth16Params Groth16Parameters zebra*
fastmod PARAMS GROTH16_PARAMETERS zebra*
fastmod params_folder directory zebra*
```
And a manual variable name tweak.
* rustfmt
* Remove a redundant coverage step
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
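
The parameter download changes above run the download as a separate task and cap it with a 1 hour timeout. As a rough illustration of that shape only, here is a blocking, std-only stand-in that uses a thread and a channel instead of an async task. `run_with_timeout` is a hypothetical name, and the real code presumably relies on the async runtime's own timeout support rather than anything like this.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical helper: runs `job` on its own thread and waits at most
/// `limit` for its result. Returns `None` if the job has not finished in
/// time, or if it panicked before sending a result.
fn run_with_timeout<T, F>(job: F, limit: Duration) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (sender, receiver) = mpsc::channel();
    thread::spawn(move || {
        // After a timeout the receiver is gone, so a failed send is expected
        // and can be ignored.
        let _ = sender.send(job());
    });
    receiver.recv_timeout(limit).ok()
}
```

With a one hour limit, a call would look like `run_with_timeout(download, Duration::from_secs(60 * 60))`, where `download` is whatever closure fetches the files. Note that this sketch only stops waiting: the worker thread keeps running after a timeout, which is one reason a cancellable async task is a better fit for a long download.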
* Update `tower` to version `0.4.9`
Update to latest version to add support for Tokio version 1.
* Replace usage of `ServiceExt::ready_and`
It was deprecated in favor of `ServiceExt::ready`.
* Update Tokio dependency to version `1.13.0`
This will break the build because the code isn't ready for the update,
but future commits will fix the issues.
* Replace import of `tokio::stream::StreamExt`
Use `futures::stream::StreamExt` instead, because newer versions of
Tokio don't have the `stream` feature.
* Use `IntervalStream` in `zebra-network`
In newer versions of Tokio `Interval` doesn't implement `Stream`, so the
wrapper types from `tokio-stream` have to be used instead.
* Use `IntervalStream` in `inventory_registry`
In newer versions of Tokio the `Interval` type doesn't implement
`Stream`, so `tokio_stream::wrappers::IntervalStream` has to be used
instead.
* Use `BroadcastStream` in `inventory_registry`
In newer versions of Tokio `broadcast::Receiver` doesn't implement
`Stream`, so `tokio_stream::wrappers::BroadcastStream` has to be used
instead. This also requires changing the error type that is used.
* Handle `Semaphore::acquire` error in `tower-batch`
Newer versions of Tokio can return an error if the semaphore is closed.
This shouldn't happen in `tower-batch` because the semaphore is never
closed.
* Handle `Semaphore::acquire` error in `zebrad` test
On newer versions of Tokio `Semaphore::acquire` can return an error if
the semaphore is closed. This shouldn't happen in the test because the
semaphore is never closed.
* Update some `zebra-network` dependencies
Use versions compatible with Tokio version 1.
* Upgrade Hyper to version 0.14
Use a version that supports Tokio version 1.
* Update `metrics` dependency to version 0.17
And also update the `metrics-exporter-prometheus` to version 0.6.1.
These updates are to make sure Tokio 1 is supported.
* Use `f64` as the histogram data type
`u64` isn't supported as the histogram data type in newer versions of
`metrics`.
* Update the initialization of the metrics component
Make it compatible with the new version of `metrics`.
* Simplify build version counter
Remove all constants and use the new `metrics::increment_counter!` macro.
* Change metrics output line to match on
The snapshot string isn't included in the newer version of
`metrics-exporter-prometheus`.
* Update `sentry` to version 0.23.0
Use a version compatible with Tokio version 1.
* Remove usage of `TracingIntegration`
This seems to not be available from `sentry-tracing` anymore, so it
needs to be replaced.
* Add sentry layer to tracing initialization
This seems like the replacement for `TracingIntegration`.
* Remove unnecessary conversion
Suggested by a Clippy lint.
* Update Cargo lock file
Apply all of the updates to dependencies.
* Ban duplicate tokio dependencies
Also ban git sources for tokio dependencies.
* Stop allowing sentry-tracing git repository in `deny.toml`
* Allow remaining duplicates after the tokio upgrade
* Use C: drive for CI build output on Windows
GitHub Actions uses a Windows image with two disk drives, and the
default D: drive is smaller than the C: drive. Zebra currently uses a
lot of space to build, so it has to use the C: drive to avoid CI build
failures because of insufficient space.
Co-authored-by: teor <teor@riseup.net>
* Add default deny.toml for "cargo deny check bans"
`cargo deny init`
* Delete unused "cargo deny" config entries
Also clean up trailing whitespace.
* Deny duplicate crates and unexpected crate sources
Allow the current set of duplicates and sources,
with references to the tickets that will fix them.
* Check for duplicate dependencies in CI
Also check for:
- unexpected crate sources
- outdated Cargo.lock
(required for accurate duplicate and source checks)
* Revert CI name changes so required statuses pass
* Fix ticket for sentry-tracing
* ZIP-401 weighted random mempool eviction
* rename zcash.mempool.total_cost.bytes to zcash.mempool.cost.bytes
Co-authored-by: teor <teor@riseup.net>
* Remove duplicated lines
* Add cost() method to UnminedTx
Update serialization failure messages
* More docs quoting ZIP-401 rules
* Change mempool::Storage::new() to handle Copy-less HashMap, HashSet
* mempool: tidy cost types and evict_one()
* More consensus rule docs
* Refactor calculating mempool costs for Unmined transactions
* Add a note on asymptotic performance of calculating weights of txs in mempool
* Bump test mempool / storage config to avoid weighted random cost limits
* Use mempool tx_cost_limit = u64::MAX for some tests
* Remove failing tests for now
* Allow(clippy::field-reassign-with-default) because of a move on a type that doesn't impl Copy
* Fix mistaken doctest formatting
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* Increase test timeout for Windows builds
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
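
The ZIP-401 work above boils down to three pieces: a per-transaction cost, an eviction weight, and a weighted random pick that repeats until the mempool's total cost fits under the limit. The following is a minimal sketch of that idea, not Zebra's implementation. `MempoolTx`, `pick_eviction_index`, and `evict_until_under_limit` are hypothetical names, the two constants are the values I recall from ZIP-401 and should be checked against the ZIP, and the random roll is passed in by the caller so the logic stays deterministic.

```rust
/// Assumed ZIP-401 constants; verify against the ZIP before relying on them.
const MIN_COST: u64 = 4_000;
const LOW_FEE_PENALTY: u64 = 16_000;

/// Hypothetical, simplified stand-in for an unmined transaction.
struct MempoolTx {
    serialized_size: u64,
    pays_low_fee: bool,
}

impl MempoolTx {
    /// Cost: the serialized size, with a floor so tiny transactions aren't free.
    fn cost(&self) -> u64 {
        self.serialized_size.max(MIN_COST)
    }

    /// Eviction weight: the cost, plus a penalty for low-fee transactions.
    fn eviction_weight(&self) -> u64 {
        self.cost() + if self.pays_low_fee { LOW_FEE_PENALTY } else { 0 }
    }
}

/// Picks the index to evict. `roll` should be uniform in `0..total_weight`;
/// an out-of-range roll (or an empty slice) returns `None`.
fn pick_eviction_index(txs: &[MempoolTx], roll: u64) -> Option<usize> {
    let mut remaining = roll;
    for (index, tx) in txs.iter().enumerate() {
        let weight = tx.eviction_weight();
        if remaining < weight {
            return Some(index);
        }
        remaining -= weight;
    }
    None
}

/// Evicts transactions until the total cost is at most `cost_limit`.
/// `roll_below(n)` must return a value in `0..n`. Returns the eviction count.
fn evict_until_under_limit(
    txs: &mut Vec<MempoolTx>,
    cost_limit: u64,
    mut roll_below: impl FnMut(u64) -> u64,
) -> usize {
    let mut evicted = 0;
    while txs.iter().map(MempoolTx::cost).sum::<u64>() > cost_limit {
        let total_weight: u64 = txs.iter().map(MempoolTx::eviction_weight).sum();
        match pick_eviction_index(txs, roll_below(total_weight)) {
            Some(index) => {
                txs.remove(index);
                evicted += 1;
            }
            None => break,
        }
    }
    evicted
}
```

This sketch recomputes both sums on every pass, so each eviction is linear in the number of mempool transactions. That is the kind of cost the note on asymptotic performance in the list above refers to; a real implementation could cache the running totals.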
Also only run the zebrad acceptance tests on macOS.
Re-running the compiler and test binaries for unused crates is slow in CI.
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
* Add validation of ZIP-221 and ZIP-244 commitments
* Apply suggestions from code review
Co-authored-by: teor <teor@riseup.net>
* Add auth commitment check in the finalized state
* Reset the verifier when committing to state fails
* Add explanation comment
* Add test with fake activation heights
* Add generate_valid_commitments flag
* Enable fake activation heights using env var instead of feature
* Also update initial_tip_hash; refactor into progress_from_tip()
* Improve comments
* Add fake activation heights test to CI
* Fix bug that caused commitment trees to not match when generating partial arbitrary chains
* Add ChainHistoryBlockTxAuthCommitmentHash::from_commitments to organize and deduplicate code
* Remove stale comment, improve readability
* Allow overriding with PROPTEST_CASES
* partial_chain_strategy(): don't update note commitment trees when not needed; add comment
Co-authored-by: teor <teor@riseup.net>
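
The `PROPTEST_CASES` override mentioned above follows a common pattern: use the environment variable when it holds a valid number, and fall back to a built-in default otherwise. A minimal sketch of that pattern, assuming a made-up default of 16 cases and hypothetical function names (the actual defaults and helper names in the test code may differ):

```rust
/// Hypothetical default, used when the variable is unset or not a number.
const DEFAULT_CASES: u32 = 16;

/// Parses an optional raw value, falling back to `DEFAULT_CASES`.
/// Kept separate from the environment lookup so it is easy to test.
fn cases_from(raw: Option<&str>) -> u32 {
    raw.and_then(|value| value.trim().parse::<u32>().ok())
        .unwrap_or(DEFAULT_CASES)
}

/// Reads `PROPTEST_CASES` from the environment, if it is set.
fn proptest_cases() -> u32 {
    cases_from(std::env::var("PROPTEST_CASES").ok().as_deref())
}
```

Under these assumptions, running the tests with `PROPTEST_CASES=64` set makes `proptest_cases()` return 64, while an unset or malformed value yields the default.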
* Only use -t flag to docker run, set SSH keep alive
* Remove SSH flag for now
* Add ssh flag back to test.yml gcloud compute ssh command
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
* Update versions for zebra v1.0.0-alpha.12 release
* Update Cargo.lock
* Update release checklist with latest version changes to help keep track for future releases
* Remove reference to the fact that tower-fallback was not updated
Previously, Zebra's cached state workflows would run all of Zebra's
tests, but they would ignore the results for most tests. They would only
fail if the mainnet cached state test failed.
After this fix, the tests fail if any test or build step fails.
* Add new CHANGELOG.md file to zebra git repo
* Update Release Checklist to add updates to CHANGELOG.md
* Add some explanation about the CHANGELOG.md file
* Fix headings to make them consistent with Keep a changelog format
* Small fix for clarity
* Add release dates to changelog
* Change order of steps to update the changelog
Updates:
- GitHub Issue templates
- GitHub PR templates
- RFC template
Focusing on:
- consensus rule / network reference sections
- design sections
- review/test checklist
Process changes:
- add new team members to RFC approval
- change RFC approval to "most of the team"
And general cleanup:
- delete docs from the checklist, because we now `warn(missing_docs)`
- shorter explanations
- consistent headings
- consistent order
- consistent formatting
* Remove checkout credentials from CD action
* Remove checkout credentials from CI action
* Remove checkout credentials from coverage action
* Remove checkout credentials from docs action
* Remove checkout credentials from manual deploy action
* Remove checkout credentials from test action
* Remove checkout credentials from zcashd action
Previously, Zebra made ci-success a required check for merges to main. And then we made ci-success depend on a bunch of other CI checks.
But this doesn't work as expected, because if the dependent checks fail, ci-success is skipped, and the branch protection rules allow the branch to be merged to main.
* build(deps): bump vergen from 3.2.0 to 5.1.1
* fix hardcoded version for Tracing struct
* add additional metadata
* remove extra allocations for metadata
* Remove zebrad code version from release checklist
The zebrad code automatically uses the crate version now.
* Sort panic metadata into rough categories
Co-authored-by: teor <teor@riseup.net>
Use Powershell syntax to set ZEBRA_SKIP_NETWORK_TESTS on Windows.
Also skip the entire large sync test step on Ubuntu and
Windows, because the tests are skipped anyway due to
ZEBRA_SKIP_NETWORK_TESTS. This saves some
compilation time.
We used to always run the CI workflow on push/merge to #main and at some point stopped;
we still link to the status of this workflow on #main from our README. I think we should bring it back.
Also allows manual triggering of the workflow, which can come in handy if you are working
on a branch but haven't opened a PR yet.
* remove windows conditional
* fully separate tests from large tests
* add rust beta to new large test jobs
* increase build time for windows
* disable cargo incremental compilation
* Add draft PR template for release checklist
* Add some notes about keepachangelog categories
* Add excessive detail to the release checklist
We want to be very clear about the process now,
so we get consistency, and so other developers
can follow it.
Eventually, these details should move to the
developer book.
* Add links for Release Drafter and the GitHub Releases location
Co-authored-by: teor <teor@riseup.net>
* temporarily disable sync_large_checkpoints from CI
* Allow large checkpoint sync tests only on ubuntu and macOS
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
* switch to new llvm source based coverage
* upload artifact and simplified
* filter out irrelevant dependency coverage
* enable the correct filters on coverage
* correctly specify all binaries
* remove sparse flag from coverage
* update the coverage script organization
* fix typo in coverage script
* Create and mount persistent disk to store zebrad state, update runner container config to use
* Enable checkpoint sync in zebrad image config
* Lower state memory cache from 500MB to 50MB
* Upgrade host to n2-standard-4
* Bump zebrad-cache disk size to 100GB
* Copy zebrad as the tests are compiled with a hardcoded path to it
* Rename all debug binaries for easy invocation
* Name state cache disk, use the correct path to binaries
* Create volume and all that jazz on instance creation
Otherwise there are a lot of on-instance commands to run, which this shortcut handles for us.
* Explicitly mount the state cache and clean up test instance
* Wait for zebra-test container to start then attach
* Always clean up even if the tests step fails
* Keep fast sleep but only print 'waiting' once
* Add review guidelines to the default PR template
* Apply suggestions
Co-authored-by: teor <teor@riseup.net>
* Add a Follow Up Work section to the PR template
* Mention design RFCs in the PR template
* Put key PR review questions in bold
* Tweak PR review "skip task" process
* Update .github/pull_request_template.md
Co-authored-by: teor <teor@riseup.net>
* Shorter alternative pull request template
Add Review and Follow Up sections
Add a checklist for documentation and tests
Co-authored-by: Alfredo Garcia <oxarbitrage@gmail.com>
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
* Run large checkpoint sync tests in CI
* Improve test child output match error context
* Add a debug_stop_at_height config
* Use stop at height in acceptance tests
And add some restart acceptance tests, to make sure the stop at
height feature works correctly.
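
To make the stop-at-height behavior concrete, here is a small sketch of how such a setting could be checked. Only the field name `debug_stop_at_height` comes from the commit above. The `SyncConfig` struct, the `should_stop` function, and the choice to stop once the committed height reaches the configured height (inclusive) are assumptions made for illustration.

```rust
/// Hypothetical, trimmed-down config; only the field name is from the source.
#[derive(Default)]
struct SyncConfig {
    /// Stop once this height has been committed; `None` means keep syncing.
    debug_stop_at_height: Option<u32>,
}

/// Returns `true` if the node should stop after committing `committed_height`.
/// Assumes the stop height is inclusive.
fn should_stop(config: &SyncConfig, committed_height: u32) -> bool {
    config
        .debug_stop_at_height
        .map_or(false, |stop_height| committed_height >= stop_height)
}
```

An acceptance test built on this could sync to a low height, restart, and check that the node stops again straight away, which is roughly what the restart tests above are described as doing.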
* export proptest impls for use in downstream crates
* add testjob for disabled feature in zebra-chain
* run rustfmt
* try to fix github actions syntax
* differentiate name
* prove that github action tests zebra-chain build without features
* revert change from last commit now that test is running
* remove accidentally introduced newline
* Update .github/workflows/ci.yml
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
* attempt to use zcashconsensus crate in zebra-script
* boop
* update verify fn to use zebra types
* a bit more cleanup
* cleanup
* more
* beep boop
* fix renamed member
* cleaning
* get a real branch id
* remove as of yet unneeded api
* Update zebra-chain/src/transaction.rs
* Update zebra-chain/src/transaction.rs
* more cleanup
* oops wrong dep section
* use a tuple to communicate arg association
* update to use published version of zcash_script
* fix new compiler error
* install llvm on windows
* fix bindgen bug????
* try to get docker file to win
* okay try everything
* fix windows build maybe
* always download choco
* fix paths for moved types
* try a different error message
* try convenience script
* try installing just llvm
* add back one more
* try installing some headers
* try a diff package
* try everything
* remove the minimum
* try newer docker builder image
* cleanup docker image
* cleanup extra ci step