* refactor(test): decouple full sync from other tests
As the full sync requires to be run just once and isolated, we're running this test in a separate workflow, after a PR has been approved.
* fix: revert to previous conditions in job regenerate-stateful-disks
* fix(condition): get disk sha if regeneration is not executed
* fix: typo
* Update .github/workflows/test-full-sync.yml
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
* fix(build): bump build time for arm64
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
* refactor(build): use OCI Image Format Specification for labels
This should also fix when an image gets built multiple times using the cache, as each image differs in labels
* refactor(tags): use PR context sha and ref
Remove the needed of PR Head SHA and Ref, as those can cause conflict depending on how the branch name has been established
* fix(ci): remove an unused trigger path
* doc(ci): explain lightwalletd trigger paths
* fix(test): check for adityapk00/lightwalletd behaviour in test harness
* fix(ci): work around buildx command error
* fix(ci): revert the workaround
* add(actions): lightwalletd continous integrations
* refactor(actions): build lightwalletd and reuse it in zebra
- Download lightwalletd source code
- Create a new Dockerfile for lightwalletd
- Use lightwalletd binary in Zebra's image
- Create a specific step to build/update lightwalletd
- Add lightwalletd integration test to the test suite
- Remove lightwalletd.yml, as it was harder to control
* refactor(docker): organize Dockerfiles and remove unused
Fixes: #3344
* feat(build): add arrm64 support
* fix(build): do not install google-compute-engine in arm64
This package is not available for this platform
* fix(build): do not build arm64 for tests
* fix(condition): indent for better visibility
* fix(condition): wrong use of operators
* ci(test): re-run tests when snapshot data changes
* fix(ci): rebuild state when disk format changes
* fix(ci): rebuild rust docs when code or dependencies change
* doc(ci): explain why we run jobs when files change
Co-authored-by: Gustavo Valverde <gustavo@iterativo.do>
* add(tests): full sync test
* fix(test): add build
* fix(deploy): escape double dashes '--' correctly
* fix(test): remove unexpected --no-capture arg
error: Found argument '--nocapture' which wasn't expected, or isn't valid in this context
* refactor(docker): use default executable as entrypoint
* refactor(startup): add a custom entrypoint
* fix(test): add missing TEST_FULL_SYNC variable
* test(timeout): use the biggest machine
* fix
* fix(deploy): use latest successful image
* typo
* refactor(docker): generate config file at startup
* revert(build): changes were made to docker
* fix(docker): send variables correctly to the entrypoint
* test different conf file approach
* fix(env): add RUN_TEST env variable
* ref: use previous approach
* fix(color): use environment variable
* fix(resources): use our normal machine size
* fix(ci): double CPU and RAM for full sync test
* fix(test): check for zebrad test output in the correct order
The mempool is only activated once, so we must check for that log first.
After mempool activation, the stop regex is logged at least once.
(It might be logged before as well, but we can't rely on that.)
When checking that the mempool didn't activate,
wait for the `zebrad` command to exit,
then check the entire log.
* fix(ci): run full sync test with full compiler optimisations
* fix(tests): reintroduce tests and run full sync on approval
* fix(tests): reduce the changelog
Co-authored-by: teor <teor@riseup.net>
* add(actions): lightwalletd continous integrations
* refactor(actions): build lightwalletd and reuse it in zebra
- Download lightwalletd source code
- Create a new Dockerfile for lightwalletd
- Use lightwalletd binary in Zebra's image
- Create a specific step to build/update lightwalletd
- Add lightwalletd integration test to the test suite
- Remove lightwalletd.yml, as it was harder to control
* fix(build): remove extra port being exposed
* fix(lightwalletd): test should be after `--` in cargo test
* revert(lint): do not lint external code as it can be confusing
* fix(test): lightwalletd_integration test is not ignored
* docs(docker): clarify the addition of unused args
* refactor(docker): organize Dockerfiles and remove unused
Fixes: #3344
* fix(actions): activate workflows on correct path changes
* test
* revert previous commit
* feat(build): add arm64 support with cross-compilation (#3659)
* feat(build): add arrm64 support
* fix(build): do not install google-compute-engine in arm64
This package is not available for this platform
* fix(build): do not build arm64 for tests
* fix(changes): reduce changelog
* Revert "feat(build): add arm64 support with cross-compilation (#3659)"
This reverts commit 291e00c405.
* feat(codeowners): add code owners in repository
* fix(path): recently split out crate
Co-authored-by: teor <teor@riseup.net>
* fix(teams): use reviewers instead of owners name
* fix(teams): wrong team name
* docs: use correct default explanation
* fix(path): add extra paths to devops team
Co-authored-by: teor <teor@riseup.net>
* refactor(state): move disk_db reads to a new zebra_db module
* refactor(state): make finalized value pool method names consistent
* refactor(state): split database writes into the zebra_db module
* refactor(state): move the block batch method to DiskWriteBatch
* refactor(state): actually add the zebra_db module
Unfortunately, I've lost the interim changes to this file,
so this commit might be the only one that compiles.
* refactor(state): add a newly created file to the cached state CI job
* Run Coverage collection on main
Resolves#3533
* fix(coverage): just run coverage on specific file changes to main
Co-authored-by: Gustavo Valverde <gustavo@iterativo.do>
* fix(dependencies): update an unused duplicate dependency exception
This duplicate was removed by PR #3572, but other duplicates still exist.
* feat(ci): check for duplicate dependencies with optional features off
* fix(mergify, actions): use better names and require tests
* feat(queue): do not update the actual PR, create a draft
Do not allow to update/rebase the original pull request to check its mergeability. Create a draft pull request instead.
This doesn't add Mergify as a co-author
* feat(queue): do not interrupt already running queues
Our queues might take more than 5 hours even if the priority is low.
Do not allow interrupting the ongoing speculative checks when a pull request with higher priority enters in the queue.
* fix(mergify): move 'allow' attributes to queue_rules
* fix(mergify): attributes are not conditions
* refactor(state): split the disk_format module
* refactor(ci): add the new disk_db file to the state CI list
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
* fix(ci): clarify ignored test name
`--include-ignored` runs all tests, including tests
that would normally be ignored.
`-Zunstable-options` enables all unstable options,
but it doesn't do anything by itself.
There is a lot of overlap with "test-all" in this job,
which we might want to fix in a future PR.
* fix(ci): remove unused -Zunstable-options
`--include-ignored` is now stable, so `unstable-options` is not needed.
* fix(test): delete a redundant test
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
* fix(test): use the short SHA from actual run if valid
* fix(test): if condition must evaluate to a single false
* fix(test): do not run logs and upload if not needed
* imp(test): allow test stateful sync after disk regeneration
This takes is fast enough, so it shouldn't do any harm if run just after a ~3 hours test
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Dependabot creates branches with versions using a dot notation, and some tests fails because of this
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
* refactor (cd): overall pipeline improvement
- Use a more ENV configurable Dockerfile
- Remove cloudbuild dependency
- Use compute optimized machine types
- Use SSD instead of normal hard drives
- Move Sentry endpoint to secrets
- Use a single yml for auto & manual deploy
- Migrate to Google Artifact Registry
* refactor (cd): overall pipeline improvement
- Use a more ENV configurable Dockerfile
- Remove cloudbuild dependency
- Use compute optimized machine types
- Use SSD instead of normal hard drives
- Move Sentry endpoint to secrets
- Use a single yml for auto & manual deploy
- Migrate to Google Artifact Registry
* refactor (cd): use newer google auth action
* fix (cd): use newer secret as gcp credential
* fix (docker): do not create extra directories
* fix (docker): ignore .github for caching purposes
* fix (docker): use latest rust
* fix (cd): bump build timeout
* fix: use a better name for manual deployment
* refactor (docker): use standard directories for executable
* fix (cd): most systems expect a "latest" tag
Caching from the latest image is one of the main reasons to add this extra tag. Before this commit, the inline cache was not being used.
* fix (cd): push the build image and the cache separately
The inline cache exporter only supports `min` cache mode. To enable `max` cache mode, push the image and the cache separately by using the registry cache exporter.
This also allows for smaller release images.
* fix (cd): remove unused GHA cache
We're leveraging the registry to cache the actions, instead of using the 10GB limits from Github Actions cache storage
* refactor (cd): use cargo-chef for caching rust deps
* fix: move build system deps before cargo cheg cook
* fix (release): use newer debian to reduce vulnerabilities
* fix (cd): use same zone, region and service accounts
* fix (cd): use same disk size and type for all deployments
* refactor (cd): activate interactive shells
Use interactive shells for manual and test deployments. This allow greater flexibility if troubleshooting is needed inside the machines
* refactor (test): use docker artifact from registry
Instead of using a VM to SSH into in to build and test. Build in GHA (to have the logs available), run the workspace tests in GHA, and just run the sync tests in GCP
Use a cintainer VM with zebra's image directly on it, and pass the needed parameters to run the Sync past mandatory checkpoint.
* tmp (cd): bump timeout for building from scratch
* tmp (test): bump build time
* fix (cd, test): bump build time-out to 210 minutes
* fix (docker): do not build with different settings
Compiling might be slow because different steps are compiling the same code 2-4 times because of the variations
* revert (docker): do not fix the rust version
* fix (docker): build on the root directory
* refactor(docker): Use base image commands and tools
* fix (cd): use correct variables & values, add build concurrency
* fix(cd): use Mainnet instead of mainnet
* imp: remove checkout as Buildkit uses the git context
* fix (docker): just Buildkit uses a .dockerignore in a path
* imp (cd): just use needed variables in the right place
* imp (cd): do not checkout if not needed
* test: run on push
* refactor(docker): reduce build changes
* fix(cd): not checking out was limiting some variables
* refactor(test): add an multistage exclusive for testing
* fix(cd): remove tests as a runtime dependency
* fix(cd): use default service account with cloud-platform scope
* fix(cd): revert checkout actions
* fix: use GA c2 instead of Preview c2d machine types
* fix(actions): remove workflow_dispatch from patched actions
This causes GitHub confusion as it can't determined which of the actions using workflow_dispatch is the right one
* fix(actions): remove patches from push actions
* test: validate changes on each push
* fix(test): wrong file syntax on test job
* fix(test): add missing env parameters
* fix(docker): Do not rebuild to download params and run tests
* fix(test): setup gcloud and loginto artifact just when needed
Try not to rebuild the tests
* fix(test): use GCP container to sync past mandatory checkpoint
* fix(test): missing separators
* test
* fix(test): mount the available disk
* push
* refactor(test): merge disk regeneration into test.yml
* fix(cd): minor typo fixes
* fix(docker): rebuild on .github changes
* fix(cd): keep compatibility with gcr.io
To prevent conflicts between registries, and migrate when the time is right, we'll keep pushing to both registries and use github actions cache to prevent conflicts between artifacts.
* fix(cd): typo and scope
* fix(cd): typos everywhere
* refactor(test): use smarter docker wait and keep old registry
* fix(cd): do not constraint the CPUs for bigger machines
* revert(cd): reduce PR diff as there's a separate one for tests
* fix(docker): add .github as it has no impact on caching
* fix(test): run command correctly
* fix(test): wiat and create image if previous step succeded
* force rebuild
* fix(test): do not restrict interdependant steps based on event
* force push
* feat(docker): add google OS Config agent
Use a separate step to have better flexibility in case a better approach is available
* fix(test): remove all hardoced values and increase disks
* fix(test): use correct commands on deploy
* fix(test): use args as required by google
* fix(docker): try not to invalidate zebrad download cache
* fix(test): minor typo
* refactor(test): decouple jobs for better modularity
This also allows faster tests as testing Zunstable won't be a dependency and it can't stop already started jobs if it fails.
* fix(test): Do not try to execute ss and commands in one line
* fix(test): do not show undeeded information in the terminal
* fix(test): sleep befor/after machine creation/deletion
* fix(docker): do not download zcash params twice
* feat(docker): add google OS Config agent
Use a separate step to have better flexibility in case a better approach is available
* merge: docker-actions-refactor into docker-test-refactor
* test docker wait scenarios
* fix(docker): $HOME variables is not being expanded
* fix(test): allow docker wait to work correctly
* fix(docker): do not use variables while using COPY
* fix(docker): allow to use zebrad as a command
* fix(cd): use test .yml from main
* fix(cd): Do not duplicate network values
The Dockerfile has an ARG with a default value of 'Mainnet', if this value is changed it will be done manually on a workflow_dispatch, making the ENV option a uneeded duplicate in this workflow
* fix(test): use bigger machine type for compute intensive tasks
* refactor(test): add tests in CI file
* fix(test): remove duplicated tests
* fix(test): typo
* test: build on .github changes temporarily
* fix(test): bigger machines have no effect on sync times
* feat: add an image to inherit from with zcash params
* fix(cd): use the right image name and allow push to test
* fix(cd): use the right docker target and remove extra builds
* refactor(docker): use cached zcash params from previous build
* fix(cd): finalize for merging
* imp(cd): add double safety measure for production
* fix(cd): use specific SHA for containers
* fix(cd): use latest gcloud action version
* fix(test): use the network as Mainnet and remove the uppercase from tests
* fix(test): run disk regeneration on specific file change
Just run this regeneration when changing the following files:
https://github.com/ZcashFoundation/zebra/blob/main/zebra-state/src/service/finalized_state/disk_format.rshttps://github.com/ZcashFoundation/zebra/blob/main/zebra-state/src/service/finalized_state.rshttps://github.com/ZcashFoundation/zebra/blob/main/zebra-state/src/constants.rs
* refactor(test): seggregate disks regeneration from tests
Allow to regenerate disks without running tests, and to run tests from previous disk regeneration.
Disk will be regenerated just if specific files were changed, or triggered manually.
Tests will run just if a disk regeneration was not manually triggered.
* fix(test): gcp disks require lower case conventions
* fix(test): validate logs being emmited by docker
GHA is transforming is somehow transforwing the variable to lowercase also, so we're changint it to adapt to it
* test
* fix(test): force tty terminal
* fix(test): use a one line command to test terminal output
* fix(test): always delete test instance
* fix(test): use short SHA from the PR head
Using the SHA from the base, creates confusion and it's not accurate with the SHA being shown and used on GitHub.
We have to keep both as manual runs with `workflow_dispatch` does not have a PR SHA
* fix(ci): do not trigger CI on docker changes
There's no impact in this workflow when a change is done in the dockerfile
* Instead of runing cargo test when the instance gets created, run this commands afterwards in a different step.
As GHA TTY is not working as expected, and workarounds does not play nicely with `gcloud compute ssh` actions/runner#241 (comment) we decided to get the container name from the logs, log directly to the container and run the cargo command from there.
* doc(test): document reasoning for new steps
* fix(test): increase machine type and ssh timeout
* fix(test): run tests on creation and follow container logs
This allows to follow logs in Github Actions terminal, while the GCP container is still running.
Just delete the instance when following the logs ends successfully or fails
* finalize(test): do not rebuild image when changing actions
* fix(test): run tests on creation and follow container logs
This allows to follow logs in Github Actions terminal, while the GCP container is still running.
Just delete the instance when following the logs ends successfully or fails
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
The keyword is `paths` and the actions were using `path`
That's the reason why most actions have been running, and there's been no impact in time savings
* fix(zcash-params): Do not update parameters image on PR
We should not update a direct dependency of our Docker image to be writeable by a PR from anywhere, a local branch or a fork branch, before that change has been approved by a human and merged to #main
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
* refactor (cd): overall pipeline improvement
- Use a more ENV configurable Dockerfile
- Remove cloudbuild dependency
- Use compute optimized machine types
- Use SSD instead of normal hard drives
- Move Sentry endpoint to secrets
- Use a single yml for auto & manual deploy
- Migrate to Google Artifact Registry
* refactor (cd): overall pipeline improvement
- Use a more ENV configurable Dockerfile
- Remove cloudbuild dependency
- Use compute optimized machine types
- Use SSD instead of normal hard drives
- Move Sentry endpoint to secrets
- Use a single yml for auto & manual deploy
- Migrate to Google Artifact Registry
* refactor (cd): use newer google auth action
* fix (cd): use newer secret as gcp credential
* fix (docker): do not create extra directories
* fix (docker): ignore .github for caching purposes
* fix (docker): use latest rust
* fix: use a better name for manual deployment
* refactor (docker): use standard directories for executable
* fix (cd): most systems expect a "latest" tag
Caching from the latest image is one of the main reasons to add this extra tag. Before this commit, the inline cache was not being used.
* fix (cd): push the build image and the cache separately
The inline cache exporter only supports `min` cache mode. To enable `max` cache mode, push the image and the cache separately by using the registry cache exporter.
This also allows for smaller release images.
* fix (cd): remove unused GHA cache
We're leveraging the registry to cache the actions, instead of using the 10GB limits from Github Actions cache storage
* refactor (cd): use cargo-chef for caching rust deps
* fix (release): use newer debian to reduce vulnerabilities
* fix (cd): use same zone, region and service accounts
* fix (cd): use same disk size and type for all deployments
* refactor (cd): activate interactive shells
Use interactive shells for manual and test deployments. This allow greater flexibility if troubleshooting is needed inside the machines
* fix (docker): do not build with different settings
Compiling might be slow because different steps are compiling the same code 2-4 times because of the variations
* fix(cd): use Mainnet instead of mainnet
* fix(docker): remove tests as a runtime dependency
* fix(cd): use default service account with cloud-platform scope
* fix(cd): keep compatibility with gcr.io
To prevent conflicts between registries, and migrate when the time is right, we'll keep pushing to both registries and use github actions cache to prevent conflicts between artifacts.
* fix(docker): do not download zcash params twice
* feat(docker): add google OS Config agent
Use a separate step to have better flexibility in case a better approach is available
* fix(docker): allow to use zebrad as a command
* feat: add an image to inherit from with zcash params
* refactor(docker): use cached zcash params from previous build
* imp(cd): add double safety measure for production
* style: use global variables and don't double print
Remove repeated instances of global environment variables. Do not print ENV variables on the terminal as GitHub Actions already shows it.
* fix (actions): Use fixed major versions for actions
As actions get recurrent fixes, using a specific version causes more maintance on the pipelines.
On the other hand, using @master versions could make some action unreliable, as breaking changes might be included without further notice, and even change behavior on a daily basis.
* refactor: make better use of ENV variables
A whole step with refex was being used to extract different variables from GitHub's environment. This gets depecrated in favor of using `rlespinasse/github-slug-action@v4` which has slug URL variables.
A SLUG on a variable will:
- put the variable content in lower case
- replace any character by - except 0-9, a-z, ., and _
- remove leading and trailing - character
- limit the string size to 63 characters
This changes also takes care of using the Head or Base branch for deployments. This will allow us tomerge of workflows, as most steps on this deployment actions are very similar, with little variations between workflows.
* fix (actions): use secrets for sensitive information
* revert: use specific versions for dependabot
Reverting commit 8c93409902
* Segregate linting jobs from CI workflow
Lint on push to all branches, except for main, as this action will be required to merge.
Just run the lint action when a Rust file is changed, as it won't make sense to run it on other scenarios.
DRY with uneeded jobs
* Make actions dependable on changed files or folders
* Fix & add missing paths
* Revert changes removing cargo.lock and deny.toml checks
Also refactor this to use a more redable and change prone cargo-deny-action. And move this actions out of the clippy-deps job, as this are more related to CI than linting.
* Fix wrong indentation
* Add new configuration file from #3386
* Do not fail on licenses as this configuration is missing
* Do not add advisories features
Add advisories checks in a different PR
* Allow tests and coverage on PR series
If we only run CI on branches that are going to merge to main, then PR series become a lot harder to test. (Because each PR is based on the previous PR, not main.)
This causes the Mergify bot to commit to each PR, being also included in the squashed merge as an author.
As the queue merges the head branch (main) to latest tip before testing with the CI, having all those feature branches constantly updating with Mergify is not needed
* fix: Use the correct conditions and merge method
Mergifys Status Checks conditions are based on the job name, not the worflow name. As our worflows have dynamic names, each variant must be considered.
Squash merges are the default being used in the Zebra repo, so mergify must comply with this configuration.
Use condition operators for labels in each pull request rule; previously it was expecting both labels to be set. And update names accordingly.
* fix: Allow mergify to merge dependabot PRs
Also adapt dependabot's configuration to use the recently adapted labels
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
* Add mergify merging queues
* Fix Mergify invalid configuration
* Improve adaptability to the actual workflow
Do not merge if the pull-request test is not green. Do not move draft PRs to the queue. And update keep all open PRs updated.
* Fix a typo on check-success condition