* refactor(ci): use docker in docker
This is a workaround for an issue related to disk partitioning, caused by a GCP service called Konlet, while mounting the cached disks to the VM and then to the container
* fix(build): persist docker login credentials
* fix(ci): get sync height from docker logs instead of gcp
* try: use gha cache for faster building
* fix(ci): mount disk in container to make it available in vm
* fix(build): do not invalidate cache between images
* try(docker): invalidate cache as little as possible
* fix(ci): GHA terminal is not a TTY
* fix(build): do not ignore entrypoint.sh
* fix
* fix(ci): mount using root privileges
* fix(ci): use existing disk as cached state
* fix(ci): wait for disks to get mounted
* force rebuild
* fix failed force
* fix(ci): some tests do not use a cached state
* fix(ci): do not name boot and attached disk the same
* fix(ci): attach a disk to full sync, to snapshot the state
* fix(ci): use appropriate grep text depending on the test
* reduce diff
* fix(ci): use correct GCP disk source attribute
* imp(ci): reduce diff
* fix(ci): revert wrong deletion
* fix: revert unneeded changes
* fix: reduce main diff
* fix
* fix(ci): reduce diff
* fix(ci): garbage collect instances no matter the status
As we're not going to reuse test instances, the safest approach is to always delete these instances, whether the workflow run fails, gets skipped, or succeeds
* Apply suggestions from code review
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
* docs(ci): improve comment
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
* refactor(ci): test building in a separate workflow
* force a change
* force a change
* fix(ci): send the correct variables to the reusable build
* fix(ci): variables are not allowed
* fix(ci): conditions are not allowed as input
* fix(ci): use expected value
* refactor(build): simplify the use of other dockerfiles
* fix(cd): depend on docker build yml
* fix(cd): use main branch as image name
* imp(actions): remove unneeded variable repetition
* imp(build): remove unused variables
* imp(actions): rename the image building workflow
Not all images are for zebra execution as we also have one for zcash-params
* fix(ci): add dependable workflow in paths filters
* docs(ci): remove TODO as this won't be needed unless an issue arises
* docs(ci): CARGO_INCREMENTAL can decrease build time when running from a cache
* fix: revert forced changes
* fix(build): remove unused build inputs in zcash-params
* imp(cd): as this is the production image, use the executable name
* imp(ci): reduce log level to improve speed
Co-authored-by: teor <teor@riseup.net>
* imp(ci): use the correct name for the workflow
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: teor <teor@riseup.net>
* fix(ci): do not delete instances from `main` branch on merge
* fix(ci): do not delete instances on merge
This was causing unintended behavior, and so far instances are being cleaned up in their corresponding workflows.
* feat(ci): run cached state rebuilds in main branch
* fix(ci): allow the PR/branch name in the disk name
Move the height information to the disk description, to reduce the name length
* fix(ci): add missing SHA
* fix(ci): regenerate checkpoint cached state on main
This will automatically regenerate the disk when a merge is completed on main
* tmp(ci): do not duplicate sync test at/after merge
This temporarily ensures the test only runs on the main branch, so we can track it more easily
* fix(ci): correctly use lowered network caps
In the Test workflow we were using a different approach than the one being used in the Full sync test.
Also, in the Full sync test the variable was LOWER_NET_NAME, but NETWORK was being used in the disk name, with caps.
* imp(ci): get state version from local constants.rs
* imp(ci): use the same get name approach
* fix(ci): use the correct name for state version variable
* imp(ci)!: use different disk names for cached states
Disk states synced to Canopy and disk states synced to the chain tip should have different names, so that current and upcoming tests can reference the disk they need.
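A minimal sketch of how such disk names could be composed, assuming a naming scheme made up for illustration. The function name `cached_disk_name`, its parameters, and the `zebrad-cache` prefix are all hypothetical; the only external fact relied on is that GCP resource names must be lowercase, at most 63 characters, start with a letter, and contain only letters, digits, and hyphens.

```python
import re

# GCP resource names: lowercase letter first, then letters, digits or hyphens,
# no trailing hyphen, at most 63 characters in total.
GCP_NAME = re.compile(r"[a-z]([-a-z0-9]{0,61}[a-z0-9])?")


def cached_disk_name(prefix: str, branch: str, short_sha: str,
                     state_version: int, network: str, kind: str) -> str:
    """Build a disk name; `kind` separates e.g. 'checkpoint' from 'tip' states.

    Hypothetical scheme for illustration, not the workflow's actual one.
    """
    raw = f"{prefix}-{branch}-{short_sha}-v{state_version}-{network}-{kind}"
    # Lowercase everything and replace characters GCP does not accept
    # (for example the dots and slashes found in branch names).
    name = re.sub(r"[^a-z0-9-]", "-", raw.lower())
    # Simplification: a very long branch name would truncate the suffix.
    name = name[:63].strip("-")
    if not GCP_NAME.fullmatch(name):
        raise ValueError(f"not a valid GCP disk name: {name!r}")
    return name


checkpoint = cached_disk_name("zebrad-cache", "fix/Disk.names", "abc1234",
                              25, "Mainnet", "checkpoint")
tip = cached_disk_name("zebrad-cache", "fix/Disk.names", "abc1234",
                       25, "Mainnet", "tip")
print(checkpoint)
print(tip)
```

Keeping the state kind as the last component means two disks built from the same commit never collide, as long as the name is not truncated.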
* imp(ci): test-stateful-sync no longer depends on regenerate-stateful-disks
* Apply suggestions from code review
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
* fix(ci): use a better name for network string conversion
* Revert "Apply suggestions from code review"
This reverts commit cbbfaf4e9c.
* fix: do not get log information if sync was skipped
* fix(ci): do not lower the variable name
* fix(ci): use the same lowering case for network everywhere
* test: more .dockerignore conditions
* fix: use the right approach to lower caps
* remove extra .dockerignore
* trigger a change for stateful disk regeneration
* imp(ci): use `checkpoint` as the disk reference
* revert wrong delete
* fix(ci): add INSTANCE_ID and correct logging message
* imp(ci): add `v` prefix to state version number
* fix(ci): remove typo from logging message to get the height
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
* style(ci): comply with https://json.schemastore.org/github-workflow.json
Some substitutions were harder to make as files were not standardized
* fix(mergify): use correct name for macos
* style(actions): revert to single quotes
* style: lint dependabot and mergify conf files
* style: remove conditions with missing context
* imp(lint): automate GH Actions linting
* fix(lint): some actions need to be triggered by PR event
* fix(lint): consider all workflow YAMLs
* Use the same paths in the patch file
* revert: keep condition as is
* add TODO
* fix: add missing checkpoint_sync input
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
* Change OutputLocation to contain a TransactionLocation
* Change OutputLocation reads from the database
* Update some doc comments
* Update some TODOs
* Change deleting spent UTXOs and updating spent balances
* Change adding new UTXOs and adding their values to balances
* Disable dead code warnings
* Update snapshot test code
* Update round-trip tests for OutputLocations
* Update snapshot test data
* Increment the database format version
* Remove a redundant try_into()
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Refactor redundant code
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* ci: attempt at fixing 'Regenerate stateful disks'
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* fix(ci): validate exit code
* fix(ci): use a single step
* fix: run from gcp command
* fix: remove variable substitution
* fix: escaping characters
* fix: handle bash issues with variable expansion
* fix: segregate execution into multiple steps
* fix: the command is a string
* fix: move docker ignore to use caching
* revert panic
* fix: add panic and fix exit code command
* revert: remove panic
* fix: apply the exit code to all gcp tests
* imp(ci): use single line exit
* Clean up the CODEOWNERS file
We moved the dockerignore file into the docker directory,
so it doesn't need a separate entry any more.
Co-authored-by: teor <teor@riseup.net>
* Create disk image after a successful full sync test
* Extract full sync height and name zebrad cached state with it
* Read 500 lines to extract sync height
* Restrict log query to just the container output and fix regex syntax for ubuntu
* Explicitly search logs in descending time
Co-authored-by: Gustavo Valverde <gustavo@iterativo.do>
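The height extraction described in the commits above can be sketched as follows. This is only an illustration: the log line format and the `current_height=Height(N)` pattern are assumptions, and `latest_sync_height` is a hypothetical helper, not the command used in the workflow.

```python
import re
from itertools import islice
from typing import Iterable, Optional

# Assumed log format, for illustration only.
HEIGHT_PATTERN = re.compile(r"current_height=Height\((\d+)\)")


def latest_sync_height(lines_newest_first: Iterable[str],
                       max_lines: int = 500) -> Optional[int]:
    """Return the first height found in the newest `max_lines` log lines."""
    for line in islice(lines_newest_first, max_lines):
        match = HEIGHT_PATTERN.search(line)
        if match:
            return int(match.group(1))
    return None


# Logs are expected in descending time order (newest entry first).
logs = [
    "zebrad exited",
    "sync progress current_height=Height(1600000)",
    "sync progress current_height=Height(1599000)",
]
print(latest_sync_height(logs))
```

Reading the logs newest-first means the first match is the most recent height, so the search can stop early.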
Only run the state rebuild job if the database format version has (likely) changed.
If we have accidentally changed the format, but not changed the version,
we want to run with the old cached state, so this job fails.
If we change the state path without changing the version,
this job will take a few hours, because it will do a full rebuild.
When a PR is created and an image is built in a branch, the cache is also pushed to a `buildcache` tag.
As all PRs are using the same tag, sometimes the cache for a specific branch gets invalidated and makes it take longer on further pushes.
This fix might make the first commit take longer, but further ones will be faster if no changes in the code are applied.
* Update test.patch.yml with lightwalletd job
* Remove a workflow condition that will always be false
In general, patch workflows need the
opposite conditions to the original workflow.
But in this case, we know the result of the
condition will always be true, so we can just delete it.
Co-authored-by: teor <teor@riseup.net>
* fix(actions): use a specific shortening length for SHAs
The rlespinasse/github-slug-action now works without checking out the code, reducing run time and improving security for the following actions.
This requires specifying the GITHUB_SHA_SHORT variable length, as git uses 8 by default, but docker uses 7 by default.
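A small sketch of why the length has to be pinned, using the 8 and 7 character lengths mentioned in this commit. The SHA value and the `image:` tag prefix are hypothetical.

```python
def short_sha(full_sha: str, length: int) -> str:
    """Return the first `length` hex characters of a full commit SHA."""
    if not 1 <= length <= len(full_sha):
        raise ValueError("length must be between 1 and the SHA length")
    return full_sha[:length]


# Hypothetical 40-character SHA, used only for illustration.
sha = "0123456789abcdef0123456789abcdef01234567"

# Two tools shortening the same commit to different lengths produce
# different tags, so a lookup by tag would miss the image.
tag_from_8 = f"image:{short_sha(sha, 8)}"
tag_from_7 = f"image:{short_sha(sha, 7)}"
print(tag_from_8, tag_from_7)
```

Using one explicit length everywhere keeps the tag written by the build step identical to the tag read by later steps.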
* fix(actions): target correct rlespinasse/github-slug-action version
* fix(actions): just use major version
* fix(actions): github-slug-action is not being correctly referenced
* refactor(ci): use improved OIDC authentication
* fix(ci): standardize OIDC on all required jobs
* fix: wrong indentation
* fix(ci): remove non-existent dependency in clean job
GitHub's GHA cache gets invalidated at 10 GB, which is very easy to hit when we're building multiple times a day with several commits.
Instead use the registry, which won't get invalidated until a change is identified in the build process.
* refactor(test): reuse same GCP instance on a single PR
This also ensures the deployments are faster, and we only delete the instance when merging or closing the PR, instead of doing it on each push to the PR
* fix(deploy): add zone to updates
* fix: typo
* fix(ci): improve conditions for updates
* fix(deploy): delete old deployments instead of reusing them
* fix(deploy): keep delete command after run
* fix(deploy): always create an instance
* fix(deploy): delete disks on every delete command.
* imp(ci): use better id name
* Update .github/workflows/test.yml
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
* imp: handle errors correctly on deletion
* fix: do not hide valid errors
* fix: edge case where the container is not ready yet
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
* doc(db): fix some comments
* refactor(db): split disk serialization types into their own module
* refactor(db): split the disk format into modules
* doc(db/test): explain the RON serialization format
* fix(ci): only run the full sync test on mergify queue PRs
* fix inconsistent indenting and syntax
* Update .github/workflows/test-full-sync.yml
Co-authored-by: Gustavo Valverde <gustavo@iterativo.do>
* fix: check TEST_FAKE_ACTIVATION_HEIGHTS at runtime
* fix(tests): add TEST_FAKE_ACTIVATION_HEIGHTS variable
This variable ensures the test is activated in the `test-fake-activation-heights` step
* fix(docker): do not run specific tests by default in entrypoint.sh
* fix(test): remove extra TEST_FULL_SYNC argument
* imp(timeout): wait for an average build time
Co-authored-by: Gustavo Valverde <gustavo@iterativo.do>
* fix: add missing job key
* fix(arm64): bump timeout to build without cache
* fix(deployment): apply changes made in 5004c4d3a1
* fix: remove unneeded condition
* refactor(tests): make vm names refer to the test name
This also adds a build step for full sync, as there won't be a reference image when using workflow_dispatch
* fix(deployment): testing depends on the built image
* refactor(test): decouple full sync from other tests
As the full sync only needs to run once and in isolation, we're running this test in a separate workflow, after a PR has been approved.
* fix: revert to previous conditions in job regenerate-stateful-disks
* fix(condition): get disk sha if regeneration is not executed
* fix: typo
* Update .github/workflows/test-full-sync.yml
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
* fix(build): bump build time for arm64
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
* refactor(build): use OCI Image Format Specification for labels
This should also fix when an image gets built multiple times using the cache, as each image differs in labels
* refactor(tags): use PR context sha and ref
Remove the need for the PR head SHA and ref, as those can cause conflicts depending on how the branch name has been established
* fix(ci): remove an unused trigger path
* doc(ci): explain lightwalletd trigger paths
* fix(test): check for adityapk00/lightwalletd behaviour in test harness
* fix(ci): work around buildx command error
* fix(ci): revert the workaround
* add(actions): lightwalletd continuous integration
* refactor(actions): build lightwalletd and reuse it in zebra
- Download lightwalletd source code
- Create a new Dockerfile for lightwalletd
- Use lightwalletd binary in Zebra's image
- Create a specific step to build/update lightwalletd
- Add lightwalletd integration test to the test suite
- Remove lightwalletd.yml, as it was harder to control
* refactor(docker): organize Dockerfiles and remove unused
Fixes: #3344
* feat(build): add arm64 support
* fix(build): do not install google-compute-engine in arm64
This package is not available for this platform
* fix(build): do not build arm64 for tests
* fix(condition): indent for better visibility
* fix(condition): wrong use of operators
* ci(test): re-run tests when snapshot data changes
* fix(ci): rebuild state when disk format changes
* fix(ci): rebuild rust docs when code or dependencies change
* doc(ci): explain why we run jobs when files change
Co-authored-by: Gustavo Valverde <gustavo@iterativo.do>
* add(tests): full sync test
* fix(test): add build
* fix(deploy): escape double dashes '--' correctly
* fix(test): remove unexpected --nocapture arg
error: Found argument '--nocapture' which wasn't expected, or isn't valid in this context
* refactor(docker): use default executable as entrypoint
* refactor(startup): add a custom entrypoint
* fix(test): add missing TEST_FULL_SYNC variable
* test(timeout): use the biggest machine
* fix
* fix(deploy): use latest successful image
* typo
* refactor(docker): generate config file at startup
* revert(build): changes were made to docker
* fix(docker): send variables correctly to the entrypoint
* test different conf file approach
* fix(env): add RUN_TEST env variable
* ref: use previous approach
* fix(color): use environment variable
* fix(resources): use our normal machine size
* fix(ci): double CPU and RAM for full sync test
* fix(test): check for zebrad test output in the correct order
The mempool is only activated once, so we must check for that log first.
After mempool activation, the stop regex is logged at least once.
(It might be logged before as well, but we can't rely on that.)
When checking that the mempool didn't activate,
wait for the `zebrad` command to exit,
then check the entire log.
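The ordering requirement in this commit can be sketched as a helper that consumes the log once. The function name and the log strings are hypothetical; only the ordering logic reflects the description above.

```python
import re
from typing import Iterable


def matches_in_order(lines: Iterable[str], first: str, second: str) -> bool:
    """True if a line matching `first` is later followed by one matching `second`."""
    remaining = iter(lines)
    # Consume lines up to and including the first match of `first`.
    if not any(re.search(first, line) for line in remaining):
        return False
    # Only lines after that match are considered for `second`.
    return any(re.search(second, line) for line in remaining)


# Hypothetical log output: the stop marker may appear before activation,
# but only an occurrence after activation counts.
log = ["sync started", "stop marker", "mempool activated", "stop marker"]
print(matches_in_order(log, r"mempool activated", r"stop marker"))

# For the "mempool did not activate" case, check the whole log after exit.
finished_log = ["sync started", "stop marker", "zebrad exited"]
print(not any(re.search(r"mempool activated", line) for line in finished_log))
```

Sharing one iterator between the two searches is what enforces the order: the second search can never see lines that came before the first match.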
* fix(ci): run full sync test with full compiler optimisations
* fix(tests): reintroduce tests and run full sync on approval
* fix(tests): reduce the changelog
Co-authored-by: teor <teor@riseup.net>
* add(actions): lightwalletd continuous integration
* refactor(actions): build lightwalletd and reuse it in zebra
- Download lightwalletd source code
- Create a new Dockerfile for lightwalletd
- Use lightwalletd binary in Zebra's image
- Create a specific step to build/update lightwalletd
- Add lightwalletd integration test to the test suite
- Remove lightwalletd.yml, as it was harder to control
* fix(build): remove extra port being exposed
* fix(lightwalletd): test should be after `--` in cargo test
* revert(lint): do not lint external code as it can be confusing
* fix(test): lightwalletd_integration test is not ignored
* docs(docker): clarify the addition of unused args
* refactor(docker): organize Dockerfiles and remove unused
Fixes: #3344
* fix(actions): activate workflows on correct path changes
* test
* revert previous commit
* feat(build): add arm64 support with cross-compilation (#3659)
* feat(build): add arm64 support
* fix(build): do not install google-compute-engine in arm64
This package is not available for this platform
* fix(build): do not build arm64 for tests
* fix(changes): reduce changelog
* Revert "feat(build): add arm64 support with cross-compilation (#3659)"
This reverts commit 291e00c405.
* refactor(state): move disk_db reads to a new zebra_db module
* refactor(state): make finalized value pool method names consistent
* refactor(state): split database writes into the zebra_db module
* refactor(state): move the block batch method to DiskWriteBatch
* refactor(state): actually add the zebra_db module
Unfortunately, I've lost the interim changes to this file,
so this commit might be the only one that compiles.
* refactor(state): add a newly created file to the cached state CI job
* Run Coverage collection on main
Resolves #3533
* fix(coverage): just run coverage on specific file changes to main
Co-authored-by: Gustavo Valverde <gustavo@iterativo.do>
* fix(dependencies): update an unused duplicate dependency exception
This duplicate was removed by PR #3572, but other duplicates still exist.
* feat(ci): check for duplicate dependencies with optional features off
* fix(mergify, actions): use better names and require tests
* feat(queue): do not update the actual PR, create a draft
Do not allow updating/rebasing the original pull request to check its mergeability. Create a draft pull request instead.
This doesn't add Mergify as a co-author
* feat(queue): do not interrupt already running queues
Our queues might take more than 5 hours even if the priority is low.
Do not allow interrupting the ongoing speculative checks when a pull request with higher priority enters the queue.
* fix(mergify): move 'allow' attributes to queue_rules
* fix(mergify): attributes are not conditions
* refactor(state): split the disk_format module
* refactor(ci): add the new disk_db file to the state CI list
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
* fix(ci): clarify ignored test name
`--include-ignored` runs all tests, including tests
that would normally be ignored.
`-Zunstable-options` enables all unstable options,
but it doesn't do anything by itself.
There is a lot of overlap with "test-all" in this job,
which we might want to fix in a future PR.
* fix(ci): remove unused -Zunstable-options
`--include-ignored` is now stable, so `unstable-options` is not needed.
* fix(test): delete a redundant test
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
* fix(test): use the short SHA from actual run if valid
* fix(test): if condition must evaluate to a single false
* fix(test): do not run logs and upload if not needed
* imp(test): allow test stateful sync after disk regeneration
This is fast enough, so it shouldn't do any harm if run just after a ~3 hour test
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Dependabot creates branches with versions using a dot notation, and some tests fail because of this
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
* refactor (cd): overall pipeline improvement
- Use a more ENV configurable Dockerfile
- Remove cloudbuild dependency
- Use compute optimized machine types
- Use SSD instead of normal hard drives
- Move Sentry endpoint to secrets
- Use a single yml for auto & manual deploy
- Migrate to Google Artifact Registry
* refactor (cd): use newer google auth action
* fix (cd): use newer secret as gcp credential
* fix (docker): do not create extra directories
* fix (docker): ignore .github for caching purposes
* fix (docker): use latest rust
* fix (cd): bump build timeout
* fix: use a better name for manual deployment
* refactor (docker): use standard directories for executable
* fix (cd): most systems expect a "latest" tag
Caching from the latest image is one of the main reasons to add this extra tag. Before this commit, the inline cache was not being used.
* fix (cd): push the build image and the cache separately
The inline cache exporter only supports `min` cache mode. To enable `max` cache mode, push the image and the cache separately by using the registry cache exporter.
This also allows for smaller release images.
* fix (cd): remove unused GHA cache
We're leveraging the registry to cache the actions, instead of relying on the 10 GB limit of GitHub Actions cache storage
* refactor (cd): use cargo-chef for caching rust deps
* fix: move build system deps before cargo chef cook
* fix (release): use newer debian to reduce vulnerabilities
* fix (cd): use same zone, region and service accounts
* fix (cd): use same disk size and type for all deployments
* refactor (cd): activate interactive shells
Use interactive shells for manual and test deployments. This allows greater flexibility if troubleshooting is needed inside the machines
* refactor (test): use docker artifact from registry
Instead of SSHing into a VM to build and test, build in GHA (to have the logs available), run the workspace tests in GHA, and only run the sync tests in GCP.
Use a container VM with zebra's image directly on it, and pass the needed parameters to run the sync past mandatory checkpoint.
* tmp (cd): bump timeout for building from scratch
* tmp (test): bump build time
* fix (cd, test): bump build time-out to 210 minutes
* fix (docker): do not build with different settings
Compiling might be slow because different steps are compiling the same code 2-4 times because of the variations
* revert (docker): do not fix the rust version
* fix (docker): build on the root directory
* refactor(docker): Use base image commands and tools
* fix (cd): use correct variables & values, add build concurrency
* fix(cd): use Mainnet instead of mainnet
* imp: remove checkout as Buildkit uses the git context
* fix (docker): just Buildkit uses a .dockerignore in a path
* imp (cd): just use needed variables in the right place
* imp (cd): do not checkout if not needed
* test: run on push
* refactor(docker): reduce build changes
* fix(cd): not checking out was limiting some variables
* refactor(test): add a multistage target exclusively for testing
* fix(cd): remove tests as a runtime dependency
* fix(cd): use default service account with cloud-platform scope
* fix(cd): revert checkout actions
* fix: use GA c2 instead of Preview c2d machine types
* fix(actions): remove workflow_dispatch from patched actions
This confuses GitHub, as it can't determine which of the actions using workflow_dispatch is the right one
* fix(actions): remove patches from push actions
* test: validate changes on each push
* fix(test): wrong file syntax on test job
* fix(test): add missing env parameters
* fix(docker): Do not rebuild to download params and run tests
* fix(test): set up gcloud and log in to the artifact registry only when needed
Try not to rebuild the tests
* fix(test): use GCP container to sync past mandatory checkpoint
* fix(test): missing separators
* test
* fix(test): mount the available disk
* push
* refactor(test): merge disk regeneration into test.yml
* fix(cd): minor typo fixes
* fix(docker): rebuild on .github changes
* fix(cd): keep compatibility with gcr.io
To prevent conflicts between registries, and migrate when the time is right, we'll keep pushing to both registries and use github actions cache to prevent conflicts between artifacts.
* fix(cd): typo and scope
* fix(cd): typos everywhere
* refactor(test): use smarter docker wait and keep old registry
* fix(cd): do not constrain the CPUs for bigger machines
* revert(cd): reduce PR diff as there's a separate one for tests
* fix(docker): add .github as it has no impact on caching
* fix(test): run command correctly
* fix(test): wait and create image if previous step succeeded
* force rebuild
* fix(test): do not restrict interdependent steps based on event
* force push
* feat(docker): add google OS Config agent
Use a separate step to have better flexibility in case a better approach is available
* fix(test): remove all hardcoded values and increase disks
* fix(test): use correct commands on deploy
* fix(test): use args as required by google
* fix(docker): try not to invalidate zebrad download cache
* fix(test): minor typo
* refactor(test): decouple jobs for better modularity
This also allows faster tests as testing Zunstable won't be a dependency and it can't stop already started jobs if it fails.
* fix(test): Do not try to execute ss and commands in one line
* fix(test): do not show unneeded information in the terminal
* fix(test): sleep before/after machine creation/deletion
* fix(docker): do not download zcash params twice
* feat(docker): add google OS Config agent
Use a separate step to have better flexibility in case a better approach is available
* merge: docker-actions-refactor into docker-test-refactor
* test docker wait scenarios
* fix(docker): $HOME variable is not being expanded
* fix(test): allow docker wait to work correctly
* fix(docker): do not use variables while using COPY
* fix(docker): allow to use zebrad as a command
* fix(cd): use test .yml from main
* fix(cd): Do not duplicate network values
The Dockerfile has an ARG with a default value of 'Mainnet'. If this value is changed, it will be done manually on a workflow_dispatch, making the ENV option an unneeded duplicate in this workflow
* fix(test): use bigger machine type for compute intensive tasks
* refactor(test): add tests in CI file
* fix(test): remove duplicated tests
* fix(test): typo
* test: build on .github changes temporarily
* fix(test): bigger machines have no effect on sync times
* feat: add an image to inherit from with zcash params
* fix(cd): use the right image name and allow push to test
* fix(cd): use the right docker target and remove extra builds
* refactor(docker): use cached zcash params from previous build
* fix(cd): finalize for merging
* imp(cd): add double safety measure for production
* fix(cd): use specific SHA for containers
* fix(cd): use latest gcloud action version
* fix(test): use the network as Mainnet and remove the uppercase from tests
* fix(test): run disk regeneration on specific file change
Just run this regeneration when changing the following files:
https://github.com/ZcashFoundation/zebra/blob/main/zebra-state/src/service/finalized_state/disk_format.rs
https://github.com/ZcashFoundation/zebra/blob/main/zebra-state/src/service/finalized_state.rs
https://github.com/ZcashFoundation/zebra/blob/main/zebra-state/src/constants.rs
* refactor(test): segregate disks regeneration from tests
Allow regenerating disks without running tests, and running tests from a previous disk regeneration.
Disks will only be regenerated if specific files were changed, or if regeneration is triggered manually.
Tests will only run if a disk regeneration was not manually triggered.
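The trigger rules in this commit can be sketched as two small predicates. The function names are hypothetical, and the file list assumes the three state files linked in the preceding commit are the ones being watched.

```python
from typing import Iterable

# Assumed watch list, taken from the files linked in the previous commit.
STATE_FILES = {
    "zebra-state/src/service/finalized_state/disk_format.rs",
    "zebra-state/src/service/finalized_state.rs",
    "zebra-state/src/constants.rs",
}


def should_regenerate_disks(changed_files: Iterable[str],
                            manually_triggered: bool) -> bool:
    """Regenerate when a watched file changed, or on a manual trigger."""
    return manually_triggered or any(path in STATE_FILES for path in changed_files)


def should_run_tests(manually_triggered: bool) -> bool:
    """Tests are skipped only when regeneration was triggered manually."""
    return not manually_triggered


print(should_regenerate_disks(["zebra-state/src/constants.rs"], False))
print(should_regenerate_disks(["README.md"], False))
print(should_run_tests(manually_triggered=True))
```

Keeping the two decisions separate mirrors the goal of the refactor: either job can run without forcing the other.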
* fix(test): gcp disks require lower case conventions
* fix(test): validate logs being emitted by docker
GHA is somehow also transforming the variable to lowercase, so we're changing it to adapt to that
* test
* fix(test): force tty terminal
* fix(test): use a one line command to test terminal output
* fix(test): always delete test instance
* fix(test): use short SHA from the PR head
Using the SHA from the base creates confusion, and it doesn't match the SHA being shown and used on GitHub.
We have to keep both, as manual runs with `workflow_dispatch` do not have a PR SHA
* fix(ci): do not trigger CI on docker changes
There's no impact in this workflow when a change is done in the dockerfile
* Instead of running cargo test when the instance gets created, run these commands afterwards in a different step.
As the GHA TTY is not working as expected, and workarounds do not play nicely with `gcloud compute ssh` (see actions/runner#241), we decided to get the container name from the logs, log in directly to the container, and run the cargo command from there.
* doc(test): document reasoning for new steps
* fix(test): increase machine type and ssh timeout
* fix(test): run tests on creation and follow container logs
This allows following the logs in the GitHub Actions terminal while the GCP container is still running.
Only delete the instance once following the logs ends, whether it succeeds or fails
* finalize(test): do not rebuild image when changing actions
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
The keyword is `paths` and the actions were using `path`
That's the reason why most actions have been running, and there's been no impact in time savings
* fix(zcash-params): Do not update parameters image on PR
We should not update a direct dependency of our Docker image to be writeable by a PR from anywhere, a local branch or a fork branch, before that change has been approved by a human and merged to #main
Co-authored-by: Deirdre Connolly <durumcrustulum@gmail.com>
* refactor (cd): overall pipeline improvement
- Use a more ENV configurable Dockerfile
- Remove cloudbuild dependency
- Use compute optimized machine types
- Use SSD instead of normal hard drives
- Move Sentry endpoint to secrets
- Use a single yml for auto & manual deploy
- Migrate to Google Artifact Registry
* refactor (cd): use newer google auth action
* fix (cd): use newer secret as gcp credential
* fix (docker): do not create extra directories
* fix (docker): ignore .github for caching purposes
* fix (docker): use latest rust
* fix: use a better name for manual deployment
* refactor (docker): use standard directories for executable
* fix (cd): most systems expect a "latest" tag
Caching from the latest image is one of the main reasons to add this extra tag. Before this commit, the inline cache was not being used.
* fix (cd): push the build image and the cache separately
The inline cache exporter only supports `min` cache mode. To enable `max` cache mode, push the image and the cache separately by using the registry cache exporter.
This also allows for smaller release images.
* fix (cd): remove unused GHA cache
We're leveraging the registry to cache the actions, instead of relying on the 10 GB limit of GitHub Actions cache storage
* refactor (cd): use cargo-chef for caching rust deps
* fix (release): use newer debian to reduce vulnerabilities
* fix (cd): use same zone, region and service accounts
* fix (cd): use same disk size and type for all deployments
* refactor (cd): activate interactive shells
Use interactive shells for manual and test deployments. This allows greater flexibility if troubleshooting is needed inside the machines
* fix (docker): do not build with different settings
Compiling might be slow because different steps are compiling the same code 2-4 times because of the variations
* fix(cd): use Mainnet instead of mainnet
* fix(docker): remove tests as a runtime dependency
* fix(cd): use default service account with cloud-platform scope
* fix(cd): keep compatibility with gcr.io
To prevent conflicts between registries, and migrate when the time is right, we'll keep pushing to both registries and use github actions cache to prevent conflicts between artifacts.
* fix(docker): do not download zcash params twice
* feat(docker): add google OS Config agent
Use a separate step to have better flexibility in case a better approach is available
* fix(docker): allow to use zebrad as a command
* feat: add an image to inherit from with zcash params
* refactor(docker): use cached zcash params from previous build
* imp(cd): add double safety measure for production
* style: use global variables and don't double print
Remove repeated instances of global environment variables. Do not print ENV variables on the terminal as GitHub Actions already shows them.
* fix (actions): Use fixed major versions for actions
As actions get recurrent fixes, using a specific version causes more maintenance on the pipelines.
On the other hand, using @master versions could make some actions unreliable, as breaking changes might be included without further notice, and behavior could even change on a daily basis.
* refactor: make better use of ENV variables
A whole step with regex was being used to extract different variables from GitHub's environment. This gets deprecated in favor of using `rlespinasse/github-slug-action@v4` which has slug URL variables.
A SLUG on a variable will:
- put the variable content in lower case
- replace any character with - except 0-9, a-z, ., and _
- remove leading and trailing - character
- limit the string size to 63 characters
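The four rules above can be sketched in a few lines of Rust. This is only an illustration of the listed behavior, applied in the listed order; `slug` is a hypothetical helper, not the code of `rlespinasse/github-slug-action`.

```rust
/// Hypothetical helper that mirrors the four slug rules listed above.
/// It is a sketch of the described behavior, not the action's own code.
fn slug(input: &str) -> String {
    // Rules 1 and 2: lower-case, then replace everything except
    // 0-9, a-z, '.', and '_' with '-'.
    let replaced: String = input
        .to_lowercase()
        .chars()
        .map(|c| match c {
            'a'..='z' | '0'..='9' | '.' | '_' => c,
            _ => '-',
        })
        .collect();

    // Rule 3: remove leading and trailing '-'.
    // Rule 4: keep at most 63 characters.
    replaced.trim_matches('-').chars().take(63).collect()
}
```

Because the sketch trims before truncating, a very long input could still end in `-` after the 63-character cut; whether the real action handles that case differently is not covered here.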
This change also takes care of using the Head or Base branch for deployments. This will allow us to merge workflows, as most steps on these deployment actions are very similar, with few variations between workflows.
* fix (actions): use secrets for sensitive information
* revert: use specific versions for dependabot
Reverting commit 8c93409902
* Segregate linting jobs from CI workflow
Lint on push to all branches, except for main, as this action will be required to merge.
Just run the lint action when a Rust file is changed, as it won't make sense to run it in other scenarios.
DRY with unneeded jobs
* Make actions dependable on changed files or folders
* Fix & add missing paths
* Revert changes removing cargo.lock and deny.toml checks
Also refactor this to use a more readable and change prone cargo-deny-action. And move these actions out of the clippy-deps job, as these are more related to CI than linting.
* Fix wrong indentation
* Add new configuration file from #3386
* Do not fail on licenses as this configuration is missing
* Do not add advisories features
Add advisories checks in a different PR
* Allow tests and coverage on PR series
If we only run CI on branches that are going to merge to main, then PR series become a lot harder to test. (Because each PR is based on the previous PR, not main.)
* Download and load Sprout parameters using zcash_proofs
Also update some librustzcash dependencies, to avoid duplicate dependencies.
* Update upstream orchard to avoid a compilation error
* Skip librustzcash batch refactor for now, to avoid compilation errors
* Change the cache ID, so we actually cache Sprout
* Move existing file checks into zcash_proofs
* Add a 1 hour timeout to parameter file downloads
* Give other tasks priority, before spawning the download task
* Update to the latest version of our modified librustzcash fork
* Change the cache key for Sprout
* Add 40 minutes to CI timeouts for occasional sprout downloads
* Update to zcash_proofs with split downloads
* Check file sizes to help debug parameter load failures in zcash_proofs
* Start the second download once the first has finished in zcash_proofs
* Document the parameter download task
* Stop hashing existing files twice
* Move dependency checks to the clippy job
* Split the fake activation heights into their own job
* Fix expected types
* Minimise proptest cases on Windows, macOS, and coverage
We don't expect proptests to fail on different platforms.
* Replace Zcash parameters crates with pre-downloaded local parameter files
* Download Zcash parameters using the `zcashd` script in CI and Docker
* Add a zcash_proofs dependency to zebra-consensus
* Download Sapling parameters using zcash_proofs, rather than fetch-params.sh
* Add a new `zebrad download` subcommand
This command isn't required for normal usage.
But it's useful when testing, or launching multiple Zebra instances.
* Use `zebrad download` in CI to pre-download parameters
* Log a helpful hint if downloading fails
* Allow some duplicate dependencies currently hidden by orchard
* Spawn a separate task to download Groth16 parameters
* Run the parameter download with code coverage
This avoids re-compiling Zebra with and without coverage.
* Update Cargo.lock after rebase
* Try to pass `download` as an argument to `zebrad` in coverage CI
* Fix copy and paste comment typos
* Add path and download examples, like zcash_proofs
* Download params in CI just like zcash_proofs does
* Delete a redundant build step
* Implement graceful shutdown for zebrad start
* Send coverage summary to /dev/null when getting the params path
* Use the correct parameters path and download commands in CI
* Explain pre-downloads
* Avoid calling params_folder twice
* Rename parameter types and methods for consistency
```sh
fastmod SaplingParams SaplingParameters zebra*
fastmod Groth16Params Groth16Parameters zebra*
fastmod PARAMS GROTH16_PARAMETERS zebra*
fastmod params_folder directory zebra*
```
And a manual variable name tweak.
* rustfmt
* Remove a redundant coverage step
Co-authored-by: Janito Vaqueiro Ferreira Filho <janito.vff@gmail.com>
* Update `tower` to version `0.4.9`
Update to latest version to add support for Tokio version 1.
* Replace usage of `ServiceExt::ready_and`
It was deprecated in favor of `ServiceExt::ready`.
* Update Tokio dependency to version `1.13.0`
This will break the build because the code isn't ready for the update,
but future commits will fix the issues.
* Replace import of `tokio::stream::StreamExt`
Use `futures::stream::StreamExt` instead, because newer versions of
Tokio don't have the `stream` feature.
* Use `IntervalStream` in `zebra-network`
In newer versions of Tokio `Interval` doesn't implement `Stream`, so the
wrapper types from `tokio-stream` have to be used instead.
* Use `IntervalStream` in `inventory_registry`
In newer versions of Tokio the `Interval` type doesn't implement
`Stream`, so `tokio_stream::wrappers::IntervalStream` has to be used
instead.
* Use `BroadcastStream` in `inventory_registry`
In newer versions of Tokio `broadcast::Receiver` doesn't implement
`Stream`, so `tokio_stream::wrappers::BroadcastStream` has to be used
instead. This also requires changing the error type that is used.
* Handle `Semaphore::acquire` error in `tower-batch`
Newer versions of Tokio can return an error if the semaphore is closed.
This shouldn't happen in `tower-batch` because the semaphore is never
closed.
* Handle `Semaphore::acquire` error in `zebrad` test
On newer versions of Tokio `Semaphore::acquire` can return an error if
the semaphore is closed. This shouldn't happen in the test because the
semaphore is never closed.
* Update some `zebra-network` dependencies
Use versions compatible with Tokio version 1.
* Upgrade Hyper to version 0.14
Use a version that supports Tokio version 1.
* Update `metrics` dependency to version 0.17
And also update the `metrics-exporter-prometheus` to version 0.6.1.
These updates are to make sure Tokio 1 is supported.
* Use `f64` as the histogram data type
`u64` isn't supported as the histogram data type in newer versions of
`metrics`.
* Update the initialization of the metrics component
Make it compatible with the new version of `metrics`.
* Simplify build version counter
Remove all constants and use the new `metrics::increment_counter!` macro.
* Change metrics output line to match on
The snapshot string isn't included in the newer version of
`metrics-exporter-prometheus`.
* Update `sentry` to version 0.23.0
Use a version compatible with Tokio version 1.
* Remove usage of `TracingIntegration`
This seems to not be available from `sentry-tracing` anymore, so it
needs to be replaced.
* Add sentry layer to tracing initialization
This seems like the replacement for `TracingIntegration`.
* Remove unnecessary conversion
Suggested by a Clippy lint.
* Update Cargo lock file
Apply all of the updates to dependencies.
* Ban duplicate tokio dependencies
Also ban git sources for tokio dependencies.
* Stop allowing sentry-tracing git repository in `deny.toml`
* Allow remaining duplicates after the tokio upgrade
* Use C: drive for CI build output on Windows
GitHub Actions uses a Windows image with two disk drives, and the
default D: drive is smaller than the C: drive. Zebra currently uses a
lot of space to build, so it has to use the C: drive to avoid CI build
failures because of insufficient space.
Co-authored-by: teor <teor@riseup.net>
* Add default deny.toml for "cargo deny check bans"
`cargo deny init`
* Delete unused "cargo deny" config entries
Also cleanup trailing whitespace.
* Deny duplicate crates and unexpected crate sources
Allow the current set of duplicates and sources,
with references to the tickets that will fix them.
* Check for duplicate dependencies in CI
Also check for:
- unexpected crate sources
- outdated Cargo.lock
(required for accurate duplicate and source checks)
* Revert CI name changes so required statuses pass
* Fix ticket for sentry-tracing
* ZIP-401 weighted random mempool eviction
* rename zcash.mempool.total_cost.bytes to zcash.mempool.cost.bytes
Co-authored-by: teor <teor@riseup.net>
* Remove duplicated lines
* Add cost() method to UnminedTx
Update serialization failure messages
* More docs quoting ZIP-401 rules
* Change mempool::Storage::new() to handle Copy-less HashMap, HashSet
* mempool: tidy cost types and evict_one()
* More consensus rule docs
* Refactor calculating mempool costs for Unmined transactions
* Add a note on asymptotic performance of calculating weights of txs in mempool
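As a rough illustration of the weighted random eviction in this series: ZIP-401 picks the transaction to evict with probability proportional to its eviction weight. The sketch below shows that selection as a linear scan over the weights. `pick_weighted` and its `roll` parameter are hypothetical names, not Zebra's API; `roll` stands in for a uniformly random number below the total weight, so the sketch needs no external crates.

```rust
/// Hypothetical sketch of weighted random selection (not Zebra's API).
/// Returns the index whose weight range contains `roll`, where `roll` is
/// assumed to be drawn uniformly from `0..total_weight` by the caller.
fn pick_weighted(weights: &[u64], roll: u64) -> Option<usize> {
    // Sum the weights, giving up if the total overflows a u64.
    let total = weights
        .iter()
        .try_fold(0u64, |acc, &weight| acc.checked_add(weight))?;
    if roll >= total {
        // Covers an empty list, all-zero weights, and out-of-range rolls.
        return None;
    }

    // Linear scan: O(n) in the number of entries.
    let mut remaining = roll;
    for (index, &weight) in weights.iter().enumerate() {
        if remaining < weight {
            return Some(index);
        }
        remaining -= weight;
    }
    None
}
```

An entry with four times the weight of another covers four times as many `roll` values, so it is four times as likely to be chosen.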
* Bump test mempool / storage config to avoid weighted random cost limits
* Use mempool tx_cost_limit = u64::MAX for some tests
* Remove failing tests for now
* Allow(clippy::field-reassign-with-default) because of a move on a type that doesn't impl Copy
* Fix mistaken doctest formatting
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
* Increase test timeout for Windows builds
Co-authored-by: teor <teor@riseup.net>
Co-authored-by: Conrado Gouvea <conrado@zfnd.org>
Also only run the zebrad acceptance tests on macOS.
Re-running the compiler and test binaries for unused crates is slow in CI.
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
* Add validation of ZIP-221 and ZIP-244 commitments
* Apply suggestions from code review
Co-authored-by: teor <teor@riseup.net>
* Add auth commitment check in the finalized state
* Reset the verifier when committing to state fails
* Add explanation comment
* Add test with fake activation heights
* Add generate_valid_commitments flag
* Enable fake activation heights using env var instead of feature
* Also update initial_tip_hash; refactor into progress_from_tip()
* Improve comments
* Add fake activation heights test to CI
* Fix bug that caused commitment trees to not match when generating partial arbitrary chains
* Add ChainHistoryBlockTxAuthCommitmentHash::from_commitments to organize and deduplicate code
* Remove stale comment, improve readability
* Allow overriding with PROPTEST_CASES
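A minimal sketch of this kind of override, assuming the test has its own default case count: read the variable if it is set and parses as a number, otherwise fall back to the default. `cases_or_default` and the default value are hypothetical and only illustrate the idea.

```rust
/// Hypothetical helper: returns the overriding case count if `raw`
/// (the value of the environment variable, when set) parses as a number,
/// and `default_cases` otherwise.
fn cases_or_default(raw: Option<&str>, default_cases: u32) -> u32 {
    raw.and_then(|value| value.trim().parse::<u32>().ok())
        .unwrap_or(default_cases)
}
```

A caller would pass `std::env::var("PROPTEST_CASES").ok().as_deref()` as `raw`, so an unset or malformed variable silently keeps the default.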
* partial_chain_strategy(): don't update note commitment trees when not needed; add comment
Co-authored-by: teor <teor@riseup.net>
* Only use -t flag to docker run, set SSH keep alive
* Remove SSH flag for now
* Add ssh flag back to test.yml gcloud compute ssh command
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
Previously, Zebra's cached state workflows would run all of Zebra's
tests, but they would ignore the results for most tests. They would only
fail if the mainnet cached state test failed.
After this fix, the tests fail if any test or build step fails.
* Remove checkout credentials from CD action
* Remove checkout credentials from CI action
* Remove checkout credentials from coverage action
* Remove checkout credentials from docs action
* Remove checkout credentials from manual deploy action
* Remove checkout credentials from test action
* Remove checkout credentials from zcashd action
Previously, Zebra made ci-success a required check for merges to main. And then we made ci-success depend on a bunch of other CI checks.
But this doesn't work as expected, because if the dependent checks fail, ci-success is skipped, and the branch protection rules allow the branch to be merged to main.
Use PowerShell syntax to set ZEBRA_SKIP_NETWORK_TESTS on Windows.
Also skip the entire large sync test step on Ubuntu and
Windows, because the tests are skipped anyway due to
ZEBRA_SKIP_NETWORK_TESTS. This saves some
compilation time.
We used to always run the CI workflow on push/merge to #main and at some point stopped;
we still link to the status of this workflow on #main from our README. I think we should bring it back.
Also allows manual triggering of the workflow, which can come in handy if you are working
on a branch but haven't opened a PR yet.
* remove windows conditional
* fully separate tests from large tests
* add rust beta to new large test jobs
* increase build time for windows
* disable cargo incremental
* temporarily disable sync_large_checkpoints from CI
* Allow large checkpoint sync tests only on ubuntu and macOS
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
* switch to new llvm source based coverage
* upload artifact and simplified
* filter out irrelevant dependency coverage
* enable the correct filters on coverage
* correctly specify all binaries
* remove sparse flag from coverage
* update the coverage script organization
* fix typo in coverage script
* Create and mount persistent disk to store zebrad state, update runner container config to use
* Enable checkpoint sync in zebrad image config
* Lower state memory cache from 500MB to 50MB
* Upgrade host to n2-standard-4
* Bump zebrad-cache disk size to 100GB
* Copy zebrad as the tests are compiled with a hardcoded path to it
* Rename all debug binaries for easy invocation
* Name state cache disk, use the correct path to binaries
* Create volume and all that jazz on instance creation
Otherwise there are a lot of on-instance commands to run, which this shortcut handles.
* Explicitly mount the state cache and cleanup test instance
* Wait for zebra-test container to start then attach
* Always clean up even if the tests step fails
* Keep fast sleep but only print 'waiting' once
* Run large checkpoint sync tests in CI
* Improve test child output match error context
* Add a debug_stop_at_height config
* Use stop at height in acceptance tests
And add some restart acceptance tests, to make sure the stop at
height feature works correctly.
* export proptest impls for use in downstream crates
* add testjob for disabled feature in zebra-chain
* run rustfmt
* try to fix github actions syntax
* differentiate name
* prove that github action tests zebra-chain build without features
* revert change from last commit now that test is running
* remove accidentally introduced newline
* Update .github/workflows/ci.yml
Co-authored-by: Deirdre Connolly <deirdre@zfnd.org>
* attempt to use zcashconsensus crate in zebra-script
* boop
* update verify fn to use zebra types
* a bit more cleanup
* cleanup
* more
* beep boop
* fix renamed member
* cleaning
* get a real branch id
* remove as of yet unneeded api
* Update zebra-chain/src/transaction.rs
* Update zebra-chain/src/transaction.rs
* more cleanup
* oops wrong dep section
* use a tuple to communicate arg association
* update to use published version of zcash_script
* fix new compiler error
* install llvm on windows
* fix bindgen bug????
* try to get docker file to win
* okay try everything
* fix windows build maybe
* always download choco
* fix paths for moved types
* try a different error message
* try convenience script
* try installing just llvm
* add back one more
* try installing some headers
* try a diff package
* try everything
* remove the minimum
* try newer docker builder image
* cleanup docker image
* cleanup extra ci step
* Add skeleton of eventual zebra book
* reorg sections
* restore file and reorg book a little
* try setting up a firebase deployment
* allow firebase ci to work on test
* download mdbook
* fix book path
* use newer version of mdbook
* remove event hook for book branch pre merge
* Apply suggestions from code review
Co-authored-by: Henry de Valence <hdevalence@hdevalence.ca>
* Fix variable substitutions in CD workflow and gcloud build config
* Docker needs everything lowercase
* Store container image in GCR
* Don't use GITHUB_REPOSITORY
Using tower-batch-based async pattern.
Now the Verifier is agnostic of redjubjub SigTypes. Updated tests to
generate sigs of both types and batch-verify the whole batch.
Resolves #407
* Add step ids, better names
* Split out Clippy to its own job
* If coverage goes down, don't fail the build
* Go back to tarpaulin
* bump version of tarpaulin
* config tarpaulin
* Add CI workflow similar to other zebra
* Bump cache TTL to 24 hours
* Expand image name to include full repo owner/repo-name/branch-name
* Force to lowercase because google container registry demands it
This may not be universally shell compatible
* Use bash as gcloud action container entrypoint