Pankaj Garg
00c4c30d72
Fix testnet bootup issue ( #2465 )
...
* Fix testnet bootup issue
* address review comments
2019-01-16 19:18:32 -08:00
Michael Vines
6015a0ff15
Add info command
2019-01-16 10:24:00 -08:00
Michael Vines
d5f27f9b1e
shellcheck
2019-01-09 22:06:58 -07:00
Michael Vines
86f19a3ab3
Propagate PS4 to prevent unintentional buildkite log unfolding
2019-01-09 22:02:31 -07:00
Michael Vines
be0eefb0af
Add timeout to prevent stuck bench-tps when a cluster goes bad
2019-01-09 19:21:53 -07:00
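A minimal sketch of the timeout idea in the commit above, assuming the GNU coreutils `timeout` utility is available. The helper name `run_with_timeout` is hypothetical and is not taken from the repository; the actual change may bound the bench-tps run differently.

```shell
# run_with_timeout SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND, stopping it if it exceeds SECONDS.
# GNU timeout exits with status 124 when the time limit is hit.
run_with_timeout() {
  local seconds=$1 rc=0
  shift
  timeout "$seconds" "$@" || rc=$?
  if [ "$rc" -eq 124 ]; then
    echo "timed out after ${seconds}s: $*" >&2
  fi
  return "$rc"
}
```

A caller can then tell a hung client (status 124) apart from an ordinary failure (any other non-zero status).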
Michael Vines
28431ff22c
Add configurable RUST_LOG for ./net.sh sanity
2019-01-09 12:12:50 -08:00
Michael Vines
639bed2f6d
Reorder sanity.
...
1. Check for presence of nodes
2. Check for functioning RPC API
3. Then try the wallet
2019-01-09 12:05:30 -08:00
Michael Vines
eb37aa2bba
Kill monitoring scripts by process group to ensure a full shutdown
2019-01-09 11:59:01 -08:00
Michael Vines
048fe371aa
set -x for more detailed logs
2019-01-09 11:59:01 -08:00
Michael Vines
87c9af142f
Preserve config/ when skipSetup
2019-01-09 11:59:01 -08:00
Michael Vines
e0c68bf9ad
docs: -z is a common option
2019-01-08 21:11:43 -08:00
Michael Vines
aedab3f83f
Run sanity when previous ledger/setup is preserved
2019-01-08 21:11:43 -08:00
Michael Vines
1b7598e351
Add retries to RPC API probe
2019-01-08 08:50:51 -08:00
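One way a probe with retries could look, as a sketch only. The `retry` helper, its argument order, and the attempt and delay values are assumptions for illustration, not the code from this commit.

```shell
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
  local max=$1 delay=$2 attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}
```

For example, `retry 5 2 curl --silent --fail --max-time 5 "$rpcUrl"` would probe an endpoint up to five times, two seconds apart; `rpcUrl` is a placeholder here, since the log does not record the real endpoint.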
Michael Vines
1531a1777a
Add RPC API check
2018-12-24 22:51:36 -08:00
Michael Vines
04d46ea33f
Run oom-monitor as root
2018-12-24 22:51:36 -08:00
Michael Vines
f5bbc5e961
Fix args
2018-12-23 20:56:13 -08:00
Michael Vines
753a783ba9
Add solana user to adm group for /var/log/syslog access
2018-12-23 17:28:35 -08:00
Michael Vines
3c835b692b
Use netLogDir
2018-12-23 10:33:43 -08:00
Michael Vines
a6fd1ca3db
Add logs subcommand to fetch remote logs from each network node
2018-12-23 10:19:10 -08:00
Pankaj Garg
41f8764232
Ignore error while enabling nvidia persistence mode ( #2265 )
2018-12-21 12:37:51 -08:00
Pankaj Garg
4bf797c8f1
Load nvidia drivers on node startup ( #2263 )
...
* Load nvidia drivers on node startup
* added new script to enable nvidia driver persistent mode
* remove set -ex
2018-12-21 11:43:52 -08:00
Michael Vines
c3c955b02e
Build/install native programs within cargo-install-all.sh
2018-12-19 11:53:08 -08:00
Michael Vines
5c396c222a
Clean up install-native-programs.sh usage
2018-12-11 23:29:05 -08:00
Michael Vines
088bab61a4
Remove |cargo install| duplication
2018-12-11 23:29:05 -08:00
Michael Vines
b2d7b34082
Add |./net.sh update| command to live update all network nodes
2018-12-11 09:40:22 -08:00
Sathish
154e20484d
Use hostname in database if env is set ( #2101 )
2018-12-10 22:59:38 -08:00
Michael Vines
094f0a8be3
Leader rotation flag plumbing
2018-12-10 14:07:59 -08:00
Michael Vines
b2ddac610c
Add option to skip setup during cluster start
2018-12-10 07:47:15 -08:00
Michael Vines
b54b0a1d25
Document that -P is now available for |config|
2018-12-09 15:25:27 -08:00
Michael Vines
f5794de636
Clean up bootstrap leader terminology in comments and variable names
2018-12-09 15:25:27 -08:00
Carl
b9743957fa
Make directory to hold programs
2018-12-09 08:38:41 -08:00
Michael Vines
f5569e76db
Relocate native programs to deps/ subdirectory of the current executable
...
This layout is `cargo build` compatible, no post-build file moves
required.
2018-12-08 16:31:01 -08:00
Michael Vines
872a3317b5
Fully switch to bootstrap-leader for command-line args
2018-12-07 16:57:02 -08:00
Michael Vines
1db6a882bb
rsync of genesis ledger now works for non-snap deployments
2018-12-07 16:57:02 -08:00
Michael Vines
af11562627
Correct ledger path
2018-12-07 11:32:08 -08:00
Michael Vines
286f08f095
Drop old validator name, use fullnode instead
2018-12-07 11:32:08 -08:00
Michael Vines
6516c2532d
Ensure native programs for the correct platform are installed
2018-12-07 11:32:08 -08:00
Michael Vines
fa58da2401
Explicitly specify build variant when installing native programs
2018-12-07 11:32:08 -08:00
Michael Vines
70c149c7da
Rename leader/validator to bootstrap-leader/fullnode
...
Only rsyncing the genesis ledger snuck in here as well
2018-12-06 19:44:47 -08:00
Michael Vines
b34e197424
Add newline at end of file
2018-12-06 17:46:46 -08:00
Michael Vines
c4b8f0cd2f
bench-tps will now generate an ephemeral identity if not provided with one
...
Also simplify scripts as a result
2018-12-06 16:30:48 -08:00
carllin
aecb06cd2a
Update versions in install-libssl-compatibility.sh ( #2044 )
2018-12-06 15:57:30 -08:00
Michael Vines
f0fe089013
Adapt testnet-deploy metric datapoint names to {,bootnode-}fullnode
2018-12-06 08:04:33 -08:00
Michael Vines
a6312ba98f
Switch snap to bootstrap-fullnode/fullnode naming
2018-12-05 18:59:43 -08:00
Michael Vines
04a0652614
Generalize net/ from leader/validator to bootstrap-fullnode/fullnode
2018-12-05 17:11:16 -08:00
Michael Vines
5d80edd969
Properly check for failure (can't rely on `set -e` here)
2018-12-05 13:26:06 -08:00
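For context on the `set -e` caveat in the commit above: the shell ignores `set -e` for commands run as an `if`/`while` condition or on the left of `&&`/`||`, and that includes every command inside a function called from such a context. The sketch below illustrates this and shows an explicit status check; `step` and `run_checked` are hypothetical names, not functions from the repository.

```shell
# Hypothetical function: the first command fails, yet the second still runs
# whenever step is called from a condition, because errexit is ignored there.
step() {
  false
  echo "reached"
}

# run_checked COMMAND [ARGS...]
# Captures the exit status explicitly instead of relying on set -e.
run_checked() {
  local rc=0
  "$@" || rc=$?
  if [ "$rc" -ne 0 ]; then
    echo "command failed with status $rc: $*" >&2
  fi
  return "$rc"
}
```

Running `if step; then :; fi` under `set -e` prints `reached`, which shows that the failing `false` did not abort the script.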
Michael Vines
33a5d5fe93
Enable debug builds by default for better backtraces
2018-11-17 10:52:08 -08:00
Michael Vines
d96a6b42a5
Move drone into its own crate
2018-11-16 20:42:21 -08:00
carllin
cf95708c18
Set drone address to always be the initial network entry point ( #1847 )
...
* Set drone address to always be the initial network entry point, so that even when leaders rotate the client can still find the drone
* Extract drone address as a separate argument to bench-tps
* Add drone port to client.sh instead of setting it in bench-tps
* Add drone entrypoint to scripts
* Fix build error
2018-11-16 19:56:26 -08:00
Sathish
c973de1d76
Decouple log and metrics rate ( #1839 )
...
Use separate env for log and metrics rate.
Set default log level to WARN if unset.
2018-11-15 22:27:16 -08:00
Michael Vines
83fc3c10cf
Setup CUDA env for local builds
2018-11-15 08:00:52 -08:00
Michael Vines
017c281eaf
Remove CUDA support from Snap
2018-11-12 20:31:16 -08:00
Michael Vines
c5b1bc1128
Remove obsolete update-default-cuda.sh
2018-11-12 20:31:16 -08:00
Michael Vines
9e7b9487b0
perf-libs now drives setting CUDA_HOME
2018-11-12 18:49:15 -08:00
Michael Vines
851e012c6c
Upgrade EC2 image to 18.04 with CUDA 9.2 and 10
2018-11-12 15:17:34 -08:00
Michael Vines
7f76403d0a
Clean ~/solana during network start to avoid tripping over leftover files
2018-11-12 15:09:14 -08:00
Michael Vines
7ee4dec3f1
Upgrade GCE GPU image to 18.04
2018-11-12 12:18:50 -08:00
Michael Vines
c07d09c011
Add net/scp.sh for easier file transfer to/from network nodes
2018-11-12 11:48:53 -08:00
Michael Vines
3466f139a4
set -e shuffling
2018-11-11 16:24:36 -08:00
Michael Vines
def7d156f6
codemod --extensions sh '#!/usr/bin/env bash -e' '#!/usr/bin/env bash\nset -e'
2018-11-11 16:24:36 -08:00
Michael Vines
33aab094ef
codemod --extensions sh '#!/bin/bash' '#!/usr/bin/env bash'
2018-11-11 16:24:36 -08:00
Michael Vines
cf6f344ccc
Add CUDA_HOME env var to permit overriding the CUDA install location
2018-11-11 16:24:18 -08:00
Michael Vines
49014393e1
Be less fancy for bash 4.4 compat
2018-11-10 18:05:55 -08:00
Michael Vines
818d03c835
Bump earlyoom version
2018-11-10 15:56:17 -08:00
Michael Vines
b8261d7d83
Determine network version for tar and local deploys
2018-11-08 22:02:42 -08:00
Michael Vines
51ed48941b
Continue if docker0 is not present
2018-11-07 19:33:20 -08:00
Michael Vines
87ac549689
Work around AWS key management limitation
2018-11-07 18:48:27 -08:00
Michael Vines
f8f11b7f50
Remove docker0 interface if present
2018-11-07 18:23:24 -08:00
Michael Vines
82f914e0dc
Work around AWS boot check weirdness
2018-11-07 15:46:04 -08:00
Michael Vines
9359cc69d5
Invert gpu check
2018-11-07 14:44:40 -08:00
Michael Vines
b02b636b36
Support local tarball deploys
2018-11-07 14:44:40 -08:00
Michael Vines
a537154c28
Remove all cuda dependencies from release tarball beyond solana-fullnode-cuda
2018-11-07 14:44:40 -08:00
Michael Vines
16d23292dc
Improve error messages
2018-11-07 10:35:10 -08:00
Michael Vines
2ef8ebe111
AWS AMIs are region specific
2018-11-07 10:05:58 -08:00
Michael Vines
f8673931b8
Increase boot timeout
2018-11-07 08:32:15 -08:00
Michael Vines
dd4fb7aa90
Add AWS-based nets
2018-11-07 07:47:39 -08:00
Michael Vines
c4bc331663
Add support for using a release tar
2018-11-07 07:47:39 -08:00
Michael Vines
cd18a1b7db
t
2018-11-06 14:08:47 -08:00
Michael Vines
6aac096c77
Add timeout to prevent a stuck ssh
2018-11-06 14:08:28 -08:00
Michael Vines
7b58bd621a
Remove node check from client start-up
...
If the network loses a validator or two, it's the job of the sanity
check to detect this, not the bench clients
2018-11-06 13:57:06 -08:00
Michael Vines
1a7830f460
Set imageName if G
2018-11-05 13:33:42 -08:00
Michael Vines
8041461a07
Bump EC2 validator machine type
2018-11-05 08:47:51 -08:00
Michael Vines
eae9372a5d
Upgrade GCP CPU-based testnet to 18.04
2018-11-04 19:18:47 -08:00
Michael Vines
f3b04894b9
Try harder to download the snap
2018-11-03 00:29:13 +00:00
Pankaj Garg
85869552e0
Update testnet scripts to use release tarball ( #1660 )
...
* Update testnet scripts to use release tarball
* use curl instead of s3cmd
2018-10-30 18:05:38 -07:00
Pankaj Garg
3cc78d3a41
Added a new remote node configuration script to set rmem/wmem ( #1647 )
...
* Added a new remote node configuration script to set rmem/wmem
* Update common.sh for rmem/wmem configuration
2018-10-30 09:17:35 -07:00
Pankaj Garg
fbde9bb731
Run bench-tps for longer duration in testnet ( #1638 )
...
- Increased to 2+ hours
2018-10-29 15:03:08 -07:00
Pankaj Garg
7abd456d45
Increase rmem and wmem for remote nodes in testnet ( #1635 )
2018-10-29 13:04:54 -07:00
Michael Vines
489894cb32
Mention logs more
2018-10-27 08:49:52 -07:00
Pankaj Garg
dfde83bdce
Wildcard early OOM deb package revision ( #1554 )
2018-10-19 14:17:19 -07:00
Pankaj Garg
30c79fd40d
Change validator node machine type ( #1537 )
...
- The current nodes have less RAM than the leader/clients
2018-10-17 17:16:50 -07:00
Pankaj Garg
32fc0cd7e9
Fix bug introduced during RUST_LOG escaping ( #1507 )
...
* Fix bug introduced during RUST_LOG escaping
- remote node configuration should not be quoted
* shellcheck disable SC2090
2018-10-15 16:49:22 -07:00
Pankaj Garg
9fc30f6db4
Escape RUST_LOG configuration in remote-node.sh ( #1489 )
...
* Escape RUST_LOG configuration in remote-node.sh
- If it was set to #, it was causing other parameters to be commented out
* escape other variables as well
* disabled shellcheck
* Fix shellcheck error
2018-10-13 13:35:54 -07:00
Michael Vines
5c523716aa
Ship native programs
2018-10-10 16:49:48 -07:00
Pankaj Garg
0a39722719
Add support to trigger testnet from a PR ( #1434 )
...
* Add support for different node counts
* Update variable names
* Delete network even after failures
* Add array for node counts
* Changed number of nodes to a space separated string of numbers
* Adjust number of nodes
* Snap will not be published if the env variable DO_NOT_PUBLISH_SNAP is set
* Address review comments
* Replaced influx db URL
2018-10-05 16:32:05 -07:00
Michael Vines
b1e941cab9
Return all instances
2018-10-01 07:51:48 -07:00
Pankaj Garg
7fb7839c8f
Configure GPU type/count from command line in GCE scripts ( #1376 )
...
* Configure GPU type/count from command line in GCE scripts
* Change CLI to input full leader machine type information with GPU
2018-09-27 11:55:56 -07:00
sakridge
3199f174a3
Add option to pass boot disk type to gce create ( #1308 )
2018-09-22 16:43:47 -07:00
Tyera Eulberg
f273351789
Add missing port number
2018-09-18 09:36:54 -06:00
Tyera Eulberg
0125163190
Remove wallet.sh, update entrypoint syntax for wallet network argument
2018-09-17 11:53:33 -06:00