Michael Vines
a6fd1ca3db
Add logs subcommand to fetch remote logs from each network node
2018-12-23 10:19:10 -08:00
Pankaj Garg
41f8764232
Ignore error while enabling nvidia persistence mode ( #2265 )
2018-12-21 12:37:51 -08:00
Pankaj Garg
4bf797c8f1
Load nvidia drivers on node startup ( #2263 )
...
* Load nvidia drivers on node startup
* added new script to enable nvidia driver persistent mode
* remove set -ex
2018-12-21 11:43:52 -08:00
Michael Vines
c3c955b02e
Build/install native programs within cargo-install-all.sh
2018-12-19 11:53:08 -08:00
Michael Vines
5c396c222a
Clean up install-native-programs.sh usage
2018-12-11 23:29:05 -08:00
Michael Vines
088bab61a4
Remove |cargo install| duplication
2018-12-11 23:29:05 -08:00
Michael Vines
b2d7b34082
Add |./net.sh update| command to live update all network nodes
2018-12-11 09:40:22 -08:00
Sathish
154e20484d
Use hostname in database if env is set ( #2101 )
2018-12-10 22:59:38 -08:00
Michael Vines
094f0a8be3
Leader rotation flag plumbing
2018-12-10 14:07:59 -08:00
Michael Vines
b2ddac610c
Add option to skip setup during cluster start
2018-12-10 07:47:15 -08:00
Michael Vines
b54b0a1d25
Document that -P is now available for |config|
2018-12-09 15:25:27 -08:00
Michael Vines
f5794de636
Clean up bootstrap leader terminology in comments and variable names
2018-12-09 15:25:27 -08:00
Carl
b9743957fa
Make directory to hold programs
2018-12-09 08:38:41 -08:00
Michael Vines
f5569e76db
Relocate native programs to deps/ subdirectory of the current executable
...
This layout is `cargo build` compatible, no post-build file moves
required.
2018-12-08 16:31:01 -08:00
Michael Vines
872a3317b5
Fully switch to bootstrap-leader for command-line args
2018-12-07 16:57:02 -08:00
Michael Vines
1db6a882bb
rsync of genesis ledger now works for non-snap deployments
2018-12-07 16:57:02 -08:00
Michael Vines
af11562627
Correct ledger path
2018-12-07 11:32:08 -08:00
Michael Vines
286f08f095
Drop old validator name, use fullnode instead
2018-12-07 11:32:08 -08:00
Michael Vines
6516c2532d
Ensure native programs for the correct platform are installed
2018-12-07 11:32:08 -08:00
Michael Vines
fa58da2401
Explicitly specify build variant when installing native programs
2018-12-07 11:32:08 -08:00
Michael Vines
70c149c7da
Rename leader/validator to bootstrap-leader/fullnode
...
Only rsyncing the genesis ledger snuck in here as well
2018-12-06 19:44:47 -08:00
Michael Vines
b34e197424
Add newline at end of file
2018-12-06 17:46:46 -08:00
Michael Vines
c4b8f0cd2f
bench-tps will now generate an ephemeral identity if not provided with one
...
Also simplify scripts as a result
2018-12-06 16:30:48 -08:00
carllin
aecb06cd2a
Update versions in install-libssl-compatibility.sh ( #2044 )
2018-12-06 15:57:30 -08:00
Michael Vines
f0fe089013
Adapt testnet-deploy metric datapoint names to {,bootnode-}fullnode
2018-12-06 08:04:33 -08:00
Michael Vines
a6312ba98f
Switch snap to bootstrap-fullnode/fullnode naming
2018-12-05 18:59:43 -08:00
Michael Vines
04a0652614
Generalize net/ from leader/validator to bootstrap-fullnode/fullnode
2018-12-05 17:11:16 -08:00
Michael Vines
5d80edd969
Properly check for failure (can't rely on `set -e` here)
2018-12-05 13:26:06 -08:00
Michael Vines
33a5d5fe93
Enable debug builds by default for better backtraces
2018-11-17 10:52:08 -08:00
Michael Vines
d96a6b42a5
Move drone into its own crate
2018-11-16 20:42:21 -08:00
carllin
cf95708c18
Set drone address to always be the initial network entry point ( #1847 )
...
* Set drone address to always be the initial network entry point, so that even when leaders rotate the client can still find the drone
* Extract drone address as a separate argument to bench-tps
* Add drone port to client.sh instead of setting it in bench-tps
* Add drone entrypoint to scripts
* Fix build error
2018-11-16 19:56:26 -08:00
Sathish
c973de1d76
Decouple log and metrics rate ( #1839 )
...
Use separate env for log and metrics rate.
Set default log level to WARN if unset.
2018-11-15 22:27:16 -08:00
Michael Vines
83fc3c10cf
Setup CUDA env for local builds
2018-11-15 08:00:52 -08:00
Michael Vines
017c281eaf
Remove CUDA support from Snap
2018-11-12 20:31:16 -08:00
Michael Vines
c5b1bc1128
Remove obsolete update-default-cuda.sh
2018-11-12 20:31:16 -08:00
Michael Vines
9e7b9487b0
perf-libs now drives setting CUDA_HOME
2018-11-12 18:49:15 -08:00
Michael Vines
851e012c6c
Upgrade EC2 image to 18.04 with CUDA 9.2 and 10
2018-11-12 15:17:34 -08:00
Michael Vines
7f76403d0a
Clean ~/solana during network start to avoid tripping over leftover files
2018-11-12 15:09:14 -08:00
Michael Vines
7ee4dec3f1
Upgrade GCE GPU image to 18.04
2018-11-12 12:18:50 -08:00
Michael Vines
c07d09c011
Add net/scp.sh for easier file transfer to/from network nodes
2018-11-12 11:48:53 -08:00
Michael Vines
3466f139a4
set -e shuffling
2018-11-11 16:24:36 -08:00
Michael Vines
def7d156f6
codemod --extensions sh '#!/usr/bin/env bash -e' '#!/usr/bin/env bash\nset -e'
2018-11-11 16:24:36 -08:00
Michael Vines
33aab094ef
codemod --extensions sh '#!/bin/bash' '#!/usr/bin/env bash'
2018-11-11 16:24:36 -08:00
Michael Vines
cf6f344ccc
Add CUDA_HOME env var to permit overriding the CUDA install location
2018-11-11 16:24:18 -08:00
Michael Vines
49014393e1
Be less fancy for bash 4.4 compat
2018-11-10 18:05:55 -08:00
Michael Vines
818d03c835
Bump earlyoom version
2018-11-10 15:56:17 -08:00
Michael Vines
b8261d7d83
Determine network version for tar and local deploys
2018-11-08 22:02:42 -08:00
Michael Vines
51ed48941b
Continue if docker0 is not present
2018-11-07 19:33:20 -08:00
Michael Vines
87ac549689
Work around AWS key management limitation
2018-11-07 18:48:27 -08:00
Michael Vines
f8f11b7f50
Remove docker0 interface if present
2018-11-07 18:23:24 -08:00
Michael Vines
82f914e0dc
Work around AWS boot check weirdness
2018-11-07 15:46:04 -08:00
Michael Vines
9359cc69d5
Invert gpu check
2018-11-07 14:44:40 -08:00
Michael Vines
b02b636b36
Support local tarball deploys
2018-11-07 14:44:40 -08:00
Michael Vines
a537154c28
Remove all cuda dependencies from release tarball beyond solana-fullnode-cuda
2018-11-07 14:44:40 -08:00
Michael Vines
16d23292dc
Improve error messages
2018-11-07 10:35:10 -08:00
Michael Vines
2ef8ebe111
AWS AMIs are region specific
2018-11-07 10:05:58 -08:00
Michael Vines
f8673931b8
Increase boot timeout
2018-11-07 08:32:15 -08:00
Michael Vines
dd4fb7aa90
Add AWS-based nets
2018-11-07 07:47:39 -08:00
Michael Vines
c4bc331663
Add support for using a release tar
2018-11-07 07:47:39 -08:00
Michael Vines
cd18a1b7db
t
2018-11-06 14:08:47 -08:00
Michael Vines
6aac096c77
Add timeout to prevent a stuck ssh
2018-11-06 14:08:28 -08:00
Michael Vines
7b58bd621a
Remove node check from client start-up
...
If the network loses a validator or two, it's the job of the sanity
check to detect this, not the bench clients
2018-11-06 13:57:06 -08:00
Michael Vines
1a7830f460
Set imageName if G
2018-11-05 13:33:42 -08:00
Michael Vines
8041461a07
Bump EC2 validator machine type
2018-11-05 08:47:51 -08:00
Michael Vines
eae9372a5d
Upgrade GCP CPU-based testnet to 18.04
2018-11-04 19:18:47 -08:00
Michael Vines
f3b04894b9
Try harder to snap download
2018-11-03 00:29:13 +00:00
Pankaj Garg
85869552e0
Update testnet scripts to use release tar ball ( #1660 )
...
* Update testnet scripts to use release tar ball
* use curl instead of s3cmd
2018-10-30 18:05:38 -07:00
Pankaj Garg
3cc78d3a41
Added a new remote node configuration script to set rmem/wmem ( #1647 )
...
* Added a new remote node configuration script to set rmem/wmem
* Update common.sh for rmem/wmem configuration
2018-10-30 09:17:35 -07:00
Pankaj Garg
fbde9bb731
Run bench-tps for longer duration in testnet ( #1638 )
...
- Increased to 2+ hours
2018-10-29 15:03:08 -07:00
Pankaj Garg
7abd456d45
Increase rmem and wmem for remote nodes in testnet ( #1635 )
2018-10-29 13:04:54 -07:00
Michael Vines
489894cb32
Mention logs more
2018-10-27 08:49:52 -07:00
Pankaj Garg
dfde83bdce
Wildcard early OOM deb package revision ( #1554 )
2018-10-19 14:17:19 -07:00
Pankaj Garg
30c79fd40d
Change validator node machine type ( #1537 )
...
- The current nodes are using lower RAM compared to leader/clients
2018-10-17 17:16:50 -07:00
Pankaj Garg
32fc0cd7e9
Fix bug introduced during RUST_LOG escaping ( #1507 )
...
* Fix bug introduced during RUST_LOG escaping
- remote node configuration should not be quoted
* shellcheck disable SC2090
2018-10-15 16:49:22 -07:00
Pankaj Garg
9fc30f6db4
Escape RUST_LOG configuration in remote-node.sh ( #1489 )
...
* Escape RUST_LOG configuration in remote-node.sh
- If it was set to #, it was causing other parameters to be commented out
* escape other variables as well
* disabled shell check
* Fix shellcheck error
2018-10-13 13:35:54 -07:00
Michael Vines
5c523716aa
Ship native programs
2018-10-10 16:49:48 -07:00
Pankaj Garg
0a39722719
Add support to trigger testnet from a PR ( #1434 )
...
* Add support for different node counts
* Update variable names
* Delete network even after failures
* Add array for node counts
* Changed number of nodes to a space separated string of numbers
* Adjust number of nodes
* Snap will not be published if the env variable DO_NOT_PUBLISH_SNAP is set
* Address review comments
* Replaced influx db URL
2018-10-05 16:32:05 -07:00
Michael Vines
b1e941cab9
Return all instances
2018-10-01 07:51:48 -07:00
Pankaj Garg
7fb7839c8f
Configure GPU type/count from command line in GCE scripts ( #1376 )
...
* Configure GPU type/count from command line in GCE scripts
* Change CLI to input full leader machine type information with GPU
2018-09-27 11:55:56 -07:00
sakridge
3199f174a3
Add option to pass boot disk type to gce create ( #1308 )
2018-09-22 16:43:47 -07:00
Tyera Eulberg
f273351789
Add missing port number
2018-09-18 09:36:54 -06:00
Tyera Eulberg
0125163190
Remove wallet.sh, update entrypoint syntax for wallet network argument
2018-09-17 11:53:33 -06:00
Michael Vines
155ee8792f
Add GPU support to ec2-provider
2018-09-17 09:26:25 -07:00
Michael Vines
f89f121d2b
Add AWS EC2 support
2018-09-17 09:26:25 -07:00
Pankaj Garg
be7cce1fd2
Tweak GCE scripts for higher node count ( #1229 )
...
* Tweak GCE scripts for higher node count
- Some validators were unable to rsync config from leader when
the node count was high (e.g. 25). Looks like the leader node was
getting more rsync requests in parallel than it could handle.
- This change staggers validator bootup and rsync time
* Address review comments
2018-09-14 17:17:08 -07:00
Michael Vines
ee74b367ce
Add docker install script
2018-09-12 17:09:37 -07:00
Michael Vines
f06113500d
bench-tps/net sanity: add ability to check for unexpected extra nodes
2018-09-12 15:38:57 -07:00
Michael Vines
af3eb5a16c
.sh
2018-09-11 11:29:49 -07:00
Pankaj Garg
1c17c6dd2b
Report UDP network statistics ( #1176 )
...
* Report UDP network statistics
Fixes #1093
* Address review comments
* Address additional review comments
* Fix shellcheck errors
2018-09-10 15:52:08 -07:00
Michael Vines
ebcac3c2d1
Use a common solana user on all testnet instances
2018-09-08 22:34:26 -07:00
Michael Vines
5afcdcbbe6
More log grooming
2018-09-08 14:16:34 -07:00
Michael Vines
3840b4b516
Groom log output
2018-09-08 14:10:18 -07:00
Michael Vines
7aeb6d642b
Display log file
2018-09-08 13:59:45 -07:00
Michael Vines
1d6c4aacae
Retry rsync a couple times before failing
2018-09-08 13:59:45 -07:00
Michael Vines
9f5c86e60c
Install earlyoom at gce instance startup
2018-09-08 13:59:45 -07:00
Michael Vines
9f413fd656
Establish net/scripts/... for better scoping
2018-09-08 13:59:45 -07:00
Michael Vines
c3af0d9d25
Improve client.log
2018-09-07 21:20:00 -07:00
Michael Vines
932c994dc9
Use new bench-tps command-line args
2018-09-07 21:20:00 -07:00
Michael Vines
ddd1871840
Install libssl1.1 for solanalabs/rust docker image compat
2018-09-07 19:57:41 -07:00
Michael Vines
db825788fa
Document how to get ssh access into CD testnets
2018-09-07 19:41:13 -07:00
Michael Vines
73a8441add
/var/snap is not writable by most users
2018-09-07 17:41:20 -07:00
Rob Walker
51b27779c9
client changes for TODOs and looping ( #1138 )
...
* remove client.sh from snap
* default to ephemeral instead of ~/.config key
* rework CLI for bench-tps
* remove multinode-demo stuff from remote-client.sh
* remove multinode-demo from remote-sanity and localnet-sanity
2018-09-08 07:07:10 +09:00
Michael Vines
0d945e6a92
Groom testnet-sanity logging
2018-09-07 12:45:48 -07:00
Michael Vines
1090254ba5
Add datapoints for leader/validator start
2018-09-07 12:45:48 -07:00
Michael Vines
ee682d5bc3
Move wallet-sanity.sh out of multinode-demo/
2018-09-07 12:01:43 -07:00
Michael Vines
506a81e8cc
Assume -y
2018-09-07 12:01:43 -07:00
Michael Vines
dcb30a8489
Delete leader node first
2018-09-07 12:01:43 -07:00
Michael Vines
a2631e89f6
Use consistent style
2018-09-07 12:01:43 -07:00
Michael Vines
ab208ddb77
Clean up arg handling
2018-09-07 12:01:43 -07:00
Michael Vines
09a48d773a
Run bench-tps in a tmux
2018-09-07 12:01:43 -07:00
Michael Vines
d252f7f687
Revert "Default to 10 validators"
...
This reverts commit ed5fbaef06.
2018-09-07 12:01:43 -07:00
Michael Vines
53e16f68d9
Improve error handling
2018-09-06 20:57:05 -07:00
Michael Vines
ed5fbaef06
Default to 10 validators
2018-09-06 20:46:49 -07:00
Michael Vines
66ff602659
Rewrite ci/testnet-{deploy,sanity}.sh in terms of net/ primitives
2018-09-06 19:54:39 -07:00
Michael Vines
5a57d9b5d9
de-y
2018-09-06 19:54:39 -07:00
Michael Vines
03e87e4169
Add more metrics
2018-09-06 19:54:39 -07:00
Michael Vines
31dee553d5
Split start/version reporting
2018-09-06 19:54:39 -07:00
Michael Vines
9ca6a2d25b
Configure boot disk size
2018-09-06 19:54:39 -07:00
Michael Vines
a3178c3bc7
Remove unused name tag
2018-09-06 19:54:39 -07:00
Michael Vines
aa07bdfbaa
Optionally suppress delete confirmation
2018-09-06 19:54:39 -07:00
Michael Vines
eaef9be710
Clarify -f
2018-09-06 19:54:39 -07:00
Michael Vines
cae345b416
Allow - in prefix
2018-09-06 19:54:39 -07:00
Michael Vines
acb1171422
Add -e option
2018-09-06 19:54:39 -07:00
Rob Walker
fdc48d521c
use USER instead of whoami ( #1134 )
...
* use USER instead of whoami
make gcloud_FigureRemoteUsername robust against unsolicited output
(that I get on login ;) )
validate --prefix argument
* Update gcloud.sh
2018-09-07 00:18:05 +09:00
Michael Vines
6560b0e2cc
s/whoami/id -un/
2018-09-05 14:26:21 -07:00
Michael Vines
ec38dba209
GCE leader nodes can now be provisioned with a static IP address
2018-09-05 14:26:21 -07:00
Michael Vines
8d87627a49
t
2018-09-05 09:09:50 -07:00
Michael Vines
aacf27fb76
Add convenience link to current Snap log files
2018-09-05 09:02:02 -07:00
Michael Vines
a51536d107
Add log tail hint
2018-09-05 09:02:02 -07:00
Michael Vines
e2e569cb43
Set rsync url for local deployments
2018-09-05 09:02:02 -07:00
Michael Vines
017eb10e76
Add file header doc
2018-09-05 09:02:02 -07:00
Michael Vines
f50aeb0e58
Always add perf-libs to LD_LIBRARY_PATH
2018-09-05 09:02:02 -07:00
Michael Vines
48c19d3100
Enable cargo features to be specified
2018-09-05 09:02:02 -07:00
Michael Vines
aaf0a23134
Add Tips section
2018-09-05 09:02:02 -07:00
Michael Vines
89db85dbf9
Work around concurrent |gcloud compute ssh| terminal issue
2018-09-05 09:02:02 -07:00
Michael Vines
e677cda027
Private IP networks now work, and are the default
2018-09-05 09:02:02 -07:00
Michael Vines
db9219ccc8
Improve error monitoring
2018-09-05 09:02:02 -07:00
Michael Vines
06fd945f85
Set node config correctly
2018-09-05 09:02:02 -07:00
Michael Vines
6ad4a81123
s/_/-/g in filenames
2018-09-05 09:02:02 -07:00
Michael Vines
bcaa0fdcb1
net/ can now deploy Snaps
2018-09-05 09:02:02 -07:00
Michael Vines
2cb1375217
Run gcloud_PrepInstancesForSsh in parallel
2018-09-05 09:02:02 -07:00
Michael Vines
9365a47d42
Employ a startup script
2018-09-05 09:02:02 -07:00
Michael Vines
6ffe205447
Add -g option
2018-09-05 09:02:02 -07:00
Michael Vines
ec3e62dd58
Add net/ sanity
2018-09-05 09:02:02 -07:00
Michael Vines
fa07c49cc9
net/ can now deploy Snaps
2018-09-05 09:02:02 -07:00
Michael Vines
7e2b65374d
gce instance types are now configurable
2018-09-05 09:02:02 -07:00
Michael Vines
8e39465700
Drop .sh extension to hide from shellcheck
2018-09-05 09:02:02 -07:00
Michael Vines
43b4207101
Run oom-monitor in net/ testnets
2018-09-05 09:02:02 -07:00
Michael Vines
ff991b87da
Add support for deploying from non-Linux machines
2018-09-05 09:02:02 -07:00
Michael Vines
399caf343c
Morph gce_multinode-based scripts into net/
2018-09-05 09:02:02 -07:00