569 Commits

Author SHA1 Message Date
Michael Vines 356f246a74 Remove get-/show- prefix from cli commands 2020-01-21 08:43:07 -07:00
sakridge b7b68ecdba
Add partition testing documentation (#7739) 2020-01-10 15:32:43 -08:00
Michael Vines 447fe48d2a
Revert "Add a stand-alone gossip node on the blockstreamer instance"
This reverts commit a217920561.

This commit is causing trouble when the TdS cluster is reset and
validators running an older genesis config are still present.
Occasionally an RPC URL from an older validator will be selected,
causing a new node to fail to boot.
2020-01-04 16:42:12 -07:00
Michael Vines a217920561 Add a stand-alone gossip node on the blockstreamer instance
The blockstreamer instance is the TdS cluster entrypoint.  Running an
additional solana-gossip node allows other participants to join a
cluster even if the validator node on the blockstreamer instance goes down.
2020-01-02 17:20:59 -07:00
sakridge 6f7d0c6928
Move cleanup to a script so it doesn't kill itself (#7603) 2019-12-23 14:31:57 -08:00
Justin Starry 9bd5888f5e
Fix broken internal-nodes-stake-lamports arg in scripts (#7581) 2019-12-19 21:38:03 -05:00
Trent Nelson 554188e88e
Colo - Node install scripts missing latest user requests (#7540)
* Enable user GPU profiling while installing CUDA

* Install heaptrack
2019-12-17 19:00:12 -05:00
Dan Albert 8176470b7f
Add pubkey from new buildkite agent instance 2019-12-17 18:00:15 -05:00
Tyera Eulberg 3513f4ee84
Rename drone to faucet (#7508) 2019-12-16 14:05:17 -07:00
Dan Albert 9ac112104c
Search across command line for pattern to kill (#7475) 2019-12-13 21:08:41 -05:00
Trent Nelson 42f2b14a74 Colo: Fix lockfile syntax (#7432)
Logical AND for [ is -a, [[ is &&.
2019-12-11 15:32:38 -05:00
Dan Albert 12d471e2da
Update default node balance to 500 SOL and default stake to 1 SOL (#7411) 2019-12-10 17:52:35 -05:00
anatoly yakovenko 96c08cd295 add pubkey for colo (#7409)
automerge
2019-12-10 14:18:57 -08:00
Rob Walker f3633a2e04
rent for testnet (#7407) 2019-12-10 13:51:19 -08:00
Dan Albert f4a089cc26
Allow delay between validator booting and client start (#7297)
* Allow delay between validator booting and client start
2019-12-05 21:03:26 -05:00
Pankaj Garg 9d7a926a8b
Tune UDP rmem/wmem using sys-tuner daemon (#7273) 2019-12-04 15:17:24 -08:00
Pankaj Garg 75d505c431 Don't hardcode username in sys-tuner (#7234)
automerge
2019-12-04 11:39:26 -08:00
Pankaj Garg d357192025
Fix ssh connection error due to too many authentication failures (#7229) 2019-12-03 15:53:12 -08:00
Pankaj Garg 076e384bb5
Tool to tune system parameters like PoH service priority (#7155)
* New daemon to tune system parameters like PoH service priority

* fixes for Linux

* integrate with poh_service

* fixes

* address review comments

* remove `dead_code` directive
2019-12-02 16:46:46 -08:00
Justin Starry eaa3e87eb0 Support passphrases in keygen (#7134)
* Support passphrases in keygen

* remove short

* Update solana_keygen calls
2019-11-25 21:33:15 -07:00
Trent Nelson d8bc828839
Colo: Refactor remote command dispatch for create and delete (#7092)
* Colo: Dump escaping mess in remote script templates

* Colo: Rename script templates so shellcheck can get 'em

* shellcheck and nits

* Brace all of the things

* Consistent heredoc tags

* Use bash built-in square bracketing consistently

* simplify logic
2019-11-25 10:32:17 -07:00
Justin Starry b8cd0a1bc0
Allow secure keypair input for `solana-archiver` and `solana` cli tools (#7106)
* Add seed phrase keypair recover to archiver

* Add seed phrase keypair to cli with ASK keyword

* cli main tweaks
2019-11-23 11:55:43 -05:00
Justin Starry ce8d37984d
Allow secure keypair input for solana-validator cli (#7080)
* Allow secure keypair input for solana-validator cli

* feedback

* Add --skip-mnemonic-validation

* Update --identity to --identity-keypair

* Use struct instead of tuple

* Fix dependencies

* cargo fmt

* Add basic tests

* Use `seed phrase` instead of `mnemonic`

* Update passphrase prompt
2019-11-22 10:20:40 -05:00
Michael Vines ee6b11d36d Remove ability to deploy custom programs (#7070)
automerge
2019-11-20 15:37:42 -08:00
Michael Vines b0271394cd
Clean up --gossip-port argument (#7067)
--gossip-port now specifies exactly that, the gossip port to use.  The
new --gossip-host argument can be used to specify the DNS name/IP
address for gossip if --entrypoint is not supplied (when --entrypoint is
supplied, the gossip address is automatically set to the node's ip
address as observed by the entrypoint)
2019-11-20 15:21:34 -07:00
Justin Starry 95c137158f Fix gce.sh info (#7054)
automerge
2019-11-19 17:49:25 -08:00
Dan Albert 2d7c7b0982
Fix missed rebase on net.sh (#7037) 2019-11-19 10:22:30 -05:00
Pankaj Garg 955aaef2e6
Fixes to net-shaper and net.sh (#7002)
* Fixes to net-shaper and net.sh

* fixes to default filters and cleanup
2019-11-18 11:33:33 -08:00
Dan Albert 6e04a646ba
Gossip entrypoint is now option of spy not solana-gossip (#7006) 2019-11-17 11:36:24 -05:00
Michael Vines 5ab70c4e97
genesis: rename mint account to faucet account and make it optional (#6990) 2019-11-15 14:50:26 -07:00
Dan Albert 946e937549
Create development vs softlaunch environment hooks into net scripts (#6974) 2019-11-15 15:18:45 -05:00
Pankaj Garg d565ec7968
Fixes to net-shaper, and net.sh option to start/stop shaper (#6981)
* Fixes to net-shaper, and net.sh option to start/stop shaper

* fix shellcheck

* more shellchecks
2019-11-15 12:10:48 -08:00
Sagar Dhawan 3ce6248f8c
Add CPU and RAM usage to Metrics (#6968)
* Add CPU usage to Metrics

* Add RAM usage and rename to system-stats

* Shellcheck

* Remove SC exception

* Address review comments
2019-11-14 20:36:34 -08:00
Dan Albert f27c11ccd8
Add Azure testnet to automation (#6911)
* Add Azure testnet to automation
2019-11-14 09:14:53 -05:00
Michael Vines f116cdeed9
Add validator catchup command (#6922) 2019-11-13 15:58:14 -07:00
Sunny Gleason 9246bee12b
feat: default 8gb hard memory limit for redis (#6913) 2019-11-13 11:09:20 -05:00
Dan Albert bb2fa9957a
Increase default AWS instance size to match GCE and Azure (#6773) 2019-11-12 12:27:59 -05:00
Dan Albert bb158a9b48
Add provider specific self destruct timeouts (#6894) 2019-11-12 12:21:24 -05:00
Ryo Onodera b971eeca4b
Add ryoqun to ssh authorized keys (#6860) 2019-11-11 17:12:24 +09:00
sakridge b14e61ff79
Filter any net/log* directory from rsync (#6857) 2019-11-09 13:38:17 -08:00
Michael Vines 68eafb3f30
Ensure config dir exists 2019-11-08 22:18:21 -07:00
Michael Vines 2649f6bdd6
Avoid excessive log/ relinking 2019-11-08 21:57:50 -07:00
Justin Starry 9807f47d4e
Rename genesis block to genesis config (#6816) 2019-11-08 23:56:57 -05:00
Michael Vines 9c00ad9ff2
Remove some low-hanging TODOs (#6839) 2019-11-08 16:41:36 -07:00
Michael Vines 151adab739
earlyoom now works on reboots (#6841) 2019-11-08 16:40:38 -07:00
Justin Starry 807af8670e
Clean up net logs (#6813) 2019-11-08 10:25:17 -05:00
Michael Vines f7b6e777bf
Revert "Clean up net/log symlinks (#6794)" (#6809)
This reverts commit 68353b7e57.
2019-11-07 22:15:45 -07:00
Justin Starry 68353b7e57
Clean up net/log symlinks (#6794) 2019-11-07 23:45:19 -05:00
Sagar Dhawan 20a52f153b Fix iftop not being stopped correctly (#6803)
automerge
2019-11-07 17:03:14 -08:00
Pankaj Garg 09e8124017 Tool to reconfigure netem on testnet (#6781)
automerge
2019-11-07 11:14:33 -08:00
Michael Vines 87ba66b6d0
Add net/ support for reusable identity keypairs (#6783) 2019-11-06 21:14:05 -07:00
Rob Walker a1fe6265fd
use pubkeys in genesis (#6750) 2019-11-06 11:18:25 -08:00
sakridge ec50c20400
Add time in net/logs path (#6701) 2019-11-06 10:43:12 -08:00
Trent Nelson a91bf296d7
Add some additional packages to DC installer scripts (#6755)
* Add 'cmake' to default DC node installer

* Add 'sysstat' to default DC node installer

For 'iostat'

* Add 'perf' to default DC node installer

* Add 'iftop' to default DC node installer
2019-11-06 09:48:45 -07:00
Pankaj Garg 8993b15248 Integrated use of netem with testnet scripts (#6746)
automerge
2019-11-05 15:04:06 -08:00
Michael Vines fba1af6ea9
ledger-tool can now load a ledger snapshot (#6729) 2019-11-04 22:14:55 -07:00
Sagar Dhawan 3133ee2401
Fix limited iftop output and failure to stop iftop (#6723)
* Fix limited iftop output and failure to stop iftop

* Shellcheck

* Ignore shellcheck
2019-11-04 18:12:07 -08:00
Trent Nelson d085c8626f GCE: Add instances self-destruct (#6363)
automerge
2019-11-04 10:30:26 -08:00
Dan Albert 7b6e3a23be
Add new pubkey to auth keys (#6687) 2019-11-01 14:44:10 -06:00
Dan Albert 1cc8956f74
Get Azure provider working again (#6659)
* Wait for node creation before continuing

* Programmatically set networking rules

* Add network security group to nodes upon creation

* shellcheck
2019-11-01 14:43:31 -06:00
TristanDebrunner e6c8bfd008
Add --use-move flag to cargo-install-all.sh and net/net.sh (#6670) 2019-11-01 07:53:30 -07:00
Michael Vines f131255066
Add ~/.cargo/bin to PATH (#6641) 2019-10-30 19:41:24 -07:00
Michael Vines 7bb224f54a Install ag on nodes (#6634)
automerge
2019-10-30 16:43:16 -07:00
Tyera Eulberg 4ec95043d7
Update sol:lamport ratio to base-10 (#6611)
* Update sol:lamport ratio

* Update various SOL quantities in bash scripts
2019-10-29 20:03:48 -06:00
Michael Vines d952b38f93
Ensure nofiles is not capped at 1024 on a node reboot 2019-10-28 23:21:34 -07:00
Michael Vines 1e2ab89b47
Ensure redis-server is started on a reboot 2019-10-28 20:58:46 -07:00
Dan Albert 9ee65009cd
Implement allowing validator boot failure into automation (#6589)
* Pass allow boot failures through create AND start

* Extend sleep timeout to all nodes

* Add 100 node testcase

* Reduce consistent sleep
2019-10-28 16:43:40 -06:00
Trent Nelson 96e209db49
Colo: Don't fail without a message (#6558) 2019-10-28 09:20:49 -06:00
Michael Vines 0c14ca58c7 Invoke on-reboot from cloud startup script to avoid racing with cron (#6579)
automerge
2019-10-27 10:56:16 -07:00
Pankaj Garg e174af7838 Use iftop to collect network bandwidth usage (#6560)
* Use iftop to collect network bandwidth usage

* fix shellcheck

* more shellchecks

* review comments
2019-10-26 00:06:46 -07:00
Michael Vines be74801236
Add NET_NUM_xyz variables 2019-10-25 23:00:14 -07:00
Michael Vines e966c96644
Disable sigverify on blockstreamer node
This node gets overloaded at high TPS trying to manage both a validator
and the blockexplorer.  Reduce its workload by turning off sigverify,
which doesn't really matter since this node doesn't even vote
2019-10-25 21:33:08 -07:00
Dan Albert a2a9d54985
Increase node start stagger (#6566) 2019-10-25 17:35:29 -06:00
Justin Starry ea2b26e5f5 Fix scp client mint keypair (#6565) 2019-10-25 16:23:52 -07:00
Michael Vines e103789994
Ignore exit code when the first mount fails 2019-10-25 10:11:32 -07:00
Michael Vines 90461245f9
Reduce TdS fees to 1 lamport per sig, and slots_per_epoch/2 (#6542) 2019-10-24 20:37:23 -07:00
Michael Vines 1c91c1e880
Remount /mnt/extra-disk on reboot 2019-10-24 20:14:26 -07:00
Dan Albert dadcb632d8
Specify machine type without necessarily enabling GPU (#6529)
* Specify machine type without necessarily enabling GPU

* Make long arg, extend --enable-gpu to automation

* Set machine types only in one place

* Fixup

* Fixup flag in automation

* Typo

* shellcheck
2019-10-24 15:12:25 -06:00
Michael Vines 2de2fbd5e3
Remove stray setup_secondary_mounts 2019-10-24 13:48:57 -07:00
Michael Vines 14eca5aea6
Remove setup_secondary_mount knowledge from multinode-demo/ (#6530) 2019-10-24 13:40:16 -07:00
Justin Starry 7a7abe692e
Add mint keypair to solana clients for convenience (#6536) 2019-10-24 14:31:06 -04:00
Justin Starry 88033bccbb
Add mint keypair to validators for convenience (#6531) 2019-10-24 12:50:32 -04:00
Michael Vines 35d6196384
Surface nvidia-smi errors in CI 2019-10-23 10:59:30 -07:00
Michael Vines 26b8747014
Exit cleanly for idle clients 2019-10-23 09:56:05 -07:00
Michael Vines bedb05bdeb
Plumb GEOLOCATION_API_KEY down to the blockexplorer (#6514) 2019-10-23 09:53:06 -07:00
Justin Starry 6829b8a6fb
Ensure solana commands are added to idle clients (#6513) 2019-10-23 11:15:00 -04:00
Michael Vines e462a7d1d5
net: Add ability to only start/stop client nodes (#6503)
* Add info --eval

* net: Add ability to start idle client nodes
2019-10-22 16:08:49 -07:00
Sagar Dhawan 4c515d0ef1
Sagar: Add ssh keys for colo (#6507) 2019-10-22 15:59:39 -07:00
Michael Vines f80a5b8c34
Remove some TODOs (#6488)
* Remove stale TODOs

* Ban TODO markers from markdown

* Scrub all TODOs from ci/ and book/
2019-10-21 22:25:06 -07:00
Greg Fitzgerald 3b9b9b1500 Rename remaining uses of fullnode to validator (#6476)
automerge
2019-10-21 20:21:21 -07:00
Dan Albert 00809a67c0
Push perf test results to slack app (#6371)
* Add script to publish testnet results to slack

* Obscure webhook URL

* fixup

* Replace read with cat redirection

* Turn back on net restart

* Pick nits

* Make symlink before trying to delete its contents

* Display test config in slack and pick Trents nit not to maybe rm -rf /*

* Clean up results print

* Minor nits

* Turn the test settings back up to 11

* typo

* Shellcheck

* Just a few more fields

* fix payload formatting

* Del clear-config.sh

* Mount secondary

* Add commit SHA link and Grafana time range URL

* Add fancy buttons instead of text URLs

* Tighten up test config display

* Fixup display nits

* chellsheck

* Rebase and fix typo
2019-10-21 20:00:17 -04:00
Michael Vines 3fb70b8d47
Ban XXX, TBD, FIXME comments (#6486) 2019-10-21 16:43:11 -07:00
Trent Nelson 564c14a2c6 net.sh: Ensure external disk link is setup before cleaning config dir (#6481)
automerge
2019-10-21 15:38:58 -07:00
sakridge 6996f45d54 Print machine hostname in log (#6480)
automerge
2019-10-21 14:59:03 -07:00
sakridge b1c2c6009e Exclude net/log in rsync script (#6475)
automerge
2019-10-21 14:06:36 -07:00
Trent Nelson 934f69b660 Colo verbosity (#6473)
automerge
2019-10-21 13:49:12 -07:00
Sunny Gleason 951e1f8b48 feat: grant access to sunny@ (#6471) 2019-10-21 11:17:06 -07:00
Greg Fitzgerald 9232057e95
Rename replicator to archiver (#6464)
* Rename replicator to archiver

* cargo fmt

* Fix grammar
2019-10-21 11:29:37 -06:00
Trent Nelson 0fc3c7eee2 Bump Trent's keys... (#6445)
automerge
2019-10-18 15:42:50 -07:00
Michael Vines 6f58bdfcb1 Remove validator sanity check (#6435)
automerge
2019-10-18 08:26:08 -07:00