629 Commits

Author SHA1 Message Date
sakridge 2c63cf3cbd
Add curie pubkey to authorized keys (#8473)
automerge
2020-02-26 10:27:37 -08:00
Dan Albert 28b115497f
Update setup-dc-node-1.sh 2020-02-13 14:30:41 -07:00
Michael Vines c4fd81fc1c The getConfirmedBlock RPC API is now disabled by default
The --enable-rpc-get-confirmed-block flag allows validators to opt-in to
the higher disk usage and IOPS.
2020-02-11 22:24:08 -07:00
Michael Vines d6b3961530
s/mint/faucet 2020-01-31 12:14:53 -07:00
Dan Albert eff876881b
Remove asteroids and pacman from QA/dev testnet availability (#8050)
automerge
2020-01-31 10:26:25 -08:00
Justin Starry 9adf0d4ee0 Don't exit early if additional validators not found during gce.sh config 2020-01-31 08:34:10 -07:00
Dan Albert 2f34f433b3 Specify where VM images are coming from across GCE projects (#7985)
automerge
2020-01-27 08:17:21 -08:00
Michael Vines 989355e885 Add ability to hard fork at any slot (#7801)
automerge
2020-01-24 17:27:04 -08:00
Dan Albert 7587656cf6
Implement automated partition testing (#7222) 2020-01-22 13:46:50 -05:00
Greg Fitzgerald 3aabeb2b81
Rename bootstrap leader (#7906)
* Rename bootstrap leader to bootstrap validator

It's a normal validator as soon as other validators enter the
leader schedule.

* cargo fmt

* Fix build

Thanks @CriesofCarrots!
2020-01-22 09:22:09 -07:00
Michael Vines 356f246a74 Remove get-/show- prefix from cli commands 2020-01-21 08:43:07 -07:00
sakridge b7b68ecdba
Add partition testing documentation (#7739) 2020-01-10 15:32:43 -08:00
Michael Vines 447fe48d2a
Revert "Add a stand-alone gossip node on the blockstreamer instance"
This reverts commit a217920561.

This commit is causing trouble when the TdS cluster is reset and
validators running an older genesis config are still present.
Occasionally an RPC URL from an older validator will be selected,
causing a new node to fail to boot.
2020-01-04 16:42:12 -07:00
Michael Vines a217920561 Add a stand-alone gossip node on the blockstreamer instance
The blockstreamer instance is the TdS cluster entrypoint.  Running an
additional solana-gossip node allows other participants to join a
cluster even if the validator node on the blockstreamer instance goes down.
2020-01-02 17:20:59 -07:00
sakridge 6f7d0c6928
Move cleanup to a script so it doesn't kill itself (#7603) 2019-12-23 14:31:57 -08:00
Justin Starry 9bd5888f5e
Fix broken internal-nodes-stake-lamports arg in scripts (#7581) 2019-12-19 21:38:03 -05:00
Trent Nelson 554188e88e
Colo - Node install scripts missing latest user requests (#7540)
* Enable user GPU profiling while installing CUDA

* Install heaptrack
2019-12-17 19:00:12 -05:00
Dan Albert 8176470b7f
Add pubkey from new buildkite agent instance 2019-12-17 18:00:15 -05:00
Tyera Eulberg 3513f4ee84
Rename drone to faucet (#7508) 2019-12-16 14:05:17 -07:00
Dan Albert 9ac112104c
Search across command line for pattern to kill (#7475) 2019-12-13 21:08:41 -05:00
Trent Nelson 42f2b14a74 Colo: Fix lockfile syntax (#7432)
Logical AND for [ is -a; for [[ it is &&.
2019-12-11 15:32:38 -05:00
Dan Albert 12d471e2da
Update default node balance to 500 SOL and default stake to 1 SOL (#7411) 2019-12-10 17:52:35 -05:00
anatoly yakovenko 96c08cd295 add pubkey for colo (#7409)
automerge
2019-12-10 14:18:57 -08:00
Rob Walker f3633a2e04
rent for testnet (#7407) 2019-12-10 13:51:19 -08:00
Dan Albert f4a089cc26
Allow delay between validator booting and client start (#7297)
* Allow delay between validator booting and client start
2019-12-05 21:03:26 -05:00
Pankaj Garg 9d7a926a8b
Tune UDP rmem/wmem using sys-tuner daemon (#7273) 2019-12-04 15:17:24 -08:00
Pankaj Garg 75d505c431 Don't hardcode username in sys-tuner (#7234)
automerge
2019-12-04 11:39:26 -08:00
Pankaj Garg d357192025
Fix ssh connection error due to too many authentication failures (#7229) 2019-12-03 15:53:12 -08:00
Pankaj Garg 076e384bb5
Tool to tune system parameters like PoH service priority (#7155)
* New daemon to tune system parameters like PoH service priority

* fixes for Linux

* integrate with poh_service

* fixes

* address review comments

* remove `dead_code` directive
2019-12-02 16:46:46 -08:00
Justin Starry eaa3e87eb0 Support passphrases in keygen (#7134)
* Support passphrases in keygen

* remove short

* Update solana_keygen calls
2019-11-25 21:33:15 -07:00
Trent Nelson d8bc828839
Colo: Refactor remote command dispatch for create and delete (#7092)
* Colo: Dump escaping mess in remote script templates

* Colo: Rename script templates so shellcheck can get 'em

* shellcheck and nits

* Brace all of the things

* Consistent heredoc tags

* Use bash built-in square bracketing consistently

* simplify logic
2019-11-25 10:32:17 -07:00
Justin Starry b8cd0a1bc0
Allow secure keypair input for `solana-archiver` and `solana` cli tools (#7106)
* Add seed phrase keypair recover to archiver

* Add seed phrase keypair to cli with ASK keyword

* cli main tweaks
2019-11-23 11:55:43 -05:00
Justin Starry ce8d37984d
Allow secure keypair input for solana-validator cli (#7080)
* Allow secure keypair input for solana-validator cli

* feedback

* Add --skip-mnemonic-validation

* Update --identity to --identity-keypair

* Use struct instead of tuple

* Fix dependencies

* cargo fmt

* Add basic tests

* Use `seed phrase` instead of `mnemonic`

* Update passphrase prompt
2019-11-22 10:20:40 -05:00
Michael Vines ee6b11d36d Remove ability to deploy custom programs (#7070)
automerge
2019-11-20 15:37:42 -08:00
Michael Vines b0271394cd
Clean up --gossip-port argument (#7067)
--gossip-port now specifies exactly that, the gossip port to use.  The
new --gossip-host argument can be used to specify the DNS name/IP
address for gossip if --entrypoint is not supplied (when --entrypoint is
supplied, the gossip address is automatically set to the node's ip
address as observed by the entrypoint)
2019-11-20 15:21:34 -07:00
Justin Starry 95c137158f Fix gce.sh info (#7054)
automerge
2019-11-19 17:49:25 -08:00
Dan Albert 2d7c7b0982
Fix missed rebase on net.sh (#7037) 2019-11-19 10:22:30 -05:00
Pankaj Garg 955aaef2e6
Fixes to net-shaper and net.sh (#7002)
* Fixes to net-shaper and net.sh

* fixes to default filters and cleanup
2019-11-18 11:33:33 -08:00
Dan Albert 6e04a646ba
Gossip entrypoint is now option of spy not solana-gossip (#7006) 2019-11-17 11:36:24 -05:00
Michael Vines 5ab70c4e97
genesis: rename mint account to faucet account and make it optional (#6990) 2019-11-15 14:50:26 -07:00
Dan Albert 946e937549
Create development vs softlaunch environment hooks into net scripts (#6974) 2019-11-15 15:18:45 -05:00
Pankaj Garg d565ec7968
Fixes to net-shaper, and net.sh option to start/stop shaper (#6981)
* Fixes to net-shaper, and net.sh option to start/stop shaper

* fix shellcheck

* more shellchecks
2019-11-15 12:10:48 -08:00
Sagar Dhawan 3ce6248f8c
Add CPU and RAM usage to Metrics (#6968)
* Add CPU usage to Metrics

* Add RAM usage and rename to system-stats

* Shellcheck

* Remove SC exception

* Address review comments
2019-11-14 20:36:34 -08:00
Dan Albert f27c11ccd8
Add Azure testnet to automation (#6911)
* Add Azure testnet to automation
2019-11-14 09:14:53 -05:00
Michael Vines f116cdeed9
Add validator catchup command (#6922) 2019-11-13 15:58:14 -07:00
Sunny Gleason 9246bee12b
feat: default 8gb hard memory limit for redis (#6913) 2019-11-13 11:09:20 -05:00
Dan Albert bb2fa9957a
Increase default AWS instance size to match GCE and Azure (#6773) 2019-11-12 12:27:59 -05:00
Dan Albert bb158a9b48
Add provider specific self destruct timeouts (#6894) 2019-11-12 12:21:24 -05:00
Ryo Onodera b971eeca4b
Add ryoqun to ssh authorized keys (#6860) 2019-11-11 17:12:24 +09:00
sakridge b14e61ff79
Filter any net/log* directory from rsync (#6857) 2019-11-09 13:38:17 -08:00
Michael Vines 68eafb3f30
Ensure config dir exists 2019-11-08 22:18:21 -07:00
Michael Vines 2649f6bdd6
Avoid excessive log/ relinking 2019-11-08 21:57:50 -07:00
Justin Starry 9807f47d4e
Rename genesis block to genesis config (#6816) 2019-11-08 23:56:57 -05:00
Michael Vines 9c00ad9ff2
Remove some low-hanging TODOs (#6839) 2019-11-08 16:41:36 -07:00
Michael Vines 151adab739
earlyoom now works on reboots (#6841) 2019-11-08 16:40:38 -07:00
Justin Starry 807af8670e
Clean up net logs (#6813) 2019-11-08 10:25:17 -05:00
Michael Vines f7b6e777bf
Revert "Clean up net/log symlinks (#6794)" (#6809)
This reverts commit 68353b7e57.
2019-11-07 22:15:45 -07:00
Justin Starry 68353b7e57
Clean up net/log symlinks (#6794) 2019-11-07 23:45:19 -05:00
Sagar Dhawan 20a52f153b Fix iftop not being stopped correctly (#6803)
automerge
2019-11-07 17:03:14 -08:00
Pankaj Garg 09e8124017 Tool to reconfigure netem on testnet (#6781)
automerge
2019-11-07 11:14:33 -08:00
Michael Vines 87ba66b6d0
Add net/ support for reusable identity keypairs (#6783) 2019-11-06 21:14:05 -07:00
Rob Walker a1fe6265fd
use pubkeys in genesis (#6750) 2019-11-06 11:18:25 -08:00
sakridge ec50c20400
Add time in net/logs path (#6701) 2019-11-06 10:43:12 -08:00
Trent Nelson a91bf296d7
Add some additional packages to DC installer scripts (#6755)
* Add 'cmake' to default DC node installer

* Add 'sysstat' to default DC node installer

For 'iostat'

* Add 'perf' to default DC node installer

* Add 'iftop' to default DC node installer
2019-11-06 09:48:45 -07:00
Pankaj Garg 8993b15248 Integrated use of netem with testnet scripts (#6746)
automerge
2019-11-05 15:04:06 -08:00
Michael Vines fba1af6ea9
ledger-tool can now load a ledger snapshot (#6729) 2019-11-04 22:14:55 -07:00
Sagar Dhawan 3133ee2401
Fix limited iftop output and failure to stop iftop (#6723)
* Fix limited iftop output and failure to stop iftop

* Shellcheck

* Ignore shellcheck
2019-11-04 18:12:07 -08:00
Trent Nelson d085c8626f GCE: Add instances self-destruct (#6363)
automerge
2019-11-04 10:30:26 -08:00
Dan Albert 7b6e3a23be
Add new pubkey to auth keys (#6687) 2019-11-01 14:44:10 -06:00
Dan Albert 1cc8956f74
Get Azure provider working again (#6659)
* Wait for node creation before continuing

* Programmatically set networking rules

* Add network security group to nodes upon creation

* shellcheck
2019-11-01 14:43:31 -06:00
TristanDebrunner e6c8bfd008
Add --use-move flag to cargo-install-all.sh and net/net.sh (#6670) 2019-11-01 07:53:30 -07:00
Michael Vines f131255066
Add ~/.cargo/bin to PATH (#6641) 2019-10-30 19:41:24 -07:00
Michael Vines 7bb224f54a Install ag on nodes (#6634)
automerge
2019-10-30 16:43:16 -07:00
Tyera Eulberg 4ec95043d7
Update sol:lamport ratio to base-10 (#6611)
* Update sol:lamport ratio

* Update various SOL quantities in bash scripts
2019-10-29 20:03:48 -06:00
Michael Vines d952b38f93
Ensure nofiles is not capped at 1024 on a node reboot 2019-10-28 23:21:34 -07:00
Michael Vines 1e2ab89b47
Ensure redis-server is started on a reboot 2019-10-28 20:58:46 -07:00
Dan Albert 9ee65009cd
Implement allowing validator boot failure into automation (#6589)
* Pass allow boot failures through create AND start

* Extend sleep timeout to all nodes

* Add 100 node testcase

* Reduce consistent sleep
2019-10-28 16:43:40 -06:00
Trent Nelson 96e209db49
Colo: Don't fail without a message (#6558) 2019-10-28 09:20:49 -06:00
Michael Vines 0c14ca58c7 Invoke on-reboot from cloud startup script to avoid racing with cron (#6579)
automerge
2019-10-27 10:56:16 -07:00
Pankaj Garg e174af7838 Use iftop to collect network bandwidth usage (#6560)
* Use iftop to collect network bandwidth usage

* fix shellcheck

* more shellchecks

* review comments
2019-10-26 00:06:46 -07:00
Michael Vines be74801236
Add NET_NUM_xyz variables 2019-10-25 23:00:14 -07:00
Michael Vines e966c96644
Disable sigverify on blockstreamer node
This node gets overloaded at high TPS trying to manage both a validator
and the blockexplorer.  Reduce its workload by turning off sigverify,
which doesn't really matter since this node doesn't even vote.
2019-10-25 21:33:08 -07:00
Dan Albert a2a9d54985
Increase node start stagger (#6566) 2019-10-25 17:35:29 -06:00
Justin Starry ea2b26e5f5 Fix scp client mint keypair (#6565) 2019-10-25 16:23:52 -07:00
Michael Vines e103789994
Ignore exit code when the first mount fails 2019-10-25 10:11:32 -07:00
Michael Vines 90461245f9
Reduce TdS fees to 1 lamport per sig, and slots_per_epoch/2 (#6542) 2019-10-24 20:37:23 -07:00
Michael Vines 1c91c1e880
Remount /mnt/extra-disk on reboot 2019-10-24 20:14:26 -07:00
Dan Albert dadcb632d8
Specify machine type without necessarily enabling GPU (#6529)
* Specify machine type without necessarily enabling GPU

* Make long arg, extend --enable-gpu to automation

* Set machine types only in one place

* Fixup

* Fixup flag in automation

* Typo

* shellcheck
2019-10-24 15:12:25 -06:00
Michael Vines 2de2fbd5e3
Remove stray setup_secondary_mounts 2019-10-24 13:48:57 -07:00
Michael Vines 14eca5aea6
Remove setup_secondary_mount knowledge from multinode-demo/ (#6530) 2019-10-24 13:40:16 -07:00
Justin Starry 7a7abe692e
Add mint keypair to solana clients for convenience (#6536) 2019-10-24 14:31:06 -04:00
Justin Starry 88033bccbb
Add mint keypair to validators for convenience (#6531) 2019-10-24 12:50:32 -04:00
Michael Vines 35d6196384
Surface nvidia-smi errors in CI 2019-10-23 10:59:30 -07:00
Michael Vines 26b8747014
Exit cleanly for idle clients 2019-10-23 09:56:05 -07:00
Michael Vines bedb05bdeb
Plumb GEOLOCATION_API_KEY down to the blockexplorer (#6514) 2019-10-23 09:53:06 -07:00
Justin Starry 6829b8a6fb
Ensure solana commands are added to idle clients (#6513) 2019-10-23 11:15:00 -04:00
Michael Vines e462a7d1d5
net: Add ability to only start/stop client nodes (#6503)
* Add info --eval

* net: Add ability to start idle client nodes
2019-10-22 16:08:49 -07:00
Sagar Dhawan 4c515d0ef1
Sagar: Add ssh keys for colo (#6507) 2019-10-22 15:59:39 -07:00
Michael Vines f80a5b8c34
Remove some TODOs (#6488)
* Remove stale TODOs

* Ban TODO markers from markdown

* Scrub all TODOs from ci/ and book/
2019-10-21 22:25:06 -07:00
Greg Fitzgerald 3b9b9b1500 Rename remaining uses of fullnode to validator (#6476)
automerge
2019-10-21 20:21:21 -07:00
Dan Albert 00809a67c0
Push perf test results to slack app (#6371)
* Add script to publish testnet results to slack

* Obscure webhook URL

* fixup

* Replace read with cat redirection

* Turn back on net restart

* Pick nits

* Make symlink before trying to delete its contents

* Display test config in slack and pick Trent's nit not to maybe rm -rf /*

* Clean up results print

* Minor nits

* Turn the test settings back up to 11

* typo

* Shellcheck

* Just a few more fields

* fix payload formatting

* Del clear-config.sh

* Mount secondary

* Add commit SHA link and Grafana time range URL

* Add fancy buttons instead of text URLs

* Tighten up test config display

* Fixup display nits

* chellsheck

* Rebase and fix typo
2019-10-21 20:00:17 -04:00
Michael Vines 3fb70b8d47
Ban XXX, TBD, FIXME comments (#6486) 2019-10-21 16:43:11 -07:00
Trent Nelson 564c14a2c6 net.sh: Ensure external disk link is setup before cleaning config dir (#6481)
automerge
2019-10-21 15:38:58 -07:00
sakridge 6996f45d54 Print machine hostname in log (#6480)
automerge
2019-10-21 14:59:03 -07:00
sakridge b1c2c6009e Exclude net/log in rsync script (#6475)
automerge
2019-10-21 14:06:36 -07:00
Trent Nelson 934f69b660 Colo verbosity (#6473)
automerge
2019-10-21 13:49:12 -07:00
Sunny Gleason 951e1f8b48 feat: grant access to sunny@ (#6471) 2019-10-21 11:17:06 -07:00
Greg Fitzgerald 9232057e95
Rename replicator to archiver (#6464)
* Rename replicator to archiver

* cargo fmt

* Fix grammar
2019-10-21 11:29:37 -06:00
Trent Nelson 0fc3c7eee2 Bump Trent's keys... (#6445)
automerge
2019-10-18 15:42:50 -07:00
Michael Vines 6f58bdfcb1 Remove validator sanity check (#6435)
automerge
2019-10-18 08:26:08 -07:00
Pankaj Garg 854c62e208 Reduce kernel networking buffer for rmem and wmem (#6422)
automerge
2019-10-17 14:52:24 -07:00
Trent Nelson 1759968c1e Colo: Put NVMe disks to use (#6357)
automerge
2019-10-17 14:44:45 -07:00
Dan Albert b4ed88e0f7
Fail faster on boot up (#6412) 2019-10-17 12:26:12 -04:00
Michael Vines 2d351d3952
Prevent ping stats header from confusing buildkite log folding 2019-10-16 13:36:16 -07:00
Michael Vines 605b477e06
Permit finding more nodes than expected (./gce.sh config) 2019-10-16 13:21:00 -07:00
Michael Vines b7af5f08d6
Avoid more non-standard ping. macOS 💔 2019-10-16 10:35:41 -07:00
Michael Vines 781dfd9dc4
Drop non-standard ping -o option 2019-10-16 10:05:46 -07:00
Michael Vines 9267931ef6 Add support for preemptible GCP instances 2019-10-16 08:10:31 -07:00
Michael Vines 37a29b979f
--force 2019-10-15 15:12:25 -07:00
Michael Vines d89174ee82
Default to no client nodes to avoid unnecessary cost 2019-10-15 14:37:52 -07:00
Michael Vines 8bc9d8988f
- 2019-10-15 07:58:40 -07:00
Michael Vines f7279804b4
Ensure solana-cli has a keypair 2019-10-15 07:47:45 -07:00
Michael Vines 169b772398 Show validators during net sanity 2019-10-14 20:38:51 -07:00
Trent Nelson b75438ff32 gce.sh: Unwind allocation upon failure (#6343)
automerge
2019-10-14 09:36:20 -07:00
Trent Nelson 82fea9ce73 net.sh: Add support for selecting validator GPU mode (#6326)
automerge
2019-10-14 09:33:32 -07:00
Greg Fitzgerald 322fcea6e5
More fullnode to validator renaming (#6337) 2019-10-11 13:30:52 -06:00
Trent Nelson fa64a0b367 gce.sh: Be strict about fullnode count w/o --allow-boot-failures (#6321)
automerge
2019-10-10 17:13:59 -07:00
Trent Nelson 81fb9e6a59 gce.sh: Rename -f flag to better reflect usage (#6318)
automerge
2019-10-10 12:57:03 -07:00
Trent Nelson 4713cb8675 Colo: Prefer public IPs, part 2 (#6297)
automerge
2019-10-09 15:17:24 -07:00
Trent Nelson fdaee4ab17
Colo: Add running process cleanup to delete logic (#6281) 2019-10-09 15:49:33 -06:00
Justin Starry 95d15dc720
Add jstarry to authorized keys (#6293) 2019-10-09 15:04:44 -04:00
Trent Nelson 667f9e0d79 Colo: Factor out inlined scripts to own files (#6266)
automerge
2019-10-07 22:05:36 -07:00
Trent Nelson 57916f8be6 Colo: Prefer public IPs (#6264)
automerge
2019-10-07 20:44:57 -07:00
Michael Vines 18653b825b
Preserve previous fullnode log file on restart 2019-10-04 07:58:33 -07:00
Pankaj Garg a05d772aa9
Add colo access pubkey (#6232)
* Add colo access pubkey

* Change the key to ed25519
2019-10-03 19:55:39 -07:00
Dan Albert 58139ce5ae
Add buildkite-agent key for colo access (#6205) 2019-10-01 13:24:04 -07:00
Michael Vines 8e888059d8
Use built-in solana-gossip timeout for better error messages (#6189) 2019-10-01 12:30:11 -07:00
Dan Albert db18611c86
Add ability to manually create a db (#6151) 2019-09-27 12:03:20 -07:00
sakridge f97d33e3a7
Add sakridge pubkey (#6142) 2019-09-27 10:55:38 -07:00
sakridge 06b445ac07
Skip if --custom-cpu is used as well. (#6130) 2019-09-26 15:52:03 -07:00
Michael Vines b4da83a3ab
Remove CUDA feature (#6094) 2019-09-26 13:36:51 -07:00
Trent Nelson c4ed80d544 colo-utils: Disable StrictHostKeyChecking for SSH calls (#6117)
automerge
2019-09-26 11:22:07 -07:00
Dan Albert 93ad637c5c
typo 2019-09-25 16:58:53 -04:00
Trent Nelson 02647c25a9 net: Add Trent's work laptop pubkey (#6022)
automerge
2019-09-23 10:25:36 -07:00
Michael Vines 4c49566a89
Enable nvidia persistence mode on instance reboots 2019-09-21 10:45:20 -07:00
Michael Vines 8bbc8343ff
Place version.yml in the right location 2019-09-19 22:41:27 -07:00
Trent Nelson 2636a9c9f1 Add script for managing colo resources à la gce.sh (#5854)
automerge
2019-09-19 14:08:22 -07:00
Trent Nelson 4c54245969 net/gce.sh: Sync cloud_CreateInstances docs and usage (#5982)
automerge
2019-09-19 13:28:25 -07:00
Sunny Gleason 51b3451e20 feat: use redis version 5+ via ppa:chris-lea (#5981) 2019-09-19 12:04:06 -07:00
Michael Vines fee5c6c057
testnet-edge/testnet-beta now update while preserving the ledger (#5979)
* Check if an update is current before deploying it again

* Add (new) update command to deploy testnet updates

* Add --deploy-if-newer flag to permit conditional net updates
2019-09-19 12:03:47 -07:00