Pankaj Garg
e174af7838
Use iftop to collect network bandwidth usage ( #6560 )
...
* Use iftop to collect network bandwidth usage
* fix shellcheck
* more shellchecks
* review comments
2019-10-26 00:06:46 -07:00
Michael Vines
e103789994
Ignore exit code when the first mount fails
2019-10-25 10:11:32 -07:00
Michael Vines
1c91c1e880
Remount /mnt/extra-disk on reboot
2019-10-24 20:14:26 -07:00
Michael Vines
35d6196384
Surface nvidia-smi errors in CI
2019-10-23 10:59:30 -07:00
Sagar Dhawan
4c515d0ef1
Sagar: Add ssh keys for colo ( #6507 )
2019-10-22 15:59:39 -07:00
Michael Vines
f80a5b8c34
Remove some TODOs ( #6488 )
...
* Remove stale TODOs
* Ban TODO markers from markdown
* Scrub all TODOs from ci/ and book/
2019-10-21 22:25:06 -07:00
Greg Fitzgerald
3b9b9b1500
Rename remaining uses of fullnode to validator ( #6476 )
...
automerge
2019-10-21 20:21:21 -07:00
Michael Vines
3fb70b8d47
Ban XXX, TBD, FIXME comments ( #6486 )
2019-10-21 16:43:11 -07:00
Trent Nelson
934f69b660
Colo verbosity ( #6473 )
...
automerge
2019-10-21 13:49:12 -07:00
Sunny Gleason
951e1f8b48
feat: grant access to sunny@ ( #6471 )
2019-10-21 11:17:06 -07:00
Trent Nelson
0fc3c7eee2
Bump Trent's keys... ( #6445 )
...
automerge
2019-10-18 15:42:50 -07:00
Pankaj Garg
854c62e208
Reduce kernel networking buffer for rmem and wmem ( #6422 )
...
automerge
2019-10-17 14:52:24 -07:00
Trent Nelson
1759968c1e
Colo: Put NVMe disks to use ( #6357 )
...
automerge
2019-10-17 14:44:45 -07:00
Michael Vines
9267931ef6
Add support for preemptible GCP instances
2019-10-16 08:10:31 -07:00
Greg Fitzgerald
322fcea6e5
More fullnode to validator renaming ( #6337 )
2019-10-11 13:30:52 -06:00
Trent Nelson
4713cb8675
Colo: Prefer public IPs, part 2 ( #6297 )
...
automerge
2019-10-09 15:17:24 -07:00
Trent Nelson
fdaee4ab17
Colo: Add running process cleanup to delete logic ( #6281 )
2019-10-09 15:49:33 -06:00
Justin Starry
95d15dc720
Add jstarry to authorized keys ( #6293 )
2019-10-09 15:04:44 -04:00
Trent Nelson
667f9e0d79
Colo: Factor out inlined scripts to own files ( #6266 )
...
automerge
2019-10-07 22:05:36 -07:00
Trent Nelson
57916f8be6
Colo: Prefer public IPs ( #6264 )
...
automerge
2019-10-07 20:44:57 -07:00
Pankaj Garg
a05d772aa9
Add colo access pubkey ( #6232 )
...
* Add colo access pubkey
* Change the key to ed25519
2019-10-03 19:55:39 -07:00
Dan Albert
58139ce5ae
Add buildkite-agent key for colo access ( #6205 )
2019-10-01 13:24:04 -07:00
sakridge
f97d33e3a7
Add sakridge pubkey ( #6142 )
2019-09-27 10:55:38 -07:00
Trent Nelson
c4ed80d544
colo-utils: Disable StrictHostKeyChecking for SSH calls ( #6117 )
...
automerge
2019-09-26 11:22:07 -07:00
Dan Albert
93ad637c5c
typo
2019-09-25 16:58:53 -04:00
Trent Nelson
02647c25a9
net: Add Trent's work laptop pubkey ( #6022 )
...
automerge
2019-09-23 10:25:36 -07:00
Trent Nelson
2636a9c9f1
Add script for managing colo resources ala gce.sh ( #5854 )
...
automerge
2019-09-19 14:08:22 -07:00
Trent Nelson
4c54245969
net/gce.sh: Sync cloud_CreateInstances docs and usage ( #5982 )
...
automerge
2019-09-19 13:28:25 -07:00
Sunny Gleason
51b3451e20
feat: use redis version 5+ via ppa:chris-lea ( #5981 )
2019-09-19 12:04:06 -07:00
Dan Albert
742562fc2e
Set maintenance policy to terminate and restart for GCE ( #5935 )
2019-09-18 10:38:38 -07:00
Michael Vines
92a5979558
net/config/ is now shellcheck compliant ( #5888 )
...
automerge
2019-09-12 16:11:13 -07:00
Michael Vines
fc4aa71193
GCE-based nodes now reboot on maintenance events instead of terminating ( #5861 )
2019-09-10 12:30:06 -07:00
Trent Nelson
8362b408d9
Move testnet ssh key ( #5770 )
...
* Factor out hardcoded testnet ssh key path
* Build/create test net ssh key path
* Rename testnet ssh dir
* Give testnetSSHDir a more generic name
* shellcheck
* favor hardcoded paths over `paths.sh`
* Put instance-startup-complete stamp in the scratch dir as well
* Rename `/solana` > `/solana-scratch`
2019-09-03 18:51:16 -06:00
Trent Nelson
36fcb4fbca
Add trent's workstation pubkey to authorized keys script ( #5748 )
...
automerge
2019-08-30 10:13:55 -07:00
Michael Vines
33e7e23484
Update ubuntu image
2019-08-29 14:40:08 -07:00
Michael Vines
1363841f32
Fix testnet deployment
2019-08-15 08:32:10 -07:00
TristanDebrunner
79416381dc
Add pubkey setup for datacenter nodes ( #5514 )
2019-08-14 14:25:56 -06:00
Michael Vines
6085109171
Delete terminated GCP instances ( #5490 )
...
automerge
2019-08-12 08:28:58 -07:00
Michael Vines
bd7e269280
Kill rsync ( #5336 )
...
automerge
2019-07-30 22:43:47 -07:00
Dan Albert
21cef2fe21
Do not attempt to create solana user multiple times ( #5228 )
...
* Do not attempt to create solana user multiple times
2019-07-22 16:13:08 -06:00
Jack May
4a02914b30
Add pub key authorized list
2019-07-12 12:34:17 -07:00
Dan Albert
f093377805
apt-get update before installing certbot ( #5054 )
...
* apt-get update before installing certbot
2019-07-12 11:50:40 -06:00
Dan Albert
e4861f52e0
Add support for additional disks for config-local ( #5030 )
...
* Add support for additional disks for config-local
* Restore wrongly deleted lines
* Shellcheck
* add args in the right place dummy
* Fix nits
* typo
* var naming cleanup
* Add stub function for remaining cloud providers
2019-07-11 16:23:32 -06:00
Michael Vines
0a949677f0
net/ plumbing to manage LetsEncrypt TLS certificates ( #4985 )
...
automerge
2019-07-09 15:45:46 -07:00
carllin
1033f52877
Add pubkey ( #4971 )
2019-07-09 00:54:22 -07:00
Sathish
96b56fa6f7
Update authorized public key ( #4783 )
2019-06-22 08:33:39 -07:00
Michael Vines
bd884a56bf
Install libssl1.1 better
2019-06-14 08:01:22 -07:00
carllin
73491e3ca1
bump libssl ( #4634 )
2019-06-10 18:03:13 -07:00
Michael Vines
471465a5f4
net/: Add solana-install test to sanity ( #4438 )
...
* Add instance creation date to motd
* Setup localtime
* Add solana-install test
2019-05-26 11:17:07 -07:00
Michael Vines
458ae3fdac
Switch to instances with AVX-512 if possible for better interop with dev machines ( #4328 )
...
automerge
2019-05-17 20:06:07 -07:00
Michael Vines
50f79e495e
net/ improvements ( #4257 )
...
automerge
2019-05-11 22:54:50 -07:00
Pankaj Garg
5719b8f251
Change remote node's ssh config to allow more login retries ( #4215 )
...
automerge
2019-05-08 11:20:06 -07:00
Michael Vines
950d8494ba
earlyoom: Stop using unsupported -k option ( #4096 )
...
automerge
2019-05-01 11:29:02 -07:00
Michael Vines
d21fa4a177
v0.14: various net/ fixes for large clusters ( #4080 )
...
* net.sh: Add -F to discard validator nodes that didn't bootup successfully
* Relax sanity node count when validator bootup failure is permitted
* Less sanity for testnet-demo
* net.sh: Add -F to discard validator nodes that didn't bootup successfully
2019-04-29 21:38:32 -07:00
Michael Vines
6f56501034
Correctly terminate instances across multiple zones
2019-04-28 09:09:02 -07:00
Dan Albert
d12705f9b0
Remove wait loops in non-GPU instance creation and add SSD option as default disk type ( #3992 )
2019-04-25 13:43:42 -06:00
Pankaj Garg
e867ce0944
Find unique zones and delete nodes in each zone ( #3978 )
2019-04-24 17:50:42 -07:00
Dan Albert
4e7e5ace9d
Add support for Azure instances in testnet creation ( #3905 )
...
* Add support for Azure instances in testnet creation
* Fixup
* Fix shellcheck errors
* More shellcheck and cleanup node creation and deletion
* More shellcheck and cleanup node creation and deletion
* Fixup instance wait API
* Fix review comments and add GPU installation extension
2019-04-23 16:41:45 -06:00
Pankaj Garg
d83a71d89f
More AWS regions for testnet deployment ( #3911 )
...
- also some minor fixes to gce.sh
2019-04-19 17:46:14 -07:00
Pankaj Garg
8999bfef65
Try to delete nodes in all cloud zones ( #3874 )
2019-04-18 13:16:14 -07:00
sakridge
684e1c73dd
Allow for custom cpu config on gce and use 20gb ram for clients ( #3856 )
2019-04-18 09:36:11 -07:00
Pankaj Garg
9cd555cad5
AWS script change for additional zones and regions
2019-04-04 15:59:59 -07:00
Pankaj Garg
15b945a652
Fix EC2 scripts for blockstream startup
2019-03-28 15:37:23 -07:00
Pankaj Garg
ed48c495a3
fix shell-check errors
2019-03-27 18:05:17 -07:00
Pankaj Garg
f0abd06a46
Added support for multi-region cloud testnet
2019-03-27 18:05:17 -07:00
Michael Vines
7498488f5f
cloud_DeleteInstances() now waits for the instances to be terminated
2019-03-14 21:15:00 -07:00
Michael Vines
5d27f221f7
Drop socat for iptables
2019-03-13 12:03:56 -05:00
Rob Walker
a799f8f4b1
tell blockexplorer to run on port 8080 ( #3237 )
...
* tell blockexplorer to run on port 8080
* forward port 80 to 5000 for a blockexplorer node
2019-03-12 13:39:09 -07:00
Michael Vines
a444cac2aa
Switch to upstream AMIs for non-CUDA EC2 testnets
2019-02-18 18:59:56 -08:00
Michael Vines
1e714eb6b2
Generate ec2 security group programmatically
2019-02-18 18:59:56 -08:00
Michael Vines
9eb8b67b5c
Install blockexplorer dependencies
2019-02-15 20:17:46 -08:00
Michael Vines
f5bbc5e961
Fix args
2018-12-23 20:56:13 -08:00
Michael Vines
753a783ba9
Add solana user to adm group for /var/log/syslog access
2018-12-23 17:28:35 -08:00
Pankaj Garg
41f8764232
Ignore error while enabling nvidia persistence mode ( #2265 )
2018-12-21 12:37:51 -08:00
Pankaj Garg
4bf797c8f1
Load nvidia drivers on node startup ( #2263 )
...
* Load nvidia drivers on node startup
* added new script to enable nvidia driver persistent mode
* remove set -ex
2018-12-21 11:43:52 -08:00
Michael Vines
b34e197424
Add newline at end of file
2018-12-06 17:46:46 -08:00
carllin
aecb06cd2a
Update versions in install-libssl-compatibility.sh ( #2044 )
2018-12-06 15:57:30 -08:00
Michael Vines
c5b1bc1128
Remove obsolete update-default-cuda.sh
2018-11-12 20:31:16 -08:00
Michael Vines
3466f139a4
set -e shuffling
2018-11-11 16:24:36 -08:00
Michael Vines
def7d156f6
codemod --extensions sh '#!/usr/bin/env bash -e' '#!/usr/bin/env bash\nset -e'
2018-11-11 16:24:36 -08:00
Michael Vines
33aab094ef
codemod --extensions sh '#!/bin/bash' '#!/usr/bin/env bash'
2018-11-11 16:24:36 -08:00
Michael Vines
49014393e1
Be less fancy for bash 4.4 compat
2018-11-10 18:05:55 -08:00
Michael Vines
818d03c835
Bump earlyoom version
2018-11-10 15:56:17 -08:00
Michael Vines
51ed48941b
Continue if docker0 is not present
2018-11-07 19:33:20 -08:00
Michael Vines
f8f11b7f50
Remove docker0 interface if present
2018-11-07 18:23:24 -08:00
Michael Vines
b02b636b36
Support local tarball deploys
2018-11-07 14:44:40 -08:00
Michael Vines
cd18a1b7db
t
2018-11-06 14:08:47 -08:00
Michael Vines
eae9372a5d
Upgrade GCP CPU-based testnet to 18.04
2018-11-04 19:18:47 -08:00
Pankaj Garg
3cc78d3a41
Added a new remote node configuration script to set rmem/wmem ( #1647 )
...
* Added a new remote node configuration script to set rmem/wmem
* Update common.sh for rmem/wmem configuration
2018-10-30 09:17:35 -07:00
Pankaj Garg
dfde83bdce
Wildcard early OOM deb package revision ( #1554 )
2018-10-19 14:17:19 -07:00
Michael Vines
b1e941cab9
Return all instances
2018-10-01 07:51:48 -07:00
sakridge
3199f174a3
Add option to pass boot disk type to gce create ( #1308 )
2018-09-22 16:43:47 -07:00
Michael Vines
155ee8792f
Add GPU support to ec2-provider
2018-09-17 09:26:25 -07:00
Michael Vines
f89f121d2b
Add AWS EC2 support
2018-09-17 09:26:25 -07:00
Michael Vines
ee74b367ce
Add docker install script
2018-09-12 17:09:37 -07:00
Michael Vines
ebcac3c2d1
Use a common solana user on all testnet instances
2018-09-08 22:34:26 -07:00
Michael Vines
1d6c4aacae
Retry rsync a couple times before failing
2018-09-08 13:59:45 -07:00
Michael Vines
9f5c86e60c
Install earlyoom at gce instance startup
2018-09-08 13:59:45 -07:00
Michael Vines
9f413fd656
Establish net/scripts/... for better scoping
2018-09-08 13:59:45 -07:00