# Network Management
This directory contains scripts useful for working with a test network. It's
intended to be both dev and CD friendly.
### User Account Prerequisites
GCP, AWS, and colo are supported.
#### GCP
First, authenticate with:
```bash
$ gcloud auth login
```
#### AWS
Obtain your credentials from the AWS IAM Console and configure the AWS CLI with:
```bash
$ aws configure
```
More information on AWS CLI configuration can be found in the [AWS CLI documentation](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html#cli-quick-configuration).
### Metrics configuration (Optional)
Ensure that `$(whoami)` is the name of an InfluxDB user account with enough
access to create a new InfluxDB database. Ask mvines@ for help if needed.
## Quick Start
NOTE: This example uses GCE. If you are using AWS EC2, replace `./gce.sh` with
`./ec2.sh` in the commands.
```bash
$ cd net/
$ ./gce.sh create -n 5 -c 1 #<-- Create a GCE testnet with 5 additional nodes (beyond the bootstrap node) and 1 client (billing starts here)
$ ./init-metrics.sh $(whoami) #<-- Recreate a metrics database for the testnet and configure credentials
$ ./net.sh start #<-- Deploy the network from the local workspace and start processes on all nodes including bench-tps on the client node
$ ./ssh.sh #<-- Show help for ssh-ing into any testnet node to access logs/etc
$ ./net.sh stop #<-- Stop running processes on all nodes
$ ./gce.sh delete #<-- Dispose of the network (billing stops here)
```
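Because billing runs from `create` until `delete`, it can help to wrap the sequence so that teardown is attempted even when an intermediate step fails. The following is a minimal sketch and not part of this directory: the `run` and `with_testnet` helpers and the `DRY_RUN` variable are hypothetical, and only the commands already shown above are used.

```bash
# Hypothetical wrapper around the Quick Start commands; run it from net/.
# DRY_RUN is an illustrative variable, not something the net/ scripts read.
DRY_RUN=${DRY_RUN:-1}   # defaults to printing commands; set DRY_RUN=0 to execute

run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Usage: with_testnet <command> [args...]
# Brings the testnet up, runs the given command, then always tears down.
with_testnet() {
  local rc=0
  {
    run ./gce.sh create -n 5 -c 1 &&
    run ./init-metrics.sh "$(whoami)" &&
    run ./net.sh start &&
    "$@"
  } || rc=$?
  # Teardown is attempted even if an earlier step failed, so billing stops.
  run ./net.sh stop || true
  run ./gce.sh delete || rc=$?
  return "$rc"
}
```

For example, `with_testnet sleep 600` would print the commands it would run (or, with `DRY_RUN=0`, keep the network up for ten minutes) before stopping and deleting it. On AWS EC2, `./gce.sh` would be replaced with `./ec2.sh` as noted above.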
## Tips
### Running the network over public IP addresses
By default, private IP addresses are used, with all instances in the same
availability zone, to avoid GCE network egress charges. However, to run the
network over public IP addresses:
```bash
$ ./gce.sh create -P ...
```
or
```bash
$ ./ec2.sh create -P ...
```
### Deploying a tarball-based network
To deploy the latest pre-built `edge` channel tarball (i.e., the latest from the `master`
branch), run the following once the testnet has been created:
```bash
$ ./net.sh start -t edge
```
### Enabling CUDA
First ensure the network instances are created with GPU enabled:
```bash
$ ./gce.sh create -g ...
```
or
```bash
$ ./ec2.sh create -g ...
```
If deploying a tarball-based network, nothing further is required, as GPU presence
is detected at runtime and the CUDA build is auto-selected.
### Partition testing
To induce a partition, run `net.sh netem --config-file <config file path>`.
To remove the partition, run `net.sh netem --config-file <config file path> --netem-cmd cleanup`.
The partitioning is also removed by `net.sh stop` or `net.sh restart`.
An example config that produces 3 almost equal partitions:
```
{
  "partitions":[
    34,
    33,
    33
  ],
  "interconnects":[
    {
      "a":0,
      "b":1,
      "config":"loss 15% delay 25ms"
    },
    {
      "a":1,
      "b":0,
      "config":"loss 15% delay 25ms"
    },
    {
      "a":0,
      "b":2,
      "config":"loss 10% delay 15ms"
    },
    {
      "a":2,
      "b":0,
      "config":"loss 10% delay 15ms"
    },
    {
      "a":2,
      "b":1,
      "config":"loss 5% delay 5ms"
    },
    {
      "a":1,
      "b":2,
      "config":"loss 5% delay 5ms"
    }
  ]
}
```
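A simpler two-partition variant of the config above can be generated from a script. This is a sketch under assumptions inferred only from the example, not from a documented schema: that the `partitions` entries are percentages of the nodes, and that `a` and `b` are zero-based indices into that list, with one entry per direction. The function name `write_two_way_partition_config` is hypothetical.

```bash
# Hypothetical helper: writes a two-partition netem config to the given path.
# Assumes "partitions" are node percentages and "a"/"b" index into that list,
# as the three-partition example suggests.
# Usage: write_two_way_partition_config <output file> [netem settings]
write_two_way_partition_config() {
  local config_file="$1"
  local netem="${2:-loss 15% delay 25ms}"
  cat > "$config_file" <<EOF
{
  "partitions": [50, 50],
  "interconnects": [
    { "a": 0, "b": 1, "config": "$netem" },
    { "a": 1, "b": 0, "config": "$netem" }
  ]
}
EOF
}
```

After `write_two_way_partition_config partition.json`, the file could be applied with `./net.sh netem --config-file partition.json` and later removed with `./net.sh netem --config-file partition.json --netem-cmd cleanup`, as described above.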