open source v2
This commit is contained in: parent dc7673ba74, commit 35a7a33346
@ -127,3 +127,8 @@ dmypy.json
# Pyre type checker
.pyre/

deploy/id_rsa
*.retry
*.key
.idea/
README.md
@ -1 +1,55 @@
# Validators
## Motivation

This repository is meant to serve as an example of how to run a Solana validator.
It does not give specifics on the architecture of Solana, and should not be used as a substitute for Solana's documentation.
It is highly recommended to read [Solana's Documentation](https://docs.solana.com/running-validator) about running a validator.
This repository should be used in conjunction with Solana's guide. It provides practical,
real-world examples of cluster setup, and should act as a starting point for participating
in mainnet validation.

This repository gives two examples of potential validator setups. The first is a
single-node validator that can be used as an entry point for querying on-chain Solana data, or
for validating transactions.
The second is a cluster of Solana validators load balanced by an nginx server. Nginx
offers an active health check feature in its premium version; a configuration
for active health checks is also included.
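The health criterion behind those checks can be sketched as follows. This is a minimal sketch, assuming (as the health service in this repository does) that health is judged by comparing the local node's block height against a trusted validator's, with a default threshold of 15 blocks:

```python
# Minimal sketch of the health criterion (assumption: mirrors the
# repository's health-check service, which compares block heights).
UNHEALTHY_BLOCKHEIGHT_DIFF = 15  # default threshold used in this repo


def is_healthy(local_height: int, trusted_height: int,
               threshold: int = UNHEALTHY_BLOCKHEIGHT_DIFF) -> bool:
    # Being ahead of the trusted validator is not unhealthy.
    behind = max(0, trusted_height - local_height)
    return behind <= threshold
```

A load balancer can then route traffic only to nodes for which this predicate holds.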

The end goal of this guide is to have a Solana validator cluster running in a cloud
environment.

## Overview of setups

- run a single validator
- run a cluster of validators

## Running a single validator

#### Instance configuration

##### Choosing an instance type

Solana's documentation recommends choosing a node type with the highest number of cores possible ([see here](https://docs.solana.com/running-validator/validator-reqs)).
Additionally, the Solana mainnet utilizes GPUs to increase network throughput. Solana's documentation
recommends using Nvidia Turing or Volta family GPUs, which are available through most cloud providers.

This guide was tested on [Amazon AWS g4dn.16xlarge instances](https://aws.amazon.com/ec2/instance-types/g4/) running the
Ubuntu 18.04 Deep Learning AMI. These instances provide Nvidia T4 GPUs with a balance of high network
throughput and CPU resources.

##### Instance network configuration

After provisioning an instance, it is important to configure network whitelists to be compatible
with a validator's network usage. Solana nodes communicate via a gossip protocol. This protocol takes
place over a port range specified at validator startup. For this guide we will set that port range to
8000-8012. Be sure to whitelist network traffic on whichever port range you choose.

Validator RPC servers also bind to configurable ports. This guide will set RPC servers to use port 8899
for standard HTTP requests and 8900 for websocket connections.
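Once a validator is up, a quick sanity check is to send a JSON-RPC request such as `getEpochInfo` to the RPC port. A sketch of the payload (the endpoint URL assumes the port choices above):

```python
import json


def build_epoch_info_request() -> dict:
    # Standard Solana JSON-RPC payload; POST it to http://localhost:8899
    # (the RPC port chosen in this guide), e.g. with curl or requests.
    return {"jsonrpc": "2.0", "id": 1, "method": "getEpochInfo", "params": []}


print(json.dumps(build_epoch_info_request()))
```

The same `getEpochInfo` call is what the health-check service in this repository uses to read block heights.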

#### Setting up a single validator

Once an instance has been deployed and is accessible over SSH, we can use Ansible to run some basic setup
scripts. Ansible works by inspecting the contents of a `hosts.yaml` file, which defines the inventory of servers to which one can deploy.
To make our servers accessible to Ansible, add your server's network location to the validators block in `deploy/hosts.yaml`.
This will indicate that the specified server is part of the `validators` group, which will contain our validator machines.
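For example, with the two placeholder hostnames used throughout this repository, the validators block in `deploy/hosts.yaml` looks like:

```
validators:
  hosts:
    validator-1.test.net:
    validator-2.test.net:
```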
`deploy/setup.yaml` contains a set of common setup steps for configuring a server from the base OS image. You can run these
setup steps using

```
# run this from the /deploy directory
ansible-playbook -i hosts.yaml -l validators setup.yaml
```

## Running a cluster of validators
@ -0,0 +1,7 @@
[defaults]
inventory = ./hosts.yaml
forks = 100
interpreter_python = auto

[ssh_connection]
pipelining = True
@ -0,0 +1,5 @@
#!/bin/bash -e

pssh=$(which parallel-ssh || which pssh)

$pssh -h <(ansible all --list-hosts -i hosts.yaml | tail -n+2) -i -l ubuntu -- 'curl http://localhost:8899/health'
@ -0,0 +1,5 @@
# Increase cache size
cache-size=4096

# Cache negative replies even if they do not have TTLs
neg-ttl=10
@ -0,0 +1,82 @@
user www-data;
worker_processes auto;
pid /run/nginx.pid;
include /etc/nginx/modules-enabled/*.conf;
worker_rlimit_nofile 80000;

events {
    worker_connections 50000;
    # multi_accept on;
}

http {

    ##
    # Basic Settings
    ##

    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
    # server_tokens off;

    server_names_hash_bucket_size 128;
    # server_name_in_redirect off;

    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    client_max_body_size 100m;

    proxy_busy_buffers_size 32k;
    proxy_buffers 128 8k;

    ##
    # SSL Settings
    ##

    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_prefer_server_ciphers on;
    ssl_ciphers ECDHE-RSA-AES256-GCM-SHA512:DHE-RSA-AES256-GCM-SHA512:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-SHA384;
    ssl_session_timeout 100m;
    ssl_session_cache shared:SSL:10m;
    ssl_session_tickets off;
    ssl_dhparam /etc/nginx/ssl/dhparam.pem;

    ##
    # Logging Settings
    ##

    log_format main_ext '$remote_addr - $remote_user [$time_local] "$request" '
                        '$status $body_bytes_sent "$http_referer" '
                        '"$http_user_agent" "$host" sn="$server_name" '
                        'rt=$request_time '
                        'ua="$upstream_addr" us="$upstream_status" '
                        'ut="$upstream_response_time" ul="$upstream_response_length" '
                        'cs=$upstream_cache_status '
                        'msec=$msec '
                        'aid="$upstream_http_account_id"';
    access_log /var/log/nginx/access.log main_ext;
    error_log /var/log/nginx/error.log warn;

    ##
    # Gzip Settings
    ##

    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_buffers 16 8k;
    gzip_http_version 1.1;
    gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;

    ##
    # Virtual Host Configs
    ##

    include /etc/nginx/conf.d/*.conf;
    include /etc/nginx/sites-enabled/*;
}
@ -0,0 +1,9 @@
server {
    listen 127.0.0.1:81;
    server_name 127.0.0.1;
    location /nginx_status {
        stub_status on;
        allow 127.0.0.1;
        deny all;
    }
}
@ -0,0 +1,5 @@
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains; preload";
add_header X-Frame-Options sameorigin;
add_header X-Content-Type-Options nosniff;
add_header X-XSS-Protection "1; mode=block";
add_header Content-Security-Policy block-all-mixed-content;
@ -0,0 +1,13 @@
-----BEGIN DH PARAMETERS-----
MIICCAKCAgEAwSMA4EB8xhOeTzV+UMAG7fGVvE75S7WqUMG83YC0hXuefpNY0w5b
wQkM5ffkNiIa/lv2W+SqR2WRoh0M6xI0HdUdVKVkNYyWqBKRW4fjh+hbYMar8FCM
TFibDHoNU+40Z9bwKWWeURZAQj9yCA0dbXCkv7nIuVrWTHBMHtNt9quMvqevZPoU
wL6N004E9pjlEogH4PX/H+o08xGicNtlJXsU0rd2Xev9URo/8IU92qocBjUiUvow
yRUJaufmqfT5IV+ezLUCV1yC2UOj0BA3sNVdFNS8MUIIJWWUfLspXHE0iQjNJuW6
HOmj9sMwVWjnuRjpMza6wNi+CAaKgzI8YrfABd/PtRl9bxztGRXTaLK+ecRlUbq3
l++SLu3mX7GfoACxHhAxQAoaDsZZMgqvsI23DP5FHCCMSQGw6r/dJuZ4q4b8qjWX
u6eOY+ZBg4FIYiMsHcgNcNPGKoLf/YQ3L3EAl9iRb2dXPza5QW9pLzoGLRC94EIT
Wq2hthOqJPsiEihc2gBaV5sdcbO+tqf4XhtbWLKMVDt91TSYzukdrlE5rnFpmvr5
0ze5saNI1tsAgpL8UmJkjpT19VUF6eTv7wpc2gAklel+kUTlJ1rjwja2uq+zNDI5
dzt6iXs1SHgY6wkn9orNPAmWFRoKkaLJgmWFeJHIqp14opS4ZESaSiMCAQI=
-----END DH PARAMETERS-----
@ -0,0 +1,12 @@
server {
    listen 30000;
    location /api {
        api;
    }
    location = /dashboard.html {
        root /usr/share/nginx/html;
    }
    # location /swagger-ui {
    #     root /usr/share/nginx/html;
    # }
}
@ -0,0 +1,73 @@
upstream validator_backend {
    zone validator_backend 512k;
    least_conn;
    keepalive 8192;
    server validator-1.test.net:8899 max_fails=20 fail_timeout=2;
    server validator-2.test.net:8899 max_fails=20 fail_timeout=2;
}

upstream validator_ws_backend {
    zone validator_ws_backend 512k;
    least_conn;
    server validator-1.test.net:8899 max_fails=20 fail_timeout=2;
    server validator-2.test.net:8899 max_fails=20 fail_timeout=2;
}

server {
    listen 80;
    server_name validator-lb.test.net;
    status_zone http_status_zone;

    location / {
        try_files /nonexistent @$http_upgrade;
    }

    location @websocket {
        proxy_pass http://validator_ws_backend;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        health_check uri=/health port=9090;
    }

    location @ {
        proxy_pass http://validator_backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_next_upstream error timeout non_idempotent;
        proxy_next_upstream_timeout 5;
        proxy_next_upstream_tries 5;
        health_check uri=/health port=9090;
    }
}

server {
    listen 443;
    server_name validator-lb.test.net;
    status_zone https_status_zone;

    ssl on;
    ssl_certificate /etc/ssl/certs/test.net.pem;
    ssl_certificate_key /etc/ssl/private/test.net.key;
    ssl_client_certificate /etc/ssl/certs/cloudflare.pem;
    ssl_verify_client on;

    location / {
        try_files /nonexistent @$http_upgrade;
    }

    location @websocket {
        proxy_pass http://validator_ws_backend;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        health_check uri=/health port=9090;
    }

    location @ {
        proxy_pass http://validator_backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        health_check uri=/health port=9090;
    }
}
@ -0,0 +1,65 @@
upstream validator_backend {
    least_conn;
    keepalive 8192;
    server validator-1.test.net:8899 max_fails=20 fail_timeout=2;
    server validator-2.test.net:8899 max_fails=20 fail_timeout=2;
}

upstream validator_ws_backend {
    least_conn;
    server validator-1.test.net:8899 max_fails=20 fail_timeout=2;
    server validator-2.test.net:8899 max_fails=20 fail_timeout=2;
}

server {
    listen 80;
    server_name validator-lb.test.net;

    location / {
        try_files /nonexistent @$http_upgrade;
    }

    location @websocket {
        proxy_pass http://validator_ws_backend/$1;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }

    location @ {
        proxy_pass http://validator_backend/$1;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_next_upstream error timeout non_idempotent;
        proxy_next_upstream_timeout 5;
        proxy_next_upstream_tries 5;
    }
}

server {
    listen 443;
    server_name validator-lb.test.net;

    ssl on;
    ssl_certificate /etc/ssl/certs/test.net.pem;
    ssl_certificate_key /etc/ssl/private/test.net.key;
    ssl_client_certificate /etc/ssl/certs/cloudflare.pem;
    ssl_verify_client on;

    location / {
        try_files /nonexistent @$http_upgrade;
    }

    location @websocket {
        proxy_pass http://validator_ws_backend/$1;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }

    location @ {
        proxy_pass http://validator_backend/$1;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}
@ -0,0 +1 @@
# insert ssl certificate here
@ -0,0 +1 @@
# insert ssl private key here
@ -0,0 +1,30 @@
upstream validator_backend {
    keepalive 8192;
    server localhost:8899 max_fails=20 fail_timeout=2;
}

upstream validator_ws_backend {
    least_conn;
    server localhost:8900 fail_timeout=2;
}

server {
    listen 80;

    location / {
        try_files /nonexistent @$http_upgrade;
    }

    location @websocket {
        proxy_pass http://validator_ws_backend/$1;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }

    location @ {
        proxy_pass http://validator_backend/$1;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}
@ -0,0 +1,3 @@
# Increase memory mapped files limit
# https://docs.solana.com/running-validator/validator-start#manual
vm.max_map_count = 500000
@ -0,0 +1,6 @@
# Increase UDP buffer size
# https://docs.solana.com/running-validator/validator-start#manual
net.core.rmem_default = 134217728
net.core.rmem_max = 134217728
net.core.wmem_default = 134217728
net.core.wmem_max = 134217728
@ -0,0 +1,10 @@
- name: Setup health check
  hosts: all
  remote_user: sol
  tasks:
    - name: install python dependencies
      pip:
        chdir: ~/sol/sol
        virtualenv: env
        virtualenv_python: python3.6
        requirements: "{{ requirements | default('requirements.txt') }}"
@ -0,0 +1,28 @@
all:
  vars:
    validator_user: sol
    solana_version: v1.2.32
    run_validator: true
    nginx_sites:
      - validator.conf

load_balancers:
  hosts:
    validator-lb.test.net:
      extra_packages:
        - nginx
  vars:
    is_watchtower: true
    etc_dir: lb
    run_validator: false

validators:
  hosts:
    validator-1.test.net:
    validator-2.test.net:
  vars:
    etc_dir: validator
    supervisord_conf_file: validator.conf
    local_disk: /dev/nvme0n1
    extra_packages:
      - nginx
@ -0,0 +1 @@
# add created private key here
@ -0,0 +1 @@
# add public key here
@ -0,0 +1,27 @@
- name: Setup nginx
  hosts: all
  remote_user: ubuntu
  become: yes
  tasks:
    - name: linking enabled sites
      file:
        src: /etc/nginx/sites-available/{{ item }}
        dest: /etc/nginx/sites-enabled/{{ item }}
        state: link
      with_items: "{{ nginx_sites }}"
      notify: reload nginx

    - find: file_type=link paths=/etc/nginx/sites-enabled
      register: sites

    - name: cleaning up others
      with_items: "{{ sites.files | map(attribute='path') | list }}"
      file: path={{ item }} state=absent
      when: "(item | basename) not in nginx_sites"
      notify: reload nginx

  handlers:
    - name: reload nginx
      service:
        name: nginx
        state: reloaded
@ -0,0 +1,72 @@
- hosts: validators
  remote_user: ubuntu
  become: yes
  max_fail_percentage: 30
  serial: 10
  tasks:
    - name: remove from load balancers
      replace:
        path: /etc/nginx/sites-available/validator.conf
        regexp: '^(\s*)(server {{ inventory_hostname }}:\d+ .*;)\s*$'
        replace: '\1# \2 # removed for restart'
      delegate_to: "{{ item }}"
      with_items: "{{ groups.load_balancers }}"
      throttle: 1
      register: result
      failed_when: result is not changed

    - name: reload nginx
      service:
        name: nginx
        state: reloaded
      delegate_to: "{{ item }}"
      with_items: "{{ groups.load_balancers }}"
      run_once: yes

    - name: wait for connections to close
      wait_for:
        timeout: 10

    - name: restart validator
      command: supervisorctl restart validator

    - name: wait for validator to start up
      wait_for:
        port: 8899
        delay: 60

    - name: wait for validator to catch up
      uri:
        url: http://localhost:8899/health
        return_content: yes
      register: result
      until: "result.content == 'ok'"
      retries: 200
      delay: 10

    - name: wait for validator to fully catch up
      wait_for:
        timeout: 120

    - name: add back to load balancers
      replace:
        path: /etc/nginx/sites-available/validator.conf
        regexp: '^(\s*)# (server {{ inventory_hostname }}:\d+ .*;) # removed for restart$'
        replace: '\1\2'
      delegate_to: "{{ item }}"
      with_items: "{{ groups.load_balancers }}"
      throttle: 1
      register: result
      failed_when: result is not changed

- hosts: validators
  remote_user: ubuntu
  become: yes
  tasks:
    - name: reload nginx one last time
      service:
        name: nginx
        state: reloaded
      delegate_to: "{{ item }}"
      with_items: "{{ groups.load_balancers }}"
      run_once: yes
@ -0,0 +1,174 @@
- name: set up system
  hosts: all
  remote_user: ubuntu
  become: yes
  tasks:
    - name: set hostname
      hostname:
        name: "{{ inventory_hostname }}"
    - name: add self to /etc/hosts
      lineinfile:
        dest: /etc/hosts
        regexp: '^127\.0\.0\.1[ \t]+localhost'
        line: '127.0.0.1 localhost {{ inventory_hostname }}'
        state: present

    - group:
        name: "{{ validator_user }}"
    - user:
        name: "{{ validator_user }}"
        group: "{{ validator_user }}"
        shell: /bin/bash
    - file:
        path: "/home/{{ validator_user }}/.ssh"
        state: directory
        owner: "{{ validator_user }}"
        group: "{{ validator_user }}"

    - name: apt repos
      apt_repository:
        repo: ppa:deadsnakes/ppa  # For python 3.6
    - name: update packages
      apt:
        update_cache: yes
        upgrade: 'yes'
    - name: install packages
      apt:
        update_cache: yes
        name:
          - cron
          - graphviz
          - iotop
          - dnsmasq
          - supervisor
          - iputils-ping
          - less
          - lsof
          - psmisc
          - screen
          - silversearcher-ag
          - software-properties-common
          - vim
          - zstd

          - python3.6
          - virtualenv
          - python3-virtualenv
    - name: install extra packages  # Configured in hosts file
      apt:
        name: "{{ extra_packages }}"
      when: extra_packages is defined

    - name: create log directory
      file:
        path: /var/log/sol
        state: directory
        owner: "{{ validator_user }}"
        group: "{{ validator_user }}"

    - name: configure common /etc
      copy:
        src: etc/common/
        dest: /etc/

    - name: configure /etc overrides
      when: etc_dir is defined
      copy:
        src: "etc/{{ etc_dir }}/"
        dest: /etc/

    - name: evaluate sysctl overrides for udp buffers
      shell: sudo sysctl -p /etc/sysctl.d/20-solana-udp-buffers.conf
      when: run_validator

    - name: evaluate sysctl overrides for memory-mapped files
      shell: sudo sysctl -p /etc/sysctl.d/20-solana-mmaps.conf
      when: run_validator

    - name: configure supervisord
      when: supervisord_conf_file is defined
      template:
        src: "supervisord/{{ supervisord_conf_file }}"
        dest: /etc/supervisor/conf.d/sol.conf

    - name: whitelist github ssh host
      lineinfile:
        regexp: "^github\\.com"
        dest: /etc/ssh/ssh_known_hosts
        create: yes
        state: present
        line: "github.com ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAq2A7hRGmdnm9tUDbO9IDSwBK6TbQa+PXYPCPy6rbTrTtw7PHkccKrpp0yVhp5HdEIcKr6pLlVDBfOLX9QUsyCOV0wzfjIJNlGEYsdlLJizHhbn2mUjvSAHQqZETYP81eFzLQNnPHt4EVVUh7VfDESU84KezmD5QlWpXLmvU31/yMf+Se8xhHTvKSCZIFImWwoG6mbUoWf9nzpIoaSjB+weqqUUmpaaasXVal72J+UX2B+2RPW3RcT0eOzQgqlJL3RKrTJvdsjE3JEAvGq3lGHSZXy28G3skua2SmVi/w4yCE6gbODqnTWlg7+wC604ydGXA8VJiS5ap43JXiUFFAaQ=="

    - name: format local disk
      filesystem:
        dev: "{{ local_disk }}"
        fstype: xfs
      when: local_disk is defined

    - name: mount local disk
      mount:
        path: /data
        src: "{{ local_disk }}"
        fstype: xfs
        opts: defaults,nofail
        state: mounted
      when: local_disk is defined

    - name: create sol directory on local disk
      file:
        path: /data/sol
        state: directory
        owner: "{{ validator_user }}"
        group: "{{ validator_user }}"
      when: local_disk is defined

- import_playbook: nginx-setup.yaml
  when: nginx_sites is defined

- name: update ssh keys
  hosts: all
  remote_user: "{{ validator_user }}"
  tags:
    - keys
  tasks:
    - name: install priv key
      copy:
        src: id_rsa
        dest: ~/.ssh/
        mode: '600'
    - name: install pub key
      copy:
        src: id_rsa.pub
        dest: ~/.ssh/
        mode: '644'

- name: update code
  hosts: all
  remote_user: "{{ validator_user }}"
  tags:
    - code
  tasks:
    - name: update git
      git:
        repo: git@github.com:wireless-table/validators.git
        dest: "~/{{ validator_user }}"
        version: "{{ commit | default('HEAD') }}"

- import_playbook: health-setup.yaml

- name: update cli
  hosts: all
  remote_user: ubuntu
  become: yes
  tags:
    - cli
  tasks:
    - name: install cli
      shell: sudo --login -u sol -- bash -c "curl -sSf https://raw.githubusercontent.com/solana-labs/solana/{{ solana_version }}/install/solana-install-init.sh | sh -s {{ solana_version }}"

- hosts: all
  remote_user: ubuntu
  become: yes
  tasks:
    - name: update supervisorctl
      command: supervisorctl update
@ -0,0 +1,18 @@
{% macro program(name, module) %}

[program:{{ name }}]
environment=PS={{ name }},TZ=UTC
directory=/home/{{ validator_user }}/{{ validator_user }}
startsecs=3
stopwaitsecs=30
user={{ validator_user }}
stopasgroup=true
startretries=100000
autorestart=true
redirect_stderr=true
stdout_logfile_maxbytes=2000000000
stdout_logfile_backups=3
{% for key, value in kwargs.items() %}
{{ key }}={{ value }}
{% endfor %}
{% endmacro %}
@ -0,0 +1,28 @@
{% from 'macros' import program with context %}

[supervisord]
minfds=600000

{% if run_validator %}
{{ program('validator', '') }}
command=/home/sol/sol/sol/api.sh
{% endif %}

{% if is_watchtower is defined %}
{{ program('watchtower', '') }}
command=/home/sol/sol/sol/watchtower.sh
{% endif %}

[program:health_check_server]
command=/home/sol/sol/sol/env/bin/python -m health.main
environment=PS=health_check_server,TZ=UTC
directory=/home/sol/sol/sol
startsecs=3
stopwaitsecs=30
user=sol
stopasgroup=true
startretries=100000
autorestart=true
redirect_stderr=true
stdout_logfile_maxbytes=2000000000
stdout_logfile_backups=3
@ -0,0 +1,5 @@
#!/bin/bash -e

pssh=$(which parallel-ssh || which pssh)

$pssh -h <(ansible all --list-hosts -i hosts.yaml | tail -n+2) -l ubuntu -P -t0 'sudo tail -n0 -qF /var/log/supervisor/*.log'
@ -0,0 +1,73 @@
#!/usr/bin/env bash
set -ex

#shellcheck source=/dev/null
#. ~/service-env.sh
PATH=/home/sol/.local/share/solana/install/active_release/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin

# Parameters from https://docs.solana.com/clusters#mainnet-beta
ENTRYPOINT=mainnet-beta.solana.com:8001
TRUSTED_VALIDATOR_PUBKEYS=(7Np41oeYqPefeNQEHSv1UDhYrehxin3NStELsSKCT4K2 GdnSyH3YtwcxFvQrVVJMm1JhTS4QVX7MFsX56uJLUfiZ DE1bawNcRJB9rVm3buyMVfr8mBEoyyu73NBovf2oXJsJ CakcnaRDHka2gXyfbEd2d3xsvkJkqsLw2akB3zsN1D2S)
EXPECTED_BANK_HASH=5eykt4UsFv8P8NJdTREpY1vzqKqZKvdpKuc147dw2N9d
EXPECTED_GENESIS_HASH=5eykt4UsFv8P8NJdTREpY1vzqKqZKvdpKuc147dw2N9d
EXPECTED_SHRED_VERSION=64864

# NOTE: Check if this is reasonable
RPC_HEALTH_CHECK_SLOT_DISTANCE=15

# Delete any zero-length snapshots that can cause validator startup to fail
find /data/sol/ledger/snapshot-* -size 0 -print -exec rm {} \; || true


identity_keypair=~/api-identity.json

if [[ -f $identity_keypair ]]; then
  echo 'identity_keypair exists'
else
  echo 'generating identity_keypair'
  solana-keygen new -o $identity_keypair --no-passphrase
fi

identity_pubkey=$(solana-keygen pubkey $identity_keypair)

trusted_validators=()
for tv in "${TRUSTED_VALIDATOR_PUBKEYS[@]}"; do
  [[ $tv = "$identity_pubkey" ]] || trusted_validators+=(--trusted-validator "$tv")
done

if [[ -n "$EXPECTED_BANK_HASH" ]]; then
  maybe_expected_bank_hash="--expected-bank-hash $EXPECTED_BANK_HASH"
fi

args=(
  --gossip-port 8001
  --dynamic-port-range 8002-8012
  --entrypoint "${ENTRYPOINT}"
  --ledger /data/sol/ledger
  --identity "$identity_keypair"
  --enable-rpc-transaction-history
  --limit-ledger-size 50000000
  --cuda
  --rpc-port 8899
  --private-rpc
  --expected-genesis-hash "$EXPECTED_GENESIS_HASH"
  --expected-shred-version "$EXPECTED_SHRED_VERSION"
  ${maybe_expected_bank_hash}
  "${trusted_validators[@]}"
  --no-untrusted-rpc
  --no-voting
  --log -
  --wal-recovery-mode skip_any_corrupted_record
)

if [[ -n "$RPC_HEALTH_CHECK_SLOT_DISTANCE" ]]; then
  args+=(--health-check-slot-distance "$RPC_HEALTH_CHECK_SLOT_DISTANCE")
fi

# Note: can get into a bad state that requires actually fetching a new snapshot. One such error that indicates this:
# "...processing for bank 0 must succeed: FailedToLoadEntries(InvalidShredData(Custom(\"could not reconstruct entries\")))"
if [[ -d /data/sol/ledger ]]; then
  args+=(--no-snapshot-fetch)
fi

exec solana-validator "${args[@]}"
@ -0,0 +1 @@
15
@ -0,0 +1,3 @@
from gevent import monkey

monkey.patch_all()
@ -0,0 +1,115 @@
import logging
import socket
import traceback
from functools import wraps
from pathlib import Path
from typing import Union, Tuple, Optional

import jsonpickle
import requests
from flask import Flask
from flask import jsonify
from gevent.pywsgi import WSGIServer

app = Flask('health.main')
logger = logging.getLogger('health.main')

PORT = 9090
TRUSTED_VALIDATOR_ENDPOINT = 'http://vip-api.mainnet-beta.solana.com'
LOCAL_VALIDATOR_ENDPOINT = 'http://localhost:8899'
UNHEALTHY_BLOCKHEIGHT_DIFF = 15
DATA_DIR = 'data'


def serve_flask_app(app: Flask, port: int, allow_remote_connections: bool = False,
                    allow_multiple_listeners: bool = False):
    listener: Union[socket.socket, Tuple[str, int]]
    hostname = '' if allow_remote_connections else 'localhost'
    listener = (hostname, port)
    if allow_multiple_listeners:
        listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        listener.bind((hostname, port))
        listener.listen()
    server = WSGIServer(listener, app)
    server.serve_forever()


def api_endpoint(f):
    @wraps(f)
    def wrapped(*args, **kwargs):
        try:
            result = f(*args, **kwargs)
            return jsonify({'status': 'OK',
                            'result': result})
        except Exception as e:
            logger.warning('Error in handler %s', f, exc_info=True)
            return jsonify({'status': 'Error',
                            'error': repr(e),
                            'pickled_exception': jsonpickle.encode(e),
                            'traceback': traceback.format_exc()}), 500
    return wrapped


@app.route('/')
@api_endpoint
def get_status():
    return f'Hello from {socket.gethostname()}.'


@app.route('/status')
@api_endpoint
def get_validator_status():
    local = get_epoch_info(LOCAL_VALIDATOR_ENDPOINT)['result']['blockHeight']
    trusted = get_epoch_info(TRUSTED_VALIDATOR_ENDPOINT)['result']['blockHeight']
    return {
        'local': local,
        'trusted': trusted
    }


@app.route('/health')
@api_endpoint
def get_health_status():
    local = get_epoch_info(LOCAL_VALIDATOR_ENDPOINT)['result']['blockHeight']
    trusted = get_epoch_info(TRUSTED_VALIDATOR_ENDPOINT)['result']['blockHeight']
    diff = trusted - local
    if diff < 0:
        logger.info(f'Local block height is greater than trusted validator. '
                    f'Current block height: {local}, '
                    f'Trusted block height: {trusted}')
    behind = max(0, diff)
    unhealthy_blockheight_diff = load_data_file_locally('unhealthy_block_threshold') or UNHEALTHY_BLOCKHEIGHT_DIFF
    if behind > int(unhealthy_blockheight_diff):
        raise Exception(f'Local validator is behind trusted validator by more than {unhealthy_blockheight_diff} blocks.')
    return {
        'local': local,
        'trusted': trusted
    }


def load_data_file_locally(filename: str, mode='r') -> Optional[str]:
    file_path = Path(DATA_DIR) / filename
    if file_path.exists():
        with file_path.open(mode=mode) as f:
            return f.read()
    return None


def get_epoch_info(url: str):
    res = requests.post(
        url,
        headers={
            'Content-Type': 'application/json'
        },
        json={"jsonrpc": "2.0", "id": 1, "method": "getEpochInfo", "params": []}
    )
    res.raise_for_status()
    return res.json()


if __name__ == '__main__':
    serve_flask_app(
        app, PORT, allow_remote_connections=True, allow_multiple_listeners=True
    )
@ -0,0 +1,33 @@
ansible==2.10.0
ansible-base==2.10.1
certifi==2020.6.20
cffi==1.14.3
chardet==3.0.4
click==7.1.2
cryptography==3.1.1
Flask==1.1.2
gevent==20.9.0
gevent-websocket==0.10.1
greenlet==0.4.17
gunicorn==20.0.4
idna==2.10
importlib-metadata==2.0.0
itsdangerous==1.1.0
Jinja2==2.11.2
jsonpickle==1.4.1
MarkupSafe==1.1.1
mypy==0.782
mypy-extensions==0.4.3
packaging==20.4
pycparser==2.20
pyparsing==2.4.7
PyYAML==5.3.1
requests==2.24.0
six==1.15.0
typed-ast==1.4.1
typing-extensions==3.7.4.3
urllib3==1.25.10
Werkzeug==1.0.1
zipp==3.2.0
zope.event==4.5.0
zope.interface==5.1.2
@ -0,0 +1,8 @@
#!/usr/bin/env bash
set -ex

#shellcheck source=/dev/null
#. /home/sol/service-env.sh
PATH=/home/sol/.local/share/solana/install/active_release/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin

exec solana-sys-tuner --user sol
@ -0,0 +1,28 @@
#!/usr/bin/env bash
set -ex

#shellcheck source=/dev/null
#. ~/service-env.sh
PATH=/home/sol/.local/share/solana/install/active_release/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin

TRUSTED_VALIDATOR_PUBKEYS=(7Np41oeYqPefeNQEHSv1UDhYrehxin3NStELsSKCT4K2 GdnSyH3YtwcxFvQrVVJMm1JhTS4QVX7MFsX56uJLUfiZ DE1bawNcRJB9rVm3buyMVfr8mBEoyyu73NBovf2oXJsJ CakcnaRDHka2gXyfbEd2d3xsvkJkqsLw2akB3zsN1D2S)

VALIDATOR_IDENTITIES=(HiMfCsAvNr5KDaAC4RxzbGtV6TcpeqeTjgNFjCeTHMSw EAqg3S1tHxCmQbwKXFLXBvsWx2Yvh2jyFCqFx5C1s7PM 75Mv8XfC4VxRV7XJ8Ev4DeiJfa2FdbKrAYNc6TUinvkR)

RPC_URL=http://localhost:8899/

args=(
  --url "$RPC_URL"
  --monitor-active-stake
  --no-duplicate-notifications
)

for tv in "${VALIDATOR_IDENTITIES[@]}"; do
  args+=(--validator-identity "$tv")
done

if [[ -n $TRANSACTION_NOTIFIER_SLACK_WEBHOOK ]]; then
  args+=(--notify-on-transactions)
fi

exec solana-watchtower "${args[@]}"