solana/metrics/metrics-main

README.md


Services:

  1. Prometheus
  2. AlertManager
  3. Chronograf (on port 8888)
  4. Chronograf_8889 (on port 8889)
  5. Grafana (on port 3000)
  6. AlertManager_Discord
  7. Kapacitor
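
As a quick sanity check of the ports listed above, the following is a minimal sketch rather than part of the repository's scripts. The function names `service_port` and `port_is_up`, the lowercase service keys, and the use of `localhost` are all assumptions made for illustration; only the port numbers come from the list.

```shell
#!/usr/bin/env bash
# Sketch only: the service keys and "localhost" are assumptions.
# The port numbers are the ones stated in the services list.

# Print the port for a service, or fail if the list states no port for it.
service_port() {
  case "$1" in
    chronograf)      echo 8888 ;;
    chronograf_8889) echo 8889 ;;
    grafana)         echo 3000 ;;
    *)               return 1 ;;  # no port is given for the other services
  esac
}

# Succeed if anything answers HTTP on the service's port within 5 seconds.
port_is_up() {
  local port
  port=$(service_port "$1") || return 1
  curl --silent --output /dev/null --max-time 5 "http://localhost:${port}/"
}
```

For example, `port_is_up grafana && echo "grafana is answering"` would report whether something responds on port 3000 of the local machine.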

To install all of the services on the metrics-main server, run the start.sh script.

Install the Buildkite agent so that it can run the status.sh script periodically to check the status of the containers.

If any container is not in the running state, or is in the exited state, the script tries to redeploy it. If the redeploy fails, an alert is sent to Discord and PagerDuty.
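
The check-and-redeploy behavior described above can be sketched roughly as follows. This is not the contents of status.sh. It is a minimal illustration that assumes each container can be inspected by name with the Docker CLI and redeployed by running a single command. The function names (`container_state`, `needs_redeploy`, `send_alert`, `check_container`) are hypothetical, and `send_alert` is only a placeholder for the real Discord and PagerDuty notifications.

```shell
#!/usr/bin/env bash
# Sketch only: function names, container names, and the alert hook are assumptions.

# Print the container's state (e.g. "running", "exited"),
# or "missing" if docker cannot inspect it.
container_state() {
  local state
  if state=$(docker inspect --format '{{.State.Status}}' "$1" 2>/dev/null); then
    echo "$state"
  else
    echo "missing"
  fi
}

# Succeed (exit 0) when the given state means the container should be redeployed.
needs_redeploy() {
  [ "$1" != "running" ]
}

# Placeholder for the real Discord/PagerDuty alert: just writes to stderr.
send_alert() {
  echo "ALERT: redeploy of $1 failed" >&2
}

# Usage: check_container <container-name> <redeploy-command> [args...]
check_container() {
  local name="$1"
  shift
  if needs_redeploy "$(container_state "$name")"; then
    # Try the redeploy command once; alert only if it fails.
    if ! "$@"; then
      send_alert "$name"
    fi
  fi
}
```

A call such as `check_container prometheus ./prometheus.sh` would then leave a running container alone, rerun the deploy command for a stopped or missing one, and raise the placeholder alert only if that command fails. The container name and script path in this call are illustrative assumptions.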

Note: If you manually deleted or removed any of the containers, you need to run the start.sh script.