solana/metrics/metrics-main

README.md

Services:

  1. Prometheus
  2. AlertManager
  3. Chronograf (on port 8888)
  4. Chronograf_8889 (on port 8889)
  5. Grafana (on port 3000)
  6. AlertManager_Discord
  7. Kapacitor

To install all the services on the metrics-main server, run the start.sh script.

Install the Buildkite agent so that it can run the status.sh script periodically to check the status of the containers.

If any container is not in the running state (for example, it has exited), status.sh tries to redeploy it. If the redeploy fails, an alert is sent to Discord and PagerDuty.
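
The check-and-redeploy behavior can be sketched roughly as follows. This is not the actual status.sh, only an illustration under assumptions: it assumes each container is named after its deploy script (for example, a container named prometheus deployed by prometheus.sh), and send_alert is a hypothetical placeholder for the Discord and PagerDuty notifications.

```shell
# Sketch only: container names, script paths, and send_alert are assumptions,
# not taken from the real status.sh.

# Print the Docker state of a container ("running", "exited", ...),
# or "missing" if docker inspect fails (for example, the container was removed).
container_status() {
  docker inspect --format '{{.State.Status}}' "$1" 2>/dev/null || echo "missing"
}

# Assumption: each service is redeployed by a script named after it.
redeploy() {
  "./$1.sh"
}

# Hypothetical placeholder for the Discord and PagerDuty alerts.
send_alert() {
  echo "ALERT: $1" >&2
}

# Check one container: do nothing if it is running, otherwise try to
# redeploy it, and alert if it is still not running afterwards.
check_container() {
  name="$1"
  if [ "$(container_status "$name")" = "running" ]; then
    return 0
  fi
  redeploy "$name" || true
  if [ "$(container_status "$name")" != "running" ]; then
    send_alert "$name is not running and could not be redeployed"
    return 1
  fi
  return 0
}
```

A periodic job could then call check_container once per service, for example `check_container prometheus`, and treat a non-zero exit status as a failed check.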

Note: If you delete or remove any of the containers manually, you need to run the start.sh script.