MONITOR YOUR VALIDATOR
If you are running a validator on the Klever Blockchain, it is good practice to monitor it. This guide walks through a basic installation of Prometheus, Prometheus Node Exporter, and Grafana so that you can begin monitoring your nodes.
- Install Prometheus Node Exporter
$ sudo apt-get install -y prometheus-node-exporter
$ sudo systemctl enable prometheus-node-exporter.service
- Install Prometheus on the monitoring node
$ sudo apt-get install -y prometheus
- Install Grafana on the monitoring node
$ wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add -
$ echo "deb https://packages.grafana.com/oss/deb stable main" > grafana.list
$ sudo mv grafana.list /etc/apt/sources.list.d/grafana.list
$ sudo apt-get update && sudo apt-get install -y grafana
- Enable services so that they start automatically
$ sudo systemctl enable grafana-server.service
$ sudo systemctl enable prometheus.service
$ sudo systemctl enable prometheus-node-exporter.service
- Update the Prometheus configuration file, /etc/prometheus/prometheus.yml
- Replace <IP_Address> with the address of the other node if you are fetching targets from another node as well.
- The file is YAML, so its indentation and syntax must be correct. Otherwise Prometheus will report errors on startup.
$ cd /etc/prometheus
$ sudo cp -p prometheus.yml prometheus.yml.ORIG
$ sudo vi prometheus.yml
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'KleverNode'
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    #scrape_interval: 5s
    #scrape_timeout: 5s
    static_configs:
      - targets: ['localhost:9090']
        labels:
          name: 'Node1'
      - targets: ['<IP_Address>:9090']
        labels:
          name: 'Node2'
      - targets: ['localhost:9100']
        labels:
          name: 'Node1'
      - targets: ['<IP_Address>:9100']
        labels:
          name: 'Node2'
  - job_name: 'KleverNodeMetrics'
    metrics_path: /node/metrics
    static_configs:
      - targets: ['localhost:8080']
        labels:
          name: 'Node1'
      - targets: ['<IP_Address>:8080']
        labels:
          name: 'Node2'
  - job_name: 'KleverNodeStats'
    metrics_path: /validator/statistics
    static_configs:
      - targets: ['localhost:8080']
        labels:
          name: 'Node1'
      - targets: ['<IP_Address>:8080']
        labels:
          name: 'Node2'
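Because a formatting mistake in this file prevents Prometheus from starting, it can help to validate the file before restarting the service. The following is a minimal sketch, assuming the promtool utility that ships with Prometheus is available on the PATH. The function name check_prometheus_config is only illustrative.

```shell
# Validate a Prometheus configuration file with promtool.
# Defaults to the path used in this guide; pass another path as the first argument.
check_prometheus_config() {
    config="${1:-/etc/prometheus/prometheus.yml}"
    # promtool is assumed to be installed alongside Prometheus.
    if ! command -v promtool >/dev/null 2>&1; then
        echo "promtool not found; cannot check $config" >&2
        return 2
    fi
    # Exits non-zero and prints the offending line if the file is invalid.
    promtool check config "$config"
}
```

Calling check_prometheus_config with no arguments checks /etc/prometheus/prometheus.yml. A non-zero exit status means the file should be fixed before Prometheus is restarted.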
- Restart services and confirm all are running successfully.
$ sudo systemctl restart grafana-server.service
$ sudo systemctl restart prometheus.service
$ sudo systemctl restart prometheus-node-exporter.service
$ sudo systemctl status grafana-server.service prometheus.service prometheus-node-exporter.service
- If any of the services above fails, check its logs to find the cause. For example, to follow the Prometheus log:
$ sudo journalctl -u prometheus.service -f
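Once the services report as running, a quick HTTP probe can confirm that each one is actually answering. This is a rough sketch, assuming curl is installed and the default ports are in use: 9090 for Prometheus, 9100 for the node exporter, 8080 for the Klever node, and 3000 for Grafana. The function names are hypothetical helpers, not part of any of these tools.

```shell
# Print the HTTP endpoints used in this guide, one per line, for a given host.
monitoring_endpoints() {
    host="${1:-localhost}"
    printf 'http://%s:9090/-/healthy\n' "$host"     # Prometheus health endpoint
    printf 'http://%s:9100/metrics\n' "$host"       # node exporter metrics
    printf 'http://%s:8080/node/metrics\n' "$host"  # Klever node metrics path from prometheus.yml
    printf 'http://%s:3000/api/health\n' "$host"    # Grafana health endpoint
}

# Probe each endpoint with curl and report OK or FAIL per URL.
check_endpoints() {
    monitoring_endpoints "${1:-localhost}" | while read -r url; do
        if curl -fsS --max-time 5 -o /dev/null "$url" 2>/dev/null; then
            echo "OK   $url"
        else
            echo "FAIL $url"
        fi
    done
}
```

Running check_endpoints on the monitoring node probes localhost. Passing an address, for example check_endpoints <IP_Address>, probes another node instead. On a node that does not run Grafana, the port 3000 line is expected to fail.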
- Setting up the Grafana Dashboard
- On the monitoring node, open the following URL in a browser
http://<IP_Address>:3000
- If port 3000 is reachable from your machine (open to anywhere or, for better security, only to specific IP addresses), you should see the login screen.
- Log in with admin / admin and change the password
- Click the configuration gear icon → Add Data Source → Prometheus
– Set the name to Prometheus
– Set URL to http://localhost:9090
– Click Save & Test
- If your nodes are in several time zones, it is useful to add the Grafana Clock panel. Restart Grafana afterwards so that the plugin is loaded.
$ sudo grafana-cli plugins install grafana-clock-panel
$ sudo systemctl restart grafana-server.service
- You should now be able to create a new Dashboard.
- Dashboards → New → New Dashboard
- Enjoy
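Before building panels, it can be useful to confirm that Prometheus is scraping every configured target. The sketch below uses the Prometheus HTTP API instant-query endpoint, /api/v1/query, to fetch the built-in up series. It assumes curl is installed and that Prometheus listens on localhost:9090. The helper names are illustrative only.

```shell
# Build the Prometheus HTTP API URL for an instant query.
# Note: the query is not URL-encoded, so this is only suitable for simple
# expressions such as a bare metric name.
prom_query_url() {
    query="$1"
    base="${2:-http://localhost:9090}"
    printf '%s/api/v1/query?query=%s\n' "$base" "$query"
}

# Fetch the "up" series as JSON. A value of 1 means the last scrape of that
# target succeeded and 0 means it failed.
show_target_status() {
    curl -fsS "$(prom_query_url up "${1:-http://localhost:9090}")"
}
```

The JSON returned by show_target_status contains one entry per target. Each entry carries the job and name labels defined in prometheus.yml, so a target reporting 0 can be traced back to its line in the configuration.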
Example below uses extended metrics not covered in this setup, but shows what you can monitor.