Klever Node Monitoring

:computer: MONITOR YOUR VALIDATOR :computer_mouse:

If you run a validator on the Klever Blockchain, it is good practice to have monitoring in place. This guide walks through a basic install of Prometheus, Prometheus Node Exporter, and Grafana so you can begin monitoring your nodes.

  1. Install Prometheus Node Exporter
$ sudo apt-get install -y prometheus-node-exporter
$ sudo systemctl enable prometheus-node-exporter.service
  2. Install Prometheus on the monitoring node
$ sudo apt-get install -y prometheus
  3. Install Grafana on the monitoring node
$ wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add -

$ echo "deb https://packages.grafana.com/oss/deb stable main" > grafana.list
$ sudo mv grafana.list /etc/apt/sources.list.d/grafana.list

$ sudo apt-get update && sudo apt-get install -y grafana
  4. Enable services so that they start automatically
$ sudo systemctl enable grafana-server.service
$ sudo systemctl enable prometheus.service
$ sudo systemctl enable prometheus-node-exporter.service
  5. Update the Prometheus configuration at /etc/prometheus/prometheus.yml
  • Replace <IP_Address> with the address of the other node if you are also fetching targets from another node. If you monitor only one node, remove the Node2 entries.
  • The file is YAML, so indentation and syntax must be correct. Otherwise Prometheus will report errors on startup.
$ cd /etc/prometheus
$ sudo cp -p prometheus.yml prometheus.yml.ORIG
$ sudo vi prometheus.yml
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
      monitor: 'KleverNode'

# Scrape configurations. The first job scrapes Prometheus itself (port 9090)
# and the node exporter (port 9100) on each node.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    #scrape_interval: 5s
    #scrape_timeout: 5s
    static_configs:
      - targets: ['localhost:9090']
        labels:
          name: 'Node1'
      - targets: ['<IP_Address>:9090']
        labels:
          name: 'Node2'
      - targets: ['localhost:9100']
        labels:
          name: 'Node1'
      - targets: ['<IP_Address>:9100']
        labels:
          name: 'Node2'

  - job_name: 'KleverNodeMetrics'
    metrics_path: /node/metrics
    static_configs:
      - targets: ['localhost:8080']
        labels:
          name: 'Node1'
      - targets: ['<IP_Address>:8080']
        labels:
          name: 'Node2'

  - job_name: 'KleverNodeStats'
    metrics_path: /validator/statistics
    static_configs:
      - targets: ['localhost:8080']
        labels:
          name: 'Node1'
      - targets: ['<IP_Address>:8080']
        labels:
          name: 'Node2'
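
Because a malformed file stops Prometheus from starting, it can help to validate the edited configuration before restarting. This is a minimal sketch, assuming the promtool utility that normally ships with Prometheus is on your PATH. The check_prometheus_config function name is only illustrative.

```shell
# Validate a Prometheus config file before restarting the service.
# check_prometheus_config is a hypothetical helper name, not part of Prometheus.
check_prometheus_config() {
  local config="${1:-/etc/prometheus/prometheus.yml}"

  if [ ! -r "$config" ]; then
    echo "config file not readable: $config" >&2
    return 2
  fi
  # Assumption: promtool was installed alongside Prometheus.
  if ! command -v promtool >/dev/null 2>&1; then
    echo "promtool not found; install it or check your PATH" >&2
    return 3
  fi
  # promtool exits non-zero and reports the error when the file is invalid.
  promtool check config "$config"
}
```

With the function loaded in your shell, something like `check_prometheus_config && sudo systemctl restart prometheus.service` restarts Prometheus only when the file passes the check.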
  6. Restart services and confirm all are running successfully.
$ sudo systemctl restart grafana-server.service
$ sudo systemctl restart prometheus.service
$ sudo systemctl restart prometheus-node-exporter.service
$ sudo systemctl status grafana-server.service prometheus.service prometheus-node-exporter.service
  • If any of the services above fails, its logs usually show why. To follow the logs for a unit, for example Prometheus:
$ sudo journalctl -u prometheus.service -f
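
Beyond the systemd status, you can confirm that Prometheus is actually scraping each target. The sketch below summarises target health from the Prometheus HTTP API endpoint /api/v1/targets. It assumes curl is installed and that Prometheus listens on localhost:9090 as configured above. The function names are illustrative, and the grep-based parsing is a rough shortcut rather than a real JSON parser.

```shell
# Count scrape targets by health state ("up", "down", "unknown").
# Reads the JSON body of /api/v1/targets on stdin.
# Assumes compact JSON ("health":"up" with no spaces), which is what
# Prometheus normally returns.
summarize_target_health() {
  grep -o '"health":"[a-z]*"' | sort | uniq -c
}

# Fetch target status from Prometheus and print the summary.
# The default URL assumes Prometheus runs on this host.
check_targets() {
  local base_url="${1:-http://localhost:9090}"
  curl -fsS "$base_url/api/v1/targets" | summarize_target_health
}
```

Running `check_targets` should list every target as up. A target reported as down usually points at a wrong address, a closed port, or a metrics path the node does not serve.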
  7. Set up the Grafana dashboard
  • In a browser, open the following, replacing <IP_Address> with the address of the monitoring node
http://<IP_Address>:3000
  • If port 3000 is reachable from your machine, you should see the Grafana login screen. For better security, open the port only to specific IP addresses rather than to everyone.
  • Log in with admin / admin and change the password

  • Click the configuration gear icon → Add Data Source → Prometheus
    – Set the name to Prometheus
    – Set URL to http://localhost:9090
    – Click Save & Test

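  • If your nodes are in several time zones, it is useful to add the Grafana Clock panel. Restart Grafana afterwards so the plugin is loaded.

$ sudo grafana-cli plugins install grafana-clock-panel
$ sudo systemctl restart grafana-server.service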
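
As an alternative to adding the Prometheus data source through the web UI, Grafana can load it from a provisioning file at startup. This is a minimal sketch, assuming the default provisioning directory of the Debian package (/etc/grafana/provisioning/datasources). The function name and the prometheus.yaml file name are arbitrary choices.

```shell
# Write a Grafana data source provisioning file for a local Prometheus.
# write_prometheus_datasource is a hypothetical helper; the default
# directory is an assumption based on the Debian package layout.
write_prometheus_datasource() {
  local dir="${1:-/etc/grafana/provisioning/datasources}"
  local url="${2:-http://localhost:9090}"

  mkdir -p "$dir" || return 1
  # Same settings as the UI steps: name Prometheus, URL of the local server.
  cat > "$dir/prometheus.yaml" <<EOF
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: $url
    isDefault: true
EOF
}
```

Writing under /etc/grafana needs root, so either call the function from a root shell or write to a temporary directory first and copy the file with sudo. Grafana reads provisioning files on startup, so restart grafana-server.service afterwards.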
  8. You should now be able to create a new Dashboard.
  • Dashboards → New → New Dashboard
  • Enjoy

Example below uses extended metrics not covered in this setup, but shows what you can monitor.
