Grafana + Prometheus setup

Hi all,

Just wanted to check how people are setting up Grafana + Prometheus for their producer and relay nodes. I especially want to see which ports are being opened on the producer node.

I’m currently running 3x VPS (1 producer, 2 relays), and the CoinCashew guide warns against opening the Grafana and Prometheus ports on a VPS.

What has been others’ experience with this?

Lastly, I was wondering what other monitoring/notification methods people are using to ensure nodes stay up and running or to get notified if the server crashes?

Thanks in advance

I’m exposing Grafana only through a VPN (WireGuard). The only port open to the public is the node port on the relays.
Grafana can be a central instance for all nodes together or run individually on each server. I would prefer the central approach, because you then also get a shared configuration if you want to use the monitoring and alerting features as well.

I see. The CoinCashew guide mentions opening port 51820/udp when setting up WireGuard. Is this safer than opening TCP ports for Grafana / Prometheus?

You don’t happen to have a step-by-step guide you used to set up WireGuard?
Thanks

I did, based on the WireGuard quick start guide here: Quick Start - WireGuard
One thing to know: it is better to use wg-quick with a file-based config, because that is persistent and also sets up all the network rules (the network rules are what got me into trouble the first time I used it).
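To make the wg-quick point concrete, here is a minimal sketch of a file-based setup on a relay. The interface name wg0, the tunnel address 10.0.0.2, and every value in angle brackets are placeholders I made up for illustration, so adapt them to your own keys and addresses. The other peer gets a mirrored file.

```shell
# Sketch only: wg0, 10.0.0.2 and all <placeholders> are assumptions.
# wg-quick reads /etc/wireguard/<interface>.conf.
sudo tee /etc/wireguard/wg0.conf > /dev/null << 'EOF'
[Interface]
Address = 10.0.0.2/24
PrivateKey = <relay_private_key>
ListenPort = 51820

[Peer]
# The other end of the tunnel (for example the block producer)
PublicKey = <peer_public_key>
AllowedIPs = 10.0.0.1/32
Endpoint = <peer_public_ip>:51820
PersistentKeepalive = 25
EOF

# The file contains a private key, so keep it readable by root only.
sudo chmod 600 /etc/wireguard/wg0.conf

# Bring the tunnel up now and on every boot, then check the handshake.
sudo systemctl enable --now wg-quick@wg0
sudo wg show
```

The systemd unit is what makes the setup survive a reboot, which is the "permanent" part mentioned above.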

Regarding your question: yes, it is safer to open only the WireGuard port, because that way there is just one entry point into your server, and as a VPN endpoint it is very secure. You are then not exposed to potential security vulnerabilities in Grafana.

Hi,
A VPN is surely a good idea, especially WireGuard. In a simpler setup, and probably the most common one, you should be good to go as soon as Prometheus can reach the node_exporter and cardano-node metrics endpoints through firewall-filtered ports.
The most important thing to start with is to make your servers whitelist each other and drop everything that is not specifically allowed. At a minimum, allow the cardano-node port on your relays only (:3001 by default) and your SSH port, which should also be changed from :22 to another one.

On my pool, I use good ol’ netfilter/iptables for that, it’s working like a charm.
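For anyone who prefers UFW over raw iptables, the same default-drop idea could look roughly like this. It is a sketch under assumptions: the values in angle brackets are placeholders, and 3001 is just the default relay port mentioned above.

```shell
# On a relay. <SSH_PORT> is a placeholder for your non-default SSH port.
sudo ufw default deny incoming     # drop everything not explicitly allowed
sudo ufw default allow outgoing
sudo ufw allow <SSH_PORT>/tcp      # add this before enabling, or you lock yourself out
sudo ufw allow 3001/tcp            # cardano-node relay port, open to the public
sudo ufw enable

# On the block producer: accept the node port from your own relays only.
# <RELAY1_IP>, <RELAY2_IP> and <BP_NODE_PORT> are placeholders.
sudo ufw allow from <RELAY1_IP> to any port <BP_NODE_PORT> proto tcp
sudo ufw allow from <RELAY2_IP> to any port <BP_NODE_PORT> proto tcp
```

Check the result with sudo ufw status verbose before closing your current SSH session.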

Hi @doc_krieger,

Were you able to find a solution with a step-by-step guide? I’m on the same path (i.e., setting up nodes using CoinCashew), so I was wondering if you have found a simple setup.

TIA

Unfortunately not. I’m currently working on it and was just about to ask here about some issues I’ve been having.

Currently, I have my pool set up as 3x VPS (1x BP, 2x relay). I followed the WireGuard guide on CoinCashew and was able to set up a tunnel with my BP as the local end and one relay (call it relay#1) as the remote end. I confirmed the tunnel is working and followed the rest of the CoinCashew guide to install Prometheus and Grafana on the two nodes, but replaced the IP address in the relay1 config with the IP address of the tunnel I set up.

My issue so far is that my Grafana dashboard shows my relay metrics but none for my BP, so I’m assuming I messed up the config somewhere. Was hoping someone could lend me a hand? My prometheus.yml on relay1 (which is where I have Grafana set up) looks like this:

cat > prometheus.yml << EOF
global:
  scrape_interval:     15s # By default, scrape targets every 15 seconds.

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'codelab-monitor'

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label job=<job_name> to any timeseries scraped from this config.
  - job_name: 'prometheus'

    static_configs:
      - targets: ['localhost:9100']  
      - targets: ['10.0.0.1:9100']  #IP address for my BP node in VPN
      - targets: ['10.0.0.1:12798'] #IP address for my BP node in VPN
        labels:
          alias: 'block-producer-node'
          type:  'cardano-node'
      - targets: ['localhost:12798']
        labels:
          alias: 'relaynode1'
          type:  'cardano-node'
EOF
sudo mv prometheus.yml /etc/prometheus/prometheus.yml

Again, any help would be greatly appreciated!
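A few checks that might help narrow this down (a sketch, not a diagnosis). 10.0.0.1 is the BP tunnel address from the config above. Port 9090 is Prometheus's default listen port and is an assumption about this setup.

```shell
# On relay1 (where Prometheus runs): can the BP endpoints be reached at all?
curl -s --max-time 5 http://10.0.0.1:9100/metrics  | head -n 5   # node_exporter
curl -s --max-time 5 http://10.0.0.1:12798/metrics | head -n 5   # cardano-node

# On relay1: what does Prometheus itself report for each target?
curl -s http://localhost:9090/api/v1/targets | grep -o '"health":"[a-z]*"'

# On the BP: which addresses are the two exporters listening on?
sudo ss -tlnp | grep -E ':(9100|12798) '
```

If the curl calls hang, a firewall on the BP is the likely suspect. If they are refused and ss shows 127.0.0.1:12798, the node is probably listening on loopback only. As far as I know that is controlled by the hasPrometheus entry in the cardano-node config file, so it is worth checking there.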

Hi doc_krieger,
I’m curious whether you solved the tunneling issue from your relay to the block producer. I also set up WireGuard and tried a similar Prometheus job config, pointing at the BP node’s metrics ports, 10.0.0.1:9100 and 10.0.0.1:12798.

This does not work, though, because ports 9100 and 12798 are not opened by any UFW allow rule. If you run telnet from the relay node to the BP node, it just hangs.

telnet 10.0.0.1 9100

So, I assume one needs to set up port forwarding in the WireGuard configuration, so that you can tunnel in via the tunnel port (51820 by default) and then forward the request to another port, e.g. 9100 or 12798. I’d love to set this up on a rainy day, but I ended up taking an easier approach, which I think is reasonably secure.

On my block producer node, I set a ufw rule to allow access to ports 9100 and 12798 only from the relay node IP specifically. Relay Node 1 is running the prometheus service to collect the stats.

# On BP node

sudo ufw allow from <RelayNode_IP> to any port 9100
sudo ufw allow from <RelayNode_IP> to any port 12798

Then, I changed the Prometheus job config on relay1 to just use the IP address of the BP node:

# On Relay Node 1

      - targets: ['<BP_NODE_IP>:9100']
        labels:
          alias: 'block-producer-node'
          type:  'cardano-node'
      - targets: ['<BP_NODE_IP>:12798']
        labels:
          alias: 'block-producer-node'
          type:  'cardano-node'

This works, of course, but I’d prefer the WireGuard approach with port forwarding. If you got that working, I’d love to see a config example.
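One thing that may be worth trying before any port forwarding: as far as I understand it, WireGuard shows up as an ordinary network interface, so UFW's default deny also applies to traffic arriving through the tunnel. That would explain the telnet hang. If so, an allow rule scoped to the tunnel interface might be all that is missing. The interface name wg0 and the relay tunnel address 10.0.0.2 below are assumptions.

```shell
# On the BP node: allow the metrics ports only on the WireGuard interface.
# wg0 and 10.0.0.2 (relay1's tunnel address) are assumptions, adjust as needed.
sudo ufw allow in on wg0 from 10.0.0.2 to any port 9100 proto tcp
sudo ufw allow in on wg0 from 10.0.0.2 to any port 12798 proto tcp
sudo ufw status verbose

# On relay1: this should now connect instead of hanging.
telnet 10.0.0.1 9100
```

With rules like these the metrics ports stay closed on the public interface, and the 10.0.0.1 targets in prometheus.yml can be used as originally intended.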

Also, I’m curious how people are securing the Grafana service running on relay node 1 (port 3000). In my case, I don’t have a static IP on my local machine, so I set up a free dynamic DNS client on my local computer to get a DNS name. I was hoping I could set up a UFW rule on relay node 1 allowing access to port 3000 only from my DNS name. But UFW rules are IP-based, so I set up a Bash script on the relay node, run by a root cron job every 5 minutes, that updates the UFW rule for port 3000 based on the current IP address behind my DNS name. It’s kinda messy, but it’s working.
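In case it helps anyone, here is a rough sketch of what such an updater could look like. This is not my exact script. The DNS name, the state file path, and the --apply switch are all made up for illustration. It does nothing unless called with --apply, so the rule-planning part can be checked on its own.

```shell
#!/usr/bin/env bash
# Sketch: keep a UFW allow rule for Grafana in sync with a dynamic DNS name.
# DDNS_NAME, STATE_FILE and the --apply switch are hypothetical choices.
set -eu

DDNS_NAME="${DDNS_NAME:-myhome.example.org}"          # your dynamic DNS name
STATE_FILE="${STATE_FILE:-/var/lib/grafana-ufw-ip}"   # last IP that was allowed
PORT=3000                                             # Grafana port

# Print the ufw commands needed to move the allow rule from old_ip to new_ip.
# Prints nothing when the address has not changed.
plan_rules() {
  local old_ip="$1" new_ip="$2"
  if [ "$old_ip" = "$new_ip" ]; then
    return 0
  fi
  if [ -n "$old_ip" ]; then
    echo "ufw delete allow from $old_ip to any port $PORT proto tcp"
  fi
  echo "ufw allow from $new_ip to any port $PORT proto tcp"
}

main() {
  local new_ip old_ip=""
  # Resolve the current IPv4 address behind the DNS name.
  new_ip="$(getent ahostsv4 "$DDNS_NAME" | awk 'NR==1 {print $1}' || true)"
  if [ -z "$new_ip" ]; then
    echo "could not resolve $DDNS_NAME" >&2
    exit 1
  fi
  if [ -f "$STATE_FILE" ]; then
    old_ip="$(cat "$STATE_FILE")"
  fi
  # Run each planned command; stdin is detached so ufw cannot eat the list.
  plan_rules "$old_ip" "$new_ip" | while read -r cmd; do
    $cmd < /dev/null
  done
  echo "$new_ip" > "$STATE_FILE"
}

if [ "${1:-}" = "--apply" ]; then
  main
fi
```

Saved as, say, /usr/local/sbin/update-grafana-ufw.sh (a hypothetical path), it could be run from root's crontab with a line like */5 * * * * /usr/local/sbin/update-grafana-ufw.sh --apply. Remembering the last IP in a state file means the old rule is removed when the address changes, instead of stale rules piling up.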

Thanks