Grafana + Prometheus setup

Hi all,

Just wanted to check how people are setting up Grafana + Prometheus for producer and relay nodes. I especially want to see which ports are being opened on the producer node.

I’m currently running 3x VPS (1x producer, 2x relays), and the coincashew guide mentions being careful not to open the Grafana + Prometheus ports on a VPS.

What has other people’s experience been?

Lastly, I was wondering what other monitoring/notification methods people are using to ensure nodes stay up and running or to get notified if the server crashes?

Thanks in advance

I’m exposing Grafana only through a VPN (WireGuard). The only port open to the public is the node port on the relays.
Grafana can be a central instance for all nodes together, or it can run individually on each server. I would prefer the central approach, because then you also have shared configuration if you want to use the monitoring features as well.

I see. Coincashew mentions opening port 51820/udp when setting up WireGuard. Is this safer than opening TCP ports for Grafana / Prometheus?

You don’t happen to have a step-by-step guide you used to set up WireGuard?
Thanks

I did it based on the WireGuard quick start guide here: Quick Start - WireGuard
Be aware that it is better to use wg-quick with a file-based config: it is persistent, and it also sets up all the network rules (missing those caused me trouble the first time I used it).
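A minimal sketch of what a file-based wg-quick setup can look like, for anyone following along. The interface name wg0, the 10.0.0.x addresses, and the key placeholders are assumptions for illustration, not values from this thread; the peer side needs a matching config with an Endpoint line pointing at this server.

```shell
# Write the tunnel config; <SERVER_PRIVATE_KEY> and <CLIENT_PUBLIC_KEY> are placeholders.
sudo tee /etc/wireguard/wg0.conf > /dev/null << 'EOF'
[Interface]
Address = 10.0.0.1/24
ListenPort = 51820
PrivateKey = <SERVER_PRIVATE_KEY>

[Peer]
PublicKey = <CLIENT_PUBLIC_KEY>
AllowedIPs = 10.0.0.2/32
EOF
sudo chmod 600 /etc/wireguard/wg0.conf   # the file contains a private key

sudo wg-quick up wg0                     # create the interface, addresses and routes
sudo systemctl enable wg-quick@wg0       # bring the tunnel up again at boot
sudo wg show                             # check for a recent handshake with the peer
```

Only 51820/udp has to be reachable from outside for this to work.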

Regarding your question: yes, it is safer to open only the WireGuard port, because that way you have just one entry point into your server, and as a VPN endpoint it is very secure. You are then not exposed to potential security vulnerabilities in Grafana.

Hi,
A VPN is surely a good idea, especially WireGuard. In a simpler and probably more common setup, you should be good to go as soon as Prometheus can reach the node-exporter and cardano-node metrics endpoints through a firewall-filtered port.
The most important thing to start with is to make your servers whitelist each other and drop everything not specifically allowed. At a minimum, allow the cardano-node port (:3001 by default) on your relays only, and your SSH port, which should also be changed from :22 to another one.

On my pool, I use good ol’ netfilter/iptables for that, it’s working like a charm.
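For readers using ufw (as most of this thread does) rather than raw iptables, the same default-drop idea might look roughly like this on a relay. Port 2222 is a hypothetical replacement SSH port; 3001 is the default node port mentioned above.

```shell
# Allow your SSH port BEFORE enabling the firewall, or you will lock yourself out.
sudo ufw default deny incoming     # drop everything not specifically allowed
sudo ufw default allow outgoing
sudo ufw allow 2222/tcp            # hypothetical non-default SSH port
sudo ufw allow 3001/tcp            # cardano-node port, on relays only
sudo ufw enable
sudo ufw status verbose            # review the resulting rule set
```

On the block producer, the node port would instead be allowed only from the relay addresses, using the `ufw allow from <ip> to any port <port>` form.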

Hi @doc_krieger,

Were you able to find a solution with a step-by-step guide? I’m on the same path as well (i.e., setting up nodes using coincashew), so I was just wondering if you have found a simple setup.

TIA

Unfortunately not. I’m currently working on it and was just about to ask here about some issues I’ve been having.

Currently, I have my pool set up as 3x VPS (1x BP, 2x relay). I followed the WireGuard guide on coincashew and was able to set up a tunnel with my BP as the local end and one relay (call it relay#1) as the remote. I confirmed the tunnel is working and followed the rest of the coincashew guide to install Prometheus and Grafana on the two nodes, but replaced the IP address in the relay1 config with the tunnel IP address I set up.

My issue so far is that my Grafana dashboard shows my relay metrics but none for my BP, so I’m assuming I messed up the config somewhere. I was hoping someone could lend me a hand. My prometheus.yml on relay1 (which is where I have Grafana set up) looks like this:

cat > prometheus.yml << EOF
global:
  scrape_interval:     15s # By default, scrape targets every 15 seconds.

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'codelab-monitor'

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label job=<job_name> to any timeseries scraped from this config.
  - job_name: 'prometheus'

    static_configs:
      - targets: ['localhost:9100']  
      - targets: ['10.0.0.1:9100']  #IP address for my BP node in VPN
      - targets: ['10.0.0.1:12798'] #IP address for my BP node in VPN
        labels:
          alias: 'block-producer-node'
          type:  'cardano-node'
      - targets: ['localhost:12798']
        labels:
          alias: 'relaynode1'
          type:  'cardano-node'
EOF
sudo mv prometheus.yml /etc/prometheus/prometheus.yml

Again, any help would be greatly appreciated!
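When one target shows no data, a quick first check is whether the scraping host can reach the exporter ports at all. Below is a small sketch using bash's built-in /dev/tcp with a timeout, so it reports a result instead of hanging. The function names check_port and report are made up for this example; the ports are the ones from the config above.

```shell
#!/usr/bin/env bash

# Return 0 if host:port accepts a TCP connection within 3 seconds.
check_port() {
  local host=$1 port=$2
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Print one status line per exporter port (node-exporter and cardano-node).
report() {
  local host=$1 port
  for port in 9100 12798; do
    if check_port "$host" "$port"; then
      echo "${host}:${port} open"
    else
      echo "${host}:${port} closed or filtered"
    fi
  done
}

# Example, run on the relay against the BP's tunnel address:
#   report 10.0.0.1
```

If both ports come back closed or filtered, the problem is more likely a firewall or listen-address issue on the BP than the prometheus.yml itself. The Prometheus web UI also lists each target with its last scrape error on its /targets page.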

Hi doc_krieger,
I’m curious whether you solved the tunneling issue from your relay to the block producer. I also set up WireGuard and tried a Prometheus job config similar to yours, pointing at the BP node’s metrics ports, 10.0.0.1:9100 and 10.0.0.1:12798.

This does not work, though, because ports 9100 and 12798 are not opened by any UFW allow rule. If you run telnet from the relay node to the BP node, it will just hang.

telnet 10.0.0.1 9100

So, I assume one needs to set up port forwarding in the WireGuard configuration, so that you can tunnel in via the tunnel port (default 51820) and then forward the request to another port, e.g. 9100 or 12798. I’d love to set this up on a rainy day, but I ended up taking an easier approach, which I think is reasonably secure.

On my block producer node, I set a ufw rule to allow access to ports 9100 and 12798 only from the relay node IP specifically. Relay Node 1 is running the Prometheus service to collect the stats.

# On BP node

sudo ufw allow from <RelayNode_IP> to any port 9100
sudo ufw allow from <RelayNode_IP> to any port 12798

Then I changed the Prometheus job config on relay1 to just use the IP address of the BP node:

# On Relay Node 1

      - targets: ['<BP_NODE_IP>:9100']
        labels:
          alias: 'block-producer-node'
          type:  'cardano-node'
      - targets: ['<BP_NODE_IP>:12798']
        labels:
          alias: 'block-producer-node'
          type:  'cardano-node'

This works of course, but I’d prefer the WireGuard with port forwarding approach. If you got that working, I’d love to see a config example.
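One thing that may be worth trying before any port forwarding: WireGuard presents the tunnel as an ordinary network interface, so traffic to 10.0.0.1:9100 arrives on the BP's WireGuard interface and is filtered by ufw like any other incoming traffic. A sketch of the same allow rules scoped to the tunnel follows. The interface name wg0 and the relay tunnel address 10.0.0.2 are assumptions, not values from this thread.

```shell
# On the BP node: accept scrapes only when they arrive through the tunnel.
# wg0 and 10.0.0.2 are assumed; substitute your interface and the relay's tunnel IP.
sudo ufw allow in on wg0 from 10.0.0.2 to any port 9100 proto tcp
sudo ufw allow in on wg0 from 10.0.0.2 to any port 12798 proto tcp
sudo ufw status verbose
```

With rules like these, the Prometheus targets on the relay can stay on the 10.0.0.1 tunnel address, provided the exporters listen on an address that is reachable from the tunnel rather than only on 127.0.0.1.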

Also, I’m curious how people are securing the Grafana service running on relay node 1 (port 3000). In my case, I don’t have a static IP on my local machine, so I set up a free dynamic DNS client on my local computer to get a DNS name. I was hoping to set up a UFW rule on relay node 1 allowing access to port 3000 only from my DNS name, but UFW rules are IP-based. So I set up a Bash script on the relay node that runs as a root cron job every 5 minutes and updates the UFW rule for port 3000 based on the current IP address of my DNS name. It’s kinda messy, but it’s working.

Thanks
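The actual script was not posted, so the following is only a guess at how such a cron job could be structured. The function names, the state file path, and the host name myhome.example.net are all hypothetical. The rule change is first printed as plain ufw commands, so it can be inspected before being applied.

```shell
#!/usr/bin/env bash

# Print the ufw commands needed to move the allow rule from old_ip to new_ip.
# Prints nothing when the address has not changed.
plan_update() {
  local old_ip=$1 new_ip=$2 port=$3
  [ "$old_ip" = "$new_ip" ] && return 0
  if [ -n "$old_ip" ]; then
    echo "ufw delete allow from ${old_ip} to any port ${port} proto tcp"
  fi
  echo "ufw allow from ${new_ip} to any port ${port} proto tcp"
}

# Resolve the DNS name, apply the planned commands, and remember the new IP.
sync_rule() {
  local host=$1 port=$2 state=$3 old_ip new_ip
  new_ip=$(getent ahostsv4 "$host" | awk 'NR==1 {print $1}')
  [ -n "$new_ip" ] || return 1             # lookup failed: leave the old rule alone
  old_ip=$(cat "$state" 2>/dev/null || true)
  plan_update "$old_ip" "$new_ip" "$port" | sh
  echo "$new_ip" > "$state"
}

# Example (as root, e.g. from cron every 5 minutes):
#   sync_rule myhome.example.net 3000 /var/lib/grafana-ufw.ip
```

Keeping the last known address in a state file means the old rule can be deleted by its exact specification, without parsing the output of `ufw status`.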

As an aside, if you already have SSH access and don’t want to open ports, you can use an SSH tunnel to access the Grafana server. It’s a quick solution: no setup and no extra things to manage.

ssh -N -L SOURCE-PORT:127.0.0.1:DESTINATION-PORT -i KEYFILE USERNAME@SERVER-IP

What is KEYFILE? Where do I find this file?

Hello there :)

You should really consider setting up an SSL reverse proxy on your Grafana node. It’s very convenient, as you can access your monitoring dashboards securely from any device (smartphone, laptop, etc.) without needing an SSH tunnel :)

I wrote guides about this (Nginx reverse proxy installation + hardening + Google OAuth authentication):

https://cardano-france-stakepool.org/blog/

Some people generate SSH keys to access their servers and disable password/root logins for increased security. The -i flag in this ssh command takes the path to an identity file (an SSH private key). If you are just using a password to log in, you do not need this flag, and ssh should prompt you for a password. However, I would recommend reading up on generating SSH keys with ssh-keygen and then adding the public key to the .ssh/authorized_keys file on your remote servers.
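As a rough illustration of those steps, run on your local machine: the key file name pool_ed25519 is made up, USERNAME and SERVER-IP are placeholders as in the tunnel command above, and 3000 is the Grafana port mentioned earlier in the thread.

```shell
# 1. Generate a key pair; you will be asked for a passphrase to protect it.
ssh-keygen -t ed25519 -f ~/.ssh/pool_ed25519 -C "pool-admin"

# 2. Append the PUBLIC key to ~/.ssh/authorized_keys on the server.
ssh-copy-id -i ~/.ssh/pool_ed25519.pub USERNAME@SERVER-IP

# 3. Use the PRIVATE key as KEYFILE for the tunnel to Grafana.
ssh -N -L 3000:127.0.0.1:3000 -i ~/.ssh/pool_ed25519 USERNAME@SERVER-IP
# While this runs, Grafana is reachable at http://localhost:3000 locally.
```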

There is a coincashew guide to setting up a stake pool and hardening your server security. I would recommend that most new pool/node operators read it and implement these practices.

I would definitely recommend disabling password based authentication for SSH. Use key based authentication.

These are some recommended lines in your /etc/ssh/sshd_config file:

ChallengeResponseAuthentication no
PasswordAuthentication no
AuthenticationMethods publickey
PubkeyAuthentication yes
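After editing the file, it is worth validating it before reloading, and keeping your current session open until a fresh key-based login has worked. A short sketch follows; the service name ssh is what Debian/Ubuntu use and is an assumption here (other distributions call it sshd).

```shell
sudo sshd -t                  # syntax check; prints nothing if the config is valid
sudo systemctl reload ssh     # apply the change without dropping current sessions
# From a second terminal, confirm that key login works before closing this one.
```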

Then take it a step further and also protect your private key with a hardware token. If you don’t do this then it is possible for someone to copy your private key and then they can own your server. However, if you have your private key on a hardware token then this is not possible. Instead they would now need to break into your house, steal your hardware token, AND then also hammer you to reveal your secret passphrase in order to use the hardware token.

For example, you can get a Gnuk Token. I believe you can also use a Trezor T crypto wallet for GnuPG key storage (but I haven’t tried this option myself). I personally can recommend a Gnuk token.

http://www.fsij.org/doc-gnuk/intro.html#what-s-gnuk

I bought some from the Free Software Foundation some years ago and re-programmed them to run Gnuk instead of NeuG, which they came running by default. These ones:
https://shop.fsf.org/storage-devices/neug-usb-true-random-number-generator

I think people really should be protecting their SSH keys similarly to how they protect their crypto keys. Everyone talks about using hardware wallets like Trezor T for their crypto (which I recommend too) but then they go around using hot keys for their SSH keys. That doesn’t make sense to me.

I could not agree more. I personally use YubiKeys (from Yubico) to store and protect my private SSH keys.

It makes unauthorized SSH access to your servers very, very difficult (as long as you have disabled password authentication, root login, etc.).

I think those are running Gnuk.