Why hide block producing nodes?

What is the security rationale to hide block producers behind relays as described here? I don’t see the benefit because both relay and block producer are running the same software, so if someone can take over a relay they could also take over the producer from there using the same exploit. What am I overlooking?

Hi @waldmops,

I hope this finds you well. I am not a security expert, but I interpret the relays as an extra safety net, a firewall of sorts. If I follow your reasoning correctly, you are asking: why bother? I guess you could extend the same reasoning to any precaution or safety measure. If your block producer were open to the wider network, attackers wouldn’t have to go through your relays to wreak havoc; they could come straight in.

Further, I am not sure I can imagine what a configuration would look like without relays. What would your BP connect to instead?

Please don’t take my comments as something to go by. Hopefully more people will see this topic and add comments that are more informative (and informed) than mine.

All the best,

A

Well, a block producer is a full node just like any other, so it would connect to a handful of peers.

Fair enough, but not necessarily peers controlled by you or that you can trust. How does that protect your node better than having relays?

That’s why your BP should stay connected only to your relays. Don’t run the topology updater on the BP; configure only your relays in its topology.json, accept traffic only from your relays, and so on.

The idea is that your BP node only talks to your relay nodes, so it’s not exposed to any other traffic, keeping it safe from DDOS and other attacks.
Your relays basically act as DMZ/proxy https://www.barracuda.com/glossary/dmz-network
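
To make the "BP only talks to your relays" idea concrete, here is a minimal sketch that generates a topology file for the block producer. It assumes the legacy (non-P2P) topology format with a `Producers` list of `addr`/`port`/`valency` entries, so check it against the documentation for your node version. The relay addresses and the helper name `build_bp_topology` are made up for illustration.

```python
import json


def build_bp_topology(relays):
    """Build a topology dict that lists only the operator's own relays.

    Assumes the legacy topology layout ("Producers" entries with
    addr/port/valency); this is an assumption, not taken from the thread.
    """
    if not relays:
        raise ValueError("a block producer needs at least one relay to reach the network")
    return {
        "Producers": [
            {"addr": addr, "port": port, "valency": 1}
            for addr, port in relays
        ]
    }


# Hypothetical private addresses of two relays on the same LAN as the BP.
MY_RELAYS = [("10.0.0.11", 3001), ("10.0.0.12", 3001)]

if __name__ == "__main__":
    # Print the JSON that would be saved as the BP's topology.json.
    print(json.dumps(build_bp_topology(MY_RELAYS), indent=2))
```

The point of the sketch is only that no public peer appears in the list: the BP learns about the rest of the network exclusively through the relays.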


How so? If my relays are under DDOS attack, the producer is cut off from the network as well. What are some of the “other attacks” that we should worry about?

Here is what I considered so far:

  • DDOS against the pool. Hiding the producer does not help against that
  • Taking over the node through the hoster (via VNC, serial console or similar). Hiding the producer doesn’t help either because it can be taken over directly
  • Potential RCE in cardano-node. Attacker would take over the relay first and then the producer

  1. More network traffic is required to DDoS multiple relays.
  2. One could run one or two more relays than published, which would have to be discovered first before they could be DDoSed too.

Good point. I hadn’t thought about that.

It’s very likely that someone capable of DDoSing one of your relays is also capable of DDoSing all of them. I agree with @waldmops that DDoS isn’t something you make go away by setting up relay nodes.
The only way to deal with DDoS is to stop it via something like Cloudflare and/or dedicated hardware that drops such traffic before it reaches your relay.

Also keep in mind that a real DDoS might continue over multiple days; if you register new relay nodes, they will get attacked too.

@waldmops: One reason I can think of is that a BP node has certificates registered, which you do not want stolen. By setting up a dedicated relay (e.g. with the help of Docker), you might be able to reduce attack vectors and make a breach less likely.

Has anyone looked into that already? I would be interested to hear about CDN setups, and also whether someone has already used a WAF in front of a relay.

Actually, the initial question also bothered me. In typical web deployments you have something like an application server and a web server. The application server is a bigger beast and might carry considerable risk through its many potential attack points. For that reason (and also for caching, …) a web server is put in front of it to forward only the required traffic (acting as a reverse proxy).

But here we are talking about a server which exposes exactly the same interface. Why does this make the setup more secure? Couldn’t a simple web server do the same or even a better job?

Two things seem relevant to me:

  1. Separation of concerns. The BP takes care of block creation only; the relay handles all external communication. This could theoretically be an advantage if communication became a bottleneck, but I do not think that is the case.
  2. Complexity of an attack. An attacker who takes over a relay needs another hop to attack your BP. Still, if he was successful on the relay (say he somehow got root there), there is a high chance he can repeat the same attack on the BP. With intrusion detection on the relay, we could hopefully spot the breach fast enough.

Hello @Markus-VITAL:

I am not a Cardano dev, hence I can only guess.

I think they chose to use the same node as the “relay” to enforce that people use it that way and don’t expose their BP nodes directly. Maybe it is just a pragmatic way of dealing with it.

Under the assumption that both nodes do the exact same thing, just with different configs (which is something I am not sure about), it would be better to put a less error-prone, security-audited reverse proxy in front, instead of having the relay node face the internet. The underlying assumption is that such a proxy has fewer vulnerabilities and is well tested by a huge number of people, compared to the relay node, which has only around 1.6k installations.

It would be wise to only allow trusted relay buddies and close all the other inbound rules on the relay node.

I have my BP node on a private NAT. Only servers in the same network can communicate with it, and I maintain granular ACLs through Security Groups (AWS).
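
The "only trusted sources may connect" rule behind such ACLs can be sketched in a few lines. This is not how AWS Security Groups are configured; it is just an illustration of the allowlist logic, using Python's standard `ipaddress` module. The addresses and the names `RELAY_ALLOWLIST` and `is_allowed` are assumptions made up for the example.

```python
import ipaddress

# Hypothetical allowlist: the private addresses of the operator's own relays.
RELAY_ALLOWLIST = ["10.0.0.11/32", "10.0.0.12/32"]


def is_allowed(source_ip, allowed_networks):
    """Return True if source_ip falls inside any of the allowed networks."""
    src = ipaddress.ip_address(source_ip)
    return any(src in ipaddress.ip_network(net) for net in allowed_networks)


if __name__ == "__main__":
    for ip in ("10.0.0.11", "203.0.113.7"):
        verdict = "accept" if is_allowed(ip, RELAY_ALLOWLIST) else "drop"
        print(ip, verdict)
```

Everything not explicitly listed is dropped, which is the default-deny behaviour you want in front of a BP.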

Alex, do you need an internet connection at all for the BP? Can it be connected only via LAN (private IP) to the relay nodes?

Of course, it can be connected via LAN only.

Hey, I’m wondering whether you ever got any information about implementing a WAF in front of a relay, or even Cloudflare?

Hi! No, I did not find anyone so far who operates a WAF or CDN based DDOS protection.
As an alternative, some operators run an additional private relay which is not announced to the network. In case of a DDoS, this node probably would not be exposed to the attack and would keep your validator connected.

To compare:
a) Additional node, not announced to the network → simple, but adds the cost of another server.
b) WAF → hard to achieve, because there is probably no existing training set for a WAF so far, and I’m not sure it works well with the CNODE communication.
c) Cloudflare → simply dropping traffic from sources that cause a high number of requests would actually be much easier and would not add much cost. This can also be done manually at the firewall level.
d) Firewall that only allows traffic from nodes you trust → probably not a long-term solution, because P2P from Cardano itself is coming and will require flexibility regarding who is allowed to connect to you.
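
To illustrate option c), here is a rough sketch of "drop sources that cause a high number of requests" as a sliding-window counter per source IP. It is not Cloudflare's algorithm and not a firewall rule, just the underlying idea; the class name, thresholds and addresses are all made up for the example.

```python
from collections import defaultdict, deque


class ConnectionRateLimiter:
    """Allow at most max_events per source within a sliding time window."""

    def __init__(self, max_events, window_seconds):
        self.max_events = max_events
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # source IP -> timestamps of accepted events

    def allow(self, source_ip, now):
        """Return True if the event at time `now` (seconds) should be accepted."""
        events = self._events[source_ip]
        # Forget events that have fallen out of the window.
        while events and now - events[0] >= self.window_seconds:
            events.popleft()
        if len(events) >= self.max_events:
            return False
        events.append(now)
        return True


if __name__ == "__main__":
    limiter = ConnectionRateLimiter(max_events=3, window_seconds=10.0)
    for t in (0.0, 1.0, 2.0, 3.0, 10.0):
        print(t, limiter.allow("198.51.100.9", t))
```

Note that this only helps against a few noisy sources. A volumetric DDoS that saturates the link has to be filtered upstream, before the traffic reaches the relay.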

That’s an interesting approach, keeping a passive relay on standby. I actually plan to do that with a load balancer. Did you happen to explore that architecture?
I’m looking to monitor a few metrics on the existing relays and trigger a new relay coming online as needed (pre-configured, just needs to be booted up, with an already frequently synced db/chain).
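
The trigger logic for such a setup could look roughly like the sketch below. The metric names (`reachable`, `peers`, `blocks_behind`), the thresholds and the function name are assumptions chosen for illustration; how you collect the metrics and how you actually boot the standby instance depends on your monitoring stack and hoster.

```python
from dataclasses import dataclass


@dataclass
class RelayMetrics:
    name: str
    reachable: bool      # did the health check get an answer?
    peers: int           # number of connected peers
    blocks_behind: int   # distance to the chain tip, in blocks


def should_start_standby(metrics, min_healthy=1, min_peers=2, max_blocks_behind=10):
    """Return True when fewer than min_healthy relays look healthy."""
    healthy = [
        m for m in metrics
        if m.reachable and m.peers >= min_peers and m.blocks_behind <= max_blocks_behind
    ]
    return len(healthy) < min_healthy


if __name__ == "__main__":
    snapshot = [
        RelayMetrics("relay-1", reachable=False, peers=0, blocks_behind=0),
        RelayMetrics("relay-2", reachable=True, peers=1, blocks_behind=40),
    ]
    if should_start_standby(snapshot):
        print("no healthy relay left: boot the standby relay")
```

In practice you would probably require the condition to hold for several consecutive checks, so that a single failed probe does not start a new instance.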

Could be passive, could also be active.
If passive, another option would be to take a daily snapshot and launch it from there, with some sync latency of course.
I’m not running this approach for the Relays at the moment.
On the validators I’m running a hot standby which is always synced and configured as a relay, but which also has the files required to run as a validator. So I have it synced and can just restart it as a validator when needed. I’m doing exactly that for the 1.26.1 migration right now.

Some told me this is overkill, but I like to have one spare instance.
If I go for the private relay approach above, I will also use an active private relay. In my hosting plan a stopped instance does not save money, so an active one just makes more sense.