Why hide block producing nodes?

waldmops · 8 December 2020 15:04

What is the security rationale to hide block producers behind relays as described here? I don’t see the benefit because both relay and block producer are running the same software, so if someone can take over a relay they could also take over the producer from there using the same exploit. What am I overlooking?

Adrem · 10 December 2020 01:48

Hi @waldmops,

I hope this finds you well. I am not a security expert, but I interpret the relays as an extra safety net, a firewall of sorts. If I follow your reasoning correctly, you are asking: why bother? I guess you could extend the same reasoning to any precaution or safety measure. If your block producer were open to the wider network, attackers wouldn’t have to go through your relays to wreak havoc; they could come straight in.

Further, I am not sure I can imagine what a configuration would look like without relays. What would your BP connect to instead?

Please don’t take my comments as something to go by, hopefully more people will see this topic and put in comments that may be more informative (and informed) than mine.

All the best,

A

waldmops · 10 December 2020 11:27

Well, a block producer is a full node just like any other, so it would connect to a handful of peers.

Adrem · 10 December 2020 11:42

Fair enough, but not necessarily peers controlled by you or that you can trust. How does that protect your node better than having relays?

Alexd1985 · 10 December 2020 14:59

That’s why your BP should stay connected only to your relays. Don’t run the topology updater on the BP, configure only your relays in its topology.json, accept only traffic from your relays, etc.
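
As a rough illustration of the point above, here is a minimal sketch that builds a relay-only topology for the block producer. It assumes the legacy "Producers" layout of topology.json; the relay addresses and the port are hypothetical placeholders, so adapt them to your own setup.

```python
import json

# Hypothetical private addresses of your own relays (placeholders only).
RELAYS = [("10.0.0.11", 3001), ("10.0.0.12", 3001)]


def bp_topology(relays):
    """Build a topology dict that lists only the operator's own relays."""
    return {
        "Producers": [
            {"addr": addr, "port": port, "valency": 1}
            for addr, port in relays
        ]
    }


topology = bp_topology(RELAYS)

# Print the JSON you would save as the block producer's topology.json.
print(json.dumps(topology, indent=2))
```

The block producer then dials nothing but the listed relays; inbound traffic still has to be restricted separately, for example at the firewall.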

Evgeny_S · 10 December 2020 15:17

The idea is that your BP node only talks to your relay nodes, so it’s not exposed to any other traffic, keeping it safe from DDOS and other attacks.
Your relays basically act as a DMZ/proxy: https://www.barracuda.com/glossary/dmz-network

waldmops · 10 December 2020 18:12

How so? If my relays are under DDOS attack, the producer is cut off from the network as well. What are some of the “other attacks” that we should worry about?

Here is what I considered so far:

DDOS against the pool. Hiding the producer does not help against that.
Taking over the node through the hosting provider (via VNC, serial console or similar). Hiding the producer doesn’t help either, because it can be taken over directly.
Potential RCE in cardano-node. The attacker would take over the relay first and then the producer.

hanswurst · 10 December 2020 20:04

An attacker needs more network traffic to DDoS multiple relays.
You could also run one or two more relays than you publish; the attacker would have to discover them first before being able to DDoS them as well.

waldmops · 10 December 2020 20:19

Good point. I hadn’t thought about that.

tigrpool.com · 3 February 2021 22:04

It’s very likely that someone who has the capability to DDoS one of your relays is also capable of DDoSing all of them. I agree with @waldmops that DDoS isn’t something you make go away by setting up relay nodes.
The only way to deal with DDoS is to stop it via something like Cloudflare and/or use of dedicated hardware which shuts off such traffic before it reaches your relay.

Also keep in mind that a real DDoS might continue over multiple days. If you register new relay nodes, they will get attacked too.

@waldmops: One reason I can think of is that a BP node holds the registered certificates, which you do not want to be stolen. By setting up a dedicated relay (e.g. with the help of Docker), you might be able to reduce attack vectors and make a breach less likely.

Markus-VITAL · 22 February 2021 12:24

Has anyone looked into that already? I would be interested to hear about CDN setups, and also whether anyone has already used a WAF in front of a relay.

Actually the initial questions also bothered me. So in typical web deployments you have something like an Application server and a Web Server. The Application Server is a bigger beast and might expose quite some risk through a lot of potential attack points. For that reason (and also caching, …) a webserver is put in front of it to only forward required traffic (acting as a reverse proxy).

But here we are talking about a server which exposes exactly the same interface. Why does this make the setup more secure? Couldn’t a simple web server do the same or even a better job?

Two things that seem relevant to me:

Splitting concerns: The BP takes care of block creation only. The relay takes over all communication. This could theoretically be an advantage if the communication created a bottleneck, but I do not think that is the case.
Complexity of an attack: If someone takes over a relay, he needs another hop to attack your BP. Still, if he was successful on the relay, there is a high chance that he can repeat the same attack (if he somehow got root on the relay). If there were some intrusion detection on the relay, we could hopefully identify the breach fast enough.

tigrpool.com · 24 February 2021 10:40

Hello @Markus-VITAL:

I am not a Cardano dev, hence I can only guess.

I think they chose to use the same node as “relay” to enforce that people use it in that way and don’t expose their BP nodes directly. Maybe it is just a pragmatic approach to dealing with it.

Under the assumption that both nodes do the exact same thing, just with different configs (which is something I am not sure about), it would be better to have a robust, security-audited reverse proxy facing the internet instead of the relay node. The underlying assumption would be that such a proxy has fewer vulnerabilities and is well tested by a huge number of people, compared to the relay node, which has only around 1.6k installations.

ultron001 · 19 March 2021 19:42

It would be wise to only allow trusted relay buddies and close all the other inbound rules on the relay node.

I have my BP node on a private NAT. Only servers in the same network can communicate with it, and I maintain granular ACLs through Security Groups (AWS).
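
To make the allowlist idea concrete, here is a minimal sketch of the check such an ACL performs, written with Python’s standard ipaddress module. The relay addresses are hypothetical placeholders, and a real setup would enforce this in the firewall or security group rather than in application code.

```python
import ipaddress

# Hypothetical allowlist: private addresses of the operator's own relays.
ALLOWED_SOURCES = [
    ipaddress.ip_network("10.0.0.11/32"),
    ipaddress.ip_network("10.0.0.12/32"),
]


def is_allowed(source_ip, allowed=ALLOWED_SOURCES):
    """Return True if source_ip falls inside any allowed network."""
    ip = ipaddress.ip_address(source_ip)
    return any(ip in net for net in allowed)


# Only the listed relays may reach the block producer's node port.
print(is_allowed("10.0.0.11"))
print(is_allowed("203.0.113.7"))
```

Everything that is not explicitly listed is rejected, which is the default-deny behaviour you want for inbound traffic to a block producer.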

razzi · 28 March 2021 02:04

Alex, do you need an internet connection at all for the BP? Can it be connected only via LAN (private IP) to the relay nodes?

Alexd1985 · 28 March 2021 04:42

Of course, it can be connected only via LAN.

bettr · 7 April 2021 09:11

Hey, I’m wondering if you ever got any information about implementing a WAF in front of a relay? Or even Cloudflare?

Markus-VITAL · 7 April 2021 10:50

Hi! No, I have not found anyone so far who operates WAF- or CDN-based DDOS protection.
As an alternative, some operators run an additional private relay which is not announced to the network. In case of a DDOS, this node would probably not be exposed to the attack and would keep your validator connected.

To compare:
a) Additional node, not announced to the network → simple, but adds the cost of another server.
b) WAF → hard to achieve because there is probably no existing training set for a WAF so far, and I’m not sure it works well with the CNODE communication.
c) Cloudflare → having Cloudflare simply drop traffic that causes a high number of requests would actually be much easier and would not add much cost. This can also be done manually at the firewall level.
d) Firewall that only allows traffic from nodes you trust. Probably not a long-term solution, because P2P from Cardano itself is coming and will also require flexibility regarding who is allowed to connect to you.

bettr · 8 April 2021 07:57

That’s an interesting approach, keeping a passive relay on standby. I actually plan to do that with a load balancer. Did you happen to explore that architecture?
I’m looking to monitor a few metrics on the existing relays and trigger a new relay to come online as needed (pre-configured, it just needs to be booted up, with a frequently synced db/chain).
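
A rough sketch of that trigger logic, assuming each relay reports an incoming connection count and a CPU percentage. The metric names, the thresholds and the min_healthy parameter are all hypothetical; how the metrics are collected and how the standby is actually booted depends on your hosting setup.

```python
from dataclasses import dataclass


@dataclass
class RelayMetrics:
    name: str
    incoming_connections: int
    cpu_percent: float


# Hypothetical thresholds; tune them to your own relays.
MAX_CONNECTIONS = 500
MAX_CPU_PERCENT = 85.0


def is_overloaded(m):
    """A relay counts as overloaded if either metric exceeds its threshold."""
    return (
        m.incoming_connections > MAX_CONNECTIONS
        or m.cpu_percent > MAX_CPU_PERCENT
    )


def should_start_standby(metrics, min_healthy=1):
    """Boot the standby relay when fewer than min_healthy relays are healthy."""
    healthy = [m for m in metrics if not is_overloaded(m)]
    return len(healthy) < min_healthy


relays = [
    RelayMetrics("relay1", incoming_connections=900, cpu_percent=40.0),
    RelayMetrics("relay2", incoming_connections=120, cpu_percent=30.0),
]
print(should_start_standby(relays))
```

With min_healthy=1 the standby only boots once every active relay is overloaded; raising it makes the trigger more eager.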

Markus-VITAL · 8 April 2021 08:54

Could be passive, could also be active.
If passive, another option would be to take a daily snapshot and launch the relay from it, with some sync latency of course.
I’m not running this approach for the relays at the moment.
For the validators I’m running a hot standby that is always synced and configured as a relay, but that also has the required files to run as a validator. So I have it synced and can simply restart it as a validator when needed. I’m doing just that for the 1.26.1 migration right now.

Some told me this is overkill, but I like to have one spare instance.
If I go for the private relay approach above, I will also use an active private relay. In my hosting plan a stopped instance does not save money, so an active one just makes more sense.
