[Deployment Architecture] Tigrpool.com — is it safe enough? Please share feedback on the architecture!

Hello everyone,

I am currently investigating how to safely operate a pool with as few security vulnerabilities as possible.
There is one setup which sounds very promising to me, as it encapsulates the block-producing node completely from any external interface, which is the main reason I like this approach the most. But as I am not an expert yet on the cardano-node, I would like to ask you to look over it and share concerns or improvements.
General Assumptions:

  • Producing node and relay node are both managed via Docker (using best practices: run the container only as an unprivileged user, keep the image as small as possible, …)
  • iptables is set up to limit DDoS possibilities and drop invalid requests and spoofing attacks.
  • Producing node binds on 127.0.0.1:3000 (only accessible on the same host)
  • Relay node binds on 0.0.0.0:3001 (accessible from the internet)
  • Producing node and relay node are on the same machine, but isolated with Docker.
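As a sketch, the binding and firewall assumptions above could look roughly like this. The image name `cardano-node-image`, the interface `eth0`, and the rate limits are placeholders, not tested values:

```shell
# Sketch only: illustrates the port bindings and iptables ideas above.
# Image name, interface name, and limits are placeholder assumptions.

# Block producer: published only on the loopback interface
docker run -d --name bp --user cardano --restart unless-stopped \
  -p 127.0.0.1:3000:3000 cardano-node-image

# Relay: published on all interfaces
docker run -d --name relay --user cardano --restart unless-stopped \
  -p 3001:3001 cardano-node-image

# Drop packets in an invalid conntrack state (malformed / out-of-state traffic)
iptables -A INPUT -m conntrack --ctstate INVALID -j DROP
# Drop obvious spoofing: private source ranges arriving on the public interface
iptables -A INPUT -i eth0 -s 10.0.0.0/8 -j DROP
iptables -A INPUT -i eth0 -s 172.16.0.0/12 -j DROP
# Rate-limit new connections to the relay port to blunt simple floods
iptables -A INPUT -p tcp --dport 3001 -m conntrack --ctstate NEW \
  -m limit --limit 50/s --limit-burst 100 -j ACCEPT
iptables -A INPUT -p tcp --dport 3001 -j DROP
```

One caveat: traffic to Docker-published ports is DNAT'ed and filtered in the FORWARD chain (the DOCKER-USER chain), not INPUT, so the relay-port rules may need to live there instead.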

The relay-topology json file would look like this:

{
  "Producers": [
    {
      "addr": "127.0.0.1",
      "port": 3000,
      "valency": 1
    },
    {
      "addr": "192.168.0.1",
      "port": 3001,
      "valency": 1
    },
    {
      "addr": "192.168.0.2",
      "port": 3001,
      "valency": 1
    },
    {
      "addr": "192.168.0.3",
      "port": 3001,
      "valency": 1
    }
  ]
}
The producing-topology would look like this:
{
  "Producers": [
    {
      "addr": "127.0.0.1",
      "port": 3001,
      "valency": 1
    }
  ]
}

For registering the pool I would like to use the following (note: --pool-relay-ipv4 expects an IP address, so for a DNS name the matching flag is --single-host-pool-relay):

cardano-cli shelley stake-pool registration-certificate \
  --cold-verification-key-file cold.vkey \
  --vrf-verification-key-file vrf.vkey \
  --pool-pledge \
  --pool-cost \
  --pool-margin \
  --pool-reward-account-verification-key-file stake.vkey \
  --pool-owner-stake-verification-key-file stake.vkey \
  --mainnet \
  --single-host-pool-relay relay.tigrpool.com \
  --pool-relay-port 3001 \
  --metadata-url … \
  --metadata-hash \
  --out-file …

For the hostname relay.tigrpool.com I would then add three DNS A records:

57.100.100.1
57.100.100.2
57.100.100.3
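In zone-file terms (a sketch; the TTL and zone layout are my own assumptions), the round-robin records could look like:

```
; fragment of the tigrpool.com zone (sketch; 300s TTL is an arbitrary choice)
relay  300  IN  A  57.100.100.1
relay  300  IN  A  57.100.100.2
relay  300  IN  A  57.100.100.3
```

Most resolvers rotate the order of the returned A records, so clients end up spread across the three relays.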

Why do I prefer this setup?
Connecting the block-producing node to multiple relays might be possible, but it would mean I need to open the block node's port to the open internet (which I don't think is a good idea). Isolating the node so it is only reachable by nodes on a whitelist (firewall) is an option, but if the firewall does not work or is misconfigured (unlikely, but possible, even with Docker!) the block node would then be open to the internet (worst case).

Why not connect the block producers and relays within the 192.168.0.0/16 subnet?
It's a possibility, but I don't see a benefit yet. Since Relay 1 through Relay 3 are connected to each other, they should do all the heavy traffic lifting, while the block node only talks to one relay via 127.0.0.1.

If Relay 1 is down, the block node won't be able to operate — so why not use two or more relays?
Yes, that is right, but it depends on why the relay node is down.
If the relay crashes, it is automatically respawned by Docker.
If the relay node is under DDoS, its external interface is blocked and no external or internal traffic will leave the server. (The exception would be if I connected the node via an external IP to other relay nodes, as stated above. But is availability more important than security? I don't think so.)
If the machine's drive breaks (for whatever reason), it is also impossible to keep the node operating, even if multiple relays exist.
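The "automatically respawned by Docker" behaviour mentioned above corresponds to a restart policy. A minimal Compose sketch (service and image names are placeholders):

```yaml
# Sketch: restart policy for the relay container (image name is a placeholder)
services:
  relay1:
    image: cardano-node-image
    restart: unless-stopped   # respawn after a crash, but not after a manual stop
    ports:
      - "3001:3001"
```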

Now my questions:

  1. First of all, do you consider this setup good in terms of security?
  2. Are there any weaknesses which I might have missed?
  3. See the post below with the picture.

Update:

  • I had to rename the .json filenames and some CLI parameters because the forum tags them as links.
  • Question 3 with the picture added here (due to the media limitation for new users).
  3. Would this even work with the cardano-nodes as they are right now? I would register the pools against relay.tigrpool.com. Assume someone in the Daedalus wallet picks block node 1 as the pool he wants to be connected to. The DNS then returns 57.100.100.2 (Relay 2), and to reach the pool on block node 1, the traffic would need to be routed 57.100.100.2 → 192.168.0.1 → 127.0.0.1:3000.
    But are the relays capable of doing that with the given topology files? Does the network have a way to determine how to get to pool 1? If that is the case, it would not matter at registration which relay node (mine or external ones) is submitted, as the network would know how to navigate to the pool. But I don't think we're there yet, and the registered relay must directly know the pool behind it. Right or wrong?

[Architecture diagram attachment: 2021-02-03_17h24_03]

BUMP: still looking for some feedback on setting up the architecture.

Hi Tigr,

I'm by no means a network architect! But let me give you some feedback based on my experience running relays and BP nodes.

Thanks for being so clear on your architecture. That really helps to review it.

I think you should consider what the point of multiple relay nodes is: redundancy.
Since the relays are free to be connected to by external nodes, they are a lot more prone to going down than BP nodes.
I think we have all seen situations where a relay gets stuck in a calculation (constant high CPU consumption) and needs a restart to come back to life.
In your architecture, if a relay goes down or gets stuck, the BP node is offline as well, since it can only communicate through that one relay.

The way the CAPEX architecture runs is basically very comparable to what you are suggesting, except that the BP nodes are shielded from the external world by the firewall, and the ports for the relays are the only ones that are forwarded.

I would advise running the BP nodes with ports open to the internal network, but not forwarded by the firewall/router.
Then connect all relays to all BP nodes and vice versa, and you have redundancy on the parts of your architecture that take the biggest burden.
If you are strapped for hardware, you could run both a relay and a BP node on the same OS instance, but I would give each its own, to make sure they are not dependent on each other.
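In that redundant layout, the producer's topology file would then list all relays on the internal network, for example (a sketch reusing the internal IPs from the original post, not a verified config):

```json
{
  "Producers": [
    { "addr": "192.168.0.1", "port": 3001, "valency": 1 },
    { "addr": "192.168.0.2", "port": 3001, "valency": 1 },
    { "addr": "192.168.0.3", "port": 3001, "valency": 1 }
  ]
}
```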

For the registration of your pools you can also use all three relays, which allows your completed blocks to propagate through the network faster.

Just my thoughts, do with it what you will.


Hi @tigrpool.com !

Looking at your architecture, I think you want to run 3 pools, not 1.
Each pool is represented by one external IP or domain. You are trying to use the same external domain for all 3 pools, which means you would register 3 pools with the same relay domain (which represents the 3 external IPs).

Looking at the diagram from @wouter_CAPEX, this seems to be possible. But I would also agree that it only makes sense if each BP node is connected to multiple relays; otherwise there will be no advantage.

Either completely isolating the environments (in your scenario, 3 × (1 producer + 1 relay)) or doing a redundant setup like the CAPEX sample above seem to me like the options that make more sense.
