Stakepool Operation Tools as a potential risk?

There is evidence that Cardano's consensus mechanism (Ouroboros) is safe against the key blockchain attack vectors (51% attacks, double spending, the bootstrap problem, Sybil attacks).
See https://medium.com/@undersearcher/how-secure-is-cardano-5f1e076be968

So in theory all is good as long as >50% of the stake is managed by honest parties.
But what if parties want to be honest but get compromised without knowing it?

Looking at Pooltool.io's platform version diagram, more than 50% of pools are using CNCLI, for example.
I know this number only reflects the tool that reports the tip, but I would assume that at least this many pools also use other parts of CNCLI and install auto-updates without any verification of what code is included in the update, …

My concrete question: is there a risk of many nodes getting compromised without the operators' knowledge?
E.g. through an auto-update of CNCLI or one of the other tools provided by the Guild Operators.

Don't get me wrong: I'm very thankful for the availability of such tools.
So the key question is whether there is a trustworthy process on the tool maintainers' side that prevents harmful functionality from being included in the scripts. I assume this is already the case (pull requests, code reviews, involvement of IOHK?).

To eliminate remaining risks, just a few questions:

  • Could there be some verification mechanism for updates?
  • Is it possible that a script modifies key parts of the cardano-node installation? E.g. disables the verification that a block was created by the node that was meant to create it?
  • Is there some verification that the node installation has not been modified, like source signing, hash checks or equivalent?
  • Is there something that can be done on the server to reduce the risk? E.g. limit outbound traffic, or run CNCLI under a different user than the node's service user (see the sketch below).
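For the last two points, here is a minimal sketch of what I have in mind (tool version, URL, checksum, user name and ports are placeholders, not recommendations):

```
#!/usr/bin/env bash
set -euo pipefail

# Verify a downloaded tool against a checksum you recorded yourself after reviewing
# that exact release, instead of letting it auto-update. URL and hash are placeholders.
EXPECTED_SHA256="<sha256 you noted for the reviewed release>"
wget -O cncli.tar.gz "https://example.org/cncli-vX.Y.Z-x86_64.tar.gz"
echo "${EXPECTED_SHA256}  cncli.tar.gz" | sha256sum --check -   # aborts if it differs

# Run the tool under its own unprivileged user, separate from the node's service user.
sudo useradd --system --shell /usr/sbin/nologin cncli-runner || true
# sudo -u cncli-runner ./cncli ...   # invoke it as that user

# Limit outbound traffic to what the node actually needs (example with ufw).
sudo ufw default deny outgoing
sudo ufw allow out 3001/tcp   # relay/peer port used in many guides; adjust to your setup
sudo ufw allow out 53/udp     # DNS
sudo ufw allow out 123/udp    # NTP
```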

Thinking of other centrally managed parts of the network, the topology updater also seems to be a risk, e.g. by always providing a list of dishonest servers. But I think this would not work out for the attacker, based on the bootstrapping approach in Ouroboros.

What are your thoughts on this topic?


Hi, you raise a very good point

I think it is up to every pool operator to ensure that they know what they are doing and to realise that the tools are just that: tools.

The tools are there to help with the mundane tasks, but you have to be 100% sure that they are not malicious. For that reason I personally prefer the step-by-step guides that describe which commands do what, rather than a tool that does everything automatically.

The perfect fix for this is elusive: the more people create pools the better (good), but they can create pools that are insecure and, as you say, can be compromised for a 51% attack (bad).

In part, Emurgo and others are solving this by delegating their centralised stakes to community members they consider credible, pushing the stake to trusted members. At the same time, I think the node install process could be enhanced with official helper scripts, which is what the tools you refer to try to compensate for. A Docker container to be run by more novice operators is another option, coincidentally one which IOHK already maintains, but the documentation there could be improved a little.


Yes, this security concern is real. There are scripts that silently auto-update from the HEAD of some arbitrary git branch. If some nasty or simply buggy commit/PR manages to go through, this code will potentially proliferate to a large number of installations that use these auto-update scripts. That may in some cases even give root access - the mother of all nightmares.
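A small sketch of how an operator can avoid that pattern, assuming a hypothetical tools repository (URL and tags are placeholders): pin the checkout to a commit/tag you have actually reviewed, and only move the pin deliberately.

```
# Clone once and pin to a reviewed tag instead of tracking HEAD of a branch.
git clone https://github.com/example/operator-tools.git
cd operator-tools
git checkout v1.2.3          # tag you audited (placeholder)
git rev-parse HEAD           # record the exact commit hash you reviewed

# Before updating later, look at what actually changed first:
git fetch --tags
git log --oneline v1.2.3..v1.3.0
git diff v1.2.3..v1.3.0 -- '*.sh'
```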

There are currently quite a few disconnected pieces at work, i.e.

  • compile the node/cli
  • use a cron/script to do topology updates
  • come up with your own service config
  • use some external monitor process
  • tools that handle registration + update

One possibility to reduce the attack surface would IMHO be to only use runtime components that are issued by IOHK. Docker images are good at this. They are immutable runtime components carefully crafted with all the necessary functionality baked in, such that they are self-sufficient (i.e. no external service, cron, etc. needed), available for a variety of target environments and, perhaps most importantly, released from an official source.
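A minimal sketch of what that could look like, assuming an officially published image (image name, digest and volume paths are placeholders): pinning by digest means the bytes you run cannot silently change, and a read-only root filesystem limits what a compromised process inside the container could modify.

```
docker run -d --name cardano-node \
  --read-only \
  -v node-db:/data/db \
  -v node-ipc:/ipc \
  -v "$PWD/config":/config:ro \
  inputoutput/cardano-node@sha256:<digest-published-with-the-release>
```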

Here is a CIP about this: Provide high quality multiarch docker image and k8s support

Hi!
I think that is a valid concern. These scripts are very handy, but there is no guarantee that they will be maintained forever by the same honest people, or that they will never be compromised in some other way. This is especially important for the scripts (like Topology Updater) that allow you to run an auto-update, which pulls a new version of the script whenever it finds one on the internet and installs it to run on your node. I try as far as possible not to run any external scripts or tools on my nodes, and I have not yet decided on the Topology Updater, since even if you turn the auto-update off it will still pull topology files that are not monitored by you.

It is kind of a tradeoff between making it easy and accessible and keeping it safe. I think that the guides that promote different tools, live views, updating scripts etc. could be clearer to the reader that it is a good idea to read and understand these scripts and check whether they are comfortable with what they do, as well as not to use auto-update for a script from the internet.
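One manual alternative, as a sketch (the fetch URL is whatever the topologyUpdater guide gives you, and the paths are examples): pull the proposed topology into a staging file, look at the peer list yourself, and only then copy it into place instead of letting a cron job overwrite the live file.

```
curl -s "<topology-fetch-url>" -o /tmp/topology-candidate.json
jq '.Producers[] | {addr, port}' /tmp/topology-candidate.json   # eyeball the peer list
# Only if it looks sane:
cp /tmp/topology-candidate.json /opt/cardano/cnode/files/topology.json
```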


@Freja perhaps you’d like to have a look at this. It provides topology updates, monitoring and the leader schedule, carefully reviewed and immutable. Ideally, this would come from the official source (i.e. IOHK) and not just some guy who likes Docker - I’m working on that.


Great work! That would be a better alternative to the different sources of the helpful tools available today, if it can be reviewed and signed by IOHK.

  1. The tools and guides themselves recommend that you keep your keys offline (ideally air-gapped) and never make them available online. Thus, even a compromised script will never be able to leak your keys (see the sketch after this list). 99% of users skip that due to a lack of technical know-how, or laziness to set up offline mirror repositories and manage the transfer of files manually.

  2. The guides ask you to be script aware and set the variables to your liking.

  3. The mentioned scripts now prompt for updates (instead of updating ad hoc, which was to avoid novice users not being able to run wget/curl commands and running into permission issues).

  4. It is heaps worse to trust a Docker/container-based solution from a security point of view (not only do you trust the maintainer of those images, but also the user's knowledge of the security shortcomings of the Docker platform).
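To illustrate point 1 with a minimal sketch (txid, amounts and addresses are placeholders): the signing key only ever exists on the air-gapped machine, so a compromised online node can at most hand you a transaction body to inspect, never the key.

```
# On the online machine: build the unsigned transaction body.
cardano-cli transaction build-raw \
  --tx-in  "<txid>#0" \
  --tx-out "<destination-address>+1000000" \
  --fee 200000 \
  --out-file tx.raw

# Move tx.raw to the air-gapped machine (USB etc.) and sign it there, where payment.skey lives.
cardano-cli transaction sign \
  --tx-body-file tx.raw \
  --signing-key-file payment.skey \
  --mainnet \
  --out-file tx.signed

# Bring only tx.signed back online and submit it.
cardano-cli transaction submit --tx-file tx.signed --mainnet
```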


ad 1) Agree that this is a problem. But even if one does it right, compromising the node is not only a problem when keys are stored there. Couldn't the installation itself be manipulated?

ad 3) Yes, I recognized that, but most users will just install it anyway.

ad 4) Agree. For me Docker was not really an option since I'm not experienced with it. But I also agree with the comments from @tomdx that having some secured/preconfigured setup directly from IOHK might increase security anyway.

No - if keys are kept on cold storage/an air-gapped device as expected, they're not accessible to the node, and the air-gapped device has no internet access either.

That is the crux of your concern, and I agree with this bit :slightly_smiling_face:

IOHK/anyone can only configure security within the image; what happens outside the image on the host is still in the user's hands, and not being aware of Docker's limitations can greatly increase the risks involved.

Essentially, if users follow the procedures and never let their wallet and pool cold keys land on an online server, their exposure to a bad node binary/script is reduced.

Any alternatives like Docker usage or third-party kernel modules will only expose your systems even more.

Do not Trust, Verify!


I was not thinking about exposure of the keys, but rather of manipulating the installation itself to harm the way the node works.

Please elaborate what exactly "harm the way the node works" means :slightly_smiling_face:

If you mean it would "miss a block", that will easily be caught early and would harm the reputation of members who have been in the community for around 4 years, using a codebase that's on GitHub (and is purely bash script). Keys are not present on the relay nodes, so they cannot leak the VRF-based schedule either.

PS: From original thread:

  • No, you cannot just disable block verification on your node and be happy… it will get invalidated immediately by the next node.
  • There will in future be multiple compatible node versions. The node version should NOT be validated, unless you want this to be a single-binary, non-developing chain. The block format is what is validated, along with the leader hash matching the schedule.
  • The only major risk is if users don't have their keys offline; whether you use the IOG holy grail or not, you then expose your setup to much more likely risks.

It's pretty far-fetched, but if you wanted to manipulate the installation, would it not be possible to pull a new version with a modified cardano-cli that makes all transactions created by it go to your own address, and hope that most people don't check their tx.raw properly before signing it on the air-gapped machine? That way you could possibly get quite a few stake pool rewards and other txs sent to you before people catch up.
But as long as scammers can get their money in easier ways, like YouTube giveaways and such, I don't think anyone will bother.
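The mitigation for exactly that scenario is the "check your tx.raw before signing" step: on the air-gapped machine, decode the body and confirm every output address is one you recognise before signing. A sketch (transaction view is available in newer cardano-cli versions; if yours lacks it, inspect the body against what you intended to build):

```
# On the air-gapped machine, before signing:
cardano-cli transaction view --tx-body-file tx.raw   # check output addresses and amounts
```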

Aren't you assuming here that the next node is an honest one? So, thinking of a 51% attack, a majority of other nodes would just accept the invalid block if they are also compromised.

cardano-cli is built from IOG source. If we're expecting the universe to fall apart and everyone from the guild group (who have helped onboard the majority of stake pools without marketing) to go rogue/get hacked, then the same assumptions would apply to the IOG GitHub account too :slightly_smiling_face:

The queries/concerns, in my eyes, are not so much about the scripts themselves as about open-source frameworks in general.

It is PoS, so you don't need 51% of nodes but 51% of blocks, which would be proportional to stake (that would include IOG+Emurgo themselves as some of the biggest stakeholders, along with Binance/other exchanges/third-party wallet providers).

I did not mean that someone would compromise the IOG version, but rather make a clone of the IOG version with a small modification, put it on their own GitHub, and deploy it on the node instead of the IOG version.

And I agree with you, not very likely to happen, but I would not want my nodes compromised even if they do not have access to my private keys.

Sure, as long as the code base of the widely used tools is safe, all is good.
Let's see if we can come to a conclusion:
a) Either run none of the scripts and do everything manually → full control
b) Run the scripts, but only the ones from a trusted source → risk OK, but some additional security measures should be taken, especially intrusion detection (see the sketch below).
c) Run it in Docker with prebuilt configurations from a trusted source (only for those with proper Docker experience)
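For b), a minimal intrusion-detection sketch (paths are examples; a dedicated tool like AIDE or a host IDS does this more thoroughly): keep a hash baseline of the binaries, scripts and system files you care about, and re-check it regularly.

```
# Create the baseline once, from a state you trust.
sudo sha256sum /usr/local/bin/cardano-node /usr/local/bin/cardano-cli \
     /usr/local/bin/cncli /etc/hosts /etc/systemd/system/cnode.service \
  | sudo tee /root/integrity-baseline.sha256 > /dev/null

# Re-check later (e.g. from cron) and alert on any mismatch.
sudo sha256sum --check --quiet /root/integrity-baseline.sha256 || echo "INTEGRITY MISMATCH"
```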

I'm aware of that, but the hope is that >50% is at some point community-controlled.

I don't see how we can reach conclusions without validating and quantifying your assumptions about the risks, but all good… I will retire from this convo.

Maybe a scenario like this:
One of the scripts just creates a hosts entry so that the next update is downloaded not from GitHub but from a malicious GitHub clone… Then, at the time of the next update, all users download the manipulated code at the same time.

→ Intrusion detection should identify the change in that case (a minimal check is sketched below).
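A tiny sketch of such a check: a hosts-file override of github.com is exactly the kind of change the hash baseline above would flag, and you can also test for it directly.

```
grep -i "github" /etc/hosts && echo "WARNING: github.com is overridden in /etc/hosts"
getent hosts github.com     # shows what the machine would actually resolve
```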