Determinism of HD wallet addresses and their recoverability from passphrase

So I was playing around with the Daedalus wallet - creating a new wallet, generating a couple of addresses without making any transaction, deleting the wallet and restoring it from the backup file. Of course, the previously generated addresses did not appear since they were empty and no blockchain record was associated with them. But out of curiosity I tried generating them again and I realized that they were totally different, which means that new addresses (i.e. their chain codes) are generated randomly.

So my question is: is there, from an algorithmic point of view, any efficient way to recover my wallet (if it had money in it) from the mnemonic or the backup file?

The only way that I know of (the one currently documented and implemented in Daedalus, see the “Import” section of https://cardanodocs.com/technical/hd-wallets/) is, if I understand it correctly, to iterate over all the unspent outputs in the blockchain. That is probably not a big number yet, but it already takes a significant amount of time. The format of the addresses as documented (https://cardanodocs.com/cardano/addresses/), as far as I understand it, does not support any way to be efficiently indexed: to verify if an address is yours, you have to take the chain code (derivation path) from the address, try to derive an address with it from your root key and check for equality. So I’m a bit skeptical about how this problem could be addressed in the future, if my assumptions are right.

I understand that generating the addresses randomly is probably meant as additional safety, but I find it quite discouraging to have a wallet that, over time, would become virtually unrecoverable because I would have to check every single address out there to see whether it is mine. I would be glad if anybody could disprove my concerns and explain to me what I am missing.

If my concerns are justified, are there any plans for wallets with truly deterministically derived addresses, so that they can be easily recovered just by generating the address chain again and checking the blockchain records associated with them?

This is a tidbit from the weekly technical report released Feb 1

A team member is to implement a solution for a much faster procedure to restore an HD-wallet from a seed.

No information though on how they are going about it.

Thanks for your answer, @regsanman. I know that they are working on making the recovery faster, but I would like to understand how fast recovery is achievable and sustainable, given the design of the addresses and the way the addresses are currently derived from the root key.

Your wallet(s) reside on your local machine, encrypted, as do the name(s) and passwords you may have added to send value out.

“Wallet” folder location:

Mac

~/Library/Application Support/Daedalus/

Win

C:\Users\"username"\AppData\Roaming\Daedalus

Backup that “Wallet-” folder and you can recover without having to restore.
I keep mine on an encrypted $5 Raspberry Pi Zero.

The only “address” you need is your passphrase, ground zero for all the addresses you have generated.

Let’s try an experiment:

I will be utilizing the official Cardano Blockchain Explorer to demonstrate.

In Daedalus I generate a new address:
I can tell unused addresses from used ones because the unused ones are brighter in color.

This is an unused address:

DdzFFzCqrhtB4R3hvMSTLYrzaKcfHeKr6DYY9ZiYKUW8w6Z4awk7DQvgfW1DJ2iLNjFKNTLPVZRC8npRM3PQhkxumqh1A8VcSuf1u7Bj

I check that address in the Cardano Blockchain Explorer

Yep, unused.

That unused address will now and forever point to my wallet!

If I ever have to restore my wallet, only addresses that have had activity will be displayed.
Anytime in the future, if an unused address registers transactions, that address will appear in my wallet, as will the activity.

The takeaway: your wallet is displaying the data that lives now and forever on the Cardano blockchain. Daedalus is like a mirror reflecting back your blockchain links.

Cheers,

@Chainomatic Yes, from the user’s perspective what you say totally makes sense, but this question was from a developer’s perspective (hence the “Developers” tag of the question).

Maybe to better illustrate my point: imagine there are 50 billion transactions and 5 billion addresses in the whole blockchain and you are recovering from your 12-word passphrase. The passphrase serves to generate a root address which is the “mother” of all subsequent addresses. If the subsequently generated addresses were always the same, bingo: you would just have to regenerate that sequence of addresses, check whether any activity was recorded for each of them, and stop when you hit a big “window” of, let’s say, 100 unused addresses in a row. That’s basically how Bitcoin wallet recovery works, by the way (as I understand it; feel free to correct me). This process would be quite fast, because I doubt that any normal account would have more than a few thousand addresses, and you can probably index addresses easily in a database, i.e. you can search fast by the address itself.

But what’s happening with Cardano is that your addresses are generated randomly, so when you perform the recovery you cannot just regenerate the chain of addresses from the starting point. You have only the root address, the “mother”; the transactions related to your account in the blockchain probably involve totally different addresses, which you cannot efficiently reproduce because you cannot replicate the randomness used for your account. So when you are recovering, the only viable solution, as I understand it, is to run through basically all the transaction outputs that are still unspent (say 10 billion) and check whether the addresses in them were “generable” from your “mother” address. You can verify that because the addresses carry the random parameter you are missing, encoded in them. But processing a few billion addresses can take a really long time, because it isn’t a simple iteration: for each of them you have to perform some hashing and encoding, which isn’t as fast as just comparing two strings.
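To make the cost argument above concrete, here is a minimal sketch of a scan-based recovery. It is a toy model, not the cardano-sl implementation: `path_key`, `xor_stream`, `derive_child`, `new_address`, `is_mine` and `recover_by_scan` are hypothetical names, the XOR keystream merely stands in for the symmetric encryption of the derivation path, and the HMAC stands in for real HD key derivation. The only point it illustrates is that every unspent output costs one decrypt-and-derive step, so the work grows with the size of the UTXO set rather than with the size of your own wallet.

```python
import hashlib
import hmac
import os

def path_key(root_pub: bytes) -> bytes:
    # Key that hides the derivation path, derived from the root public key.
    return hashlib.sha512(root_pub).digest()[:32]

def xor_stream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Toy stream cipher (assumption, NOT the real scheme): XOR with a SHAKE-256 keystream.
    stream = hashlib.shake_256(key + nonce).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))

def derive_child(root_pub: bytes, path: bytes) -> bytes:
    # Toy child derivation (assumption, NOT real HD key derivation).
    return hmac.new(root_pub, path, hashlib.sha256).digest()

def new_address(root_pub: bytes) -> bytes:
    # A random derivation path is chosen and stored, hidden, inside the address.
    path = os.urandom(8)
    nonce = os.urandom(8)
    hidden = xor_stream(path_key(root_pub), nonce, path)
    return derive_child(root_pub, path) + nonce + hidden

def is_mine(root_pub: bytes, address: bytes) -> bool:
    # Unhide the path with our own key, re-derive, and compare.
    child, nonce, hidden = address[:32], address[32:40], address[40:]
    path = xor_stream(path_key(root_pub), nonce, hidden)
    return hmac.compare_digest(derive_child(root_pub, path), child)

def recover_by_scan(root_pub: bytes, utxo):
    # Every unspent output has to be checked: work is linear in len(utxo).
    checks, balance = 0, 0
    for address, amount in utxo:
        checks += 1
        if is_mine(root_pub, address):
            balance += amount
    return balance, checks

my_root, other_root = b"my toy root key", b"someone else's root key"
utxo = [(new_address(other_root), 7) for _ in range(1000)]
utxo += [(new_address(my_root), 10), (new_address(my_root), 32)]
balance, checks = recover_by_scan(my_root, utxo)
print(balance, checks)
```

In this toy run the wallet owns two outputs, yet the scan still performs one check per entry in `utxo`.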

So my question was whether the current design of the addresses, the process of their generation, and the Cardano blockchain allow any more efficient solution. If not, one implication would be that a lightweight wallet (imagine a mobile application where you can manage your Daedalus wallet) would probably not support wallet recovery, at least not for the current wallet addresses. Even syncing the wallet balance after being offline for a longer time would be a pain, because such a wallet would have to store the complete blockchain to be even remotely time- and data-efficient, since you cannot throw billions of requests at a blockchain API over the internet. I know it is maybe too soon to ask, but I would like to know the limitations of the Cardano currency, if there are any, before investing in it.

I got that, which is why I gave the example and explanation I did.

Think about how Daedalus currently restores. Now add to that what we covered earlier: an unused wallet address will now and forever point to an individual wallet.

Yet, when one restores with the current version of Daedalus, the unused addresses are not brought back and displayed.

Every address has an origin.

We do not want predictable addresses:

If I can predict the address generated I can do very bad things.

Cardano is a 100% efficient blockchain.

Stay with the basics: an unused wallet address will now and forever point to an individual wallet.

I did not pick the example I used willy-nilly, I posted the steps to help you and anyone else who finds this post.

Generated send address will now and forever point to an individual wallet.
All individual wallets have a root. Daedalus asks only for the passphrase to regenerate all transactions and recover an individual wallet.

If I can be so bold, an analysis is not an observation; observe what we have covered here.

Backwards looking, all receive addresses point to a wallet, all passphrases point to a wallet, Daedalus mirrors and displays this reality, currently omitting unused addresses.

Observe: I have a generated receive address (used or unused doesn’t matter), and I also have a passphrase. Can I recover faster now?

How about that @Rafael_Korbas, totally secure random address generation, 100% order, and all we had to do was observe what we have in front of us.

Shortest distance between two points, first we need the points {;–)

@Chainomatic OK, but I also tried generating addresses on an offline computer (using Postman and querying the internal cardano-sl API). I then copied an address that had never seen the network into the blockchain explorer (https://cardanoexplorer.com), and it displayed it as an existing empty one. That is proof for me that the blockchain does not store unused addresses at all, but it can still validate them, and any valid address is considered empty by default. You can try it yourself by launching Daedalus, disconnecting the computer from the internet, and sending a POST request to https://127.0.0.1:8090/api/address with the account id in the body of the request. It will return a new address for your wallet even if your PC is offline and the Daedalus wallet is trying to connect to the network. Here is the wallet API documentation: https://cardanodocs.com/technical/wallet/api/
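For anyone who prefers a script to Postman, the request described above could be put together roughly like this. It is only a sketch that builds the request without sending it. The URL is the one quoted in the post; encoding the account id as a JSON string is an assumption, the account id value is a made-up placeholder, and `build_new_address_request` is a hypothetical helper name. Actually sending it would additionally need whatever TLS setup the local node expects, which is not shown here.

```python
import json
import urllib.request

# Endpoint as quoted in the post above.
API_URL = "https://127.0.0.1:8090/api/address"

def build_new_address_request(account_id: str) -> urllib.request.Request:
    # Assumption: the body is the account id encoded as a JSON string.
    body = json.dumps(account_id).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )

# Placeholder account id, not a real one.
req = build_new_address_request("<wallet-id>@<account-index>")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would perform the call against a running local node.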

So it isn’t that your newly generated address starts pointing to a wallet from a certain moment on. It always has; you just generated it and put it into your pool of displayed addresses. But never mind, this is just a technical detail.

The only thing that concerns me is that I don’t see any efficient (i.e. sublinear in the total number of transactions) way of finding your own transactions in the blockchain if you know just the root key. All I want is a satisfactory explanation, or at least an outline of an efficient algorithm for recovering the wallet addresses from the mnemonic or the root key, which is effectively the same thing.

“Generated send address will now and forever point to an individual wallet.”

Are you saying that there is no anonymity at all, because from the addresses you can deduce all the wallets they belong to?

@cobblybear - it does not actually point in the sense of being able to derive the key from the address. So your public addresses are safe. But having the pair (root key, address), you can still verify whether the address was generated from that root key or not, so in that sense the address is indeed “pointing” to your root key. At least this is how I understand it. But I still think that Chainomatic’s response does not answer my question, so that’s why I am still asking.

https://cardanodocs.com/cardano/addresses/ - address format
https://cardanodocs.com/technical/hd-wallets/ - derivability of keys

That’s what I thought. However, that is not what he said, that is why I questioned it. I understand your question, and am interested in an answer as well. I guess the best answer we have right now is that it will take time and maturing to figure out what the answer is. Nobody else has it either.

@Rafael_Korbas I’m not sure I understand the question (sorry, I didn’t read everything thoroughly).
If the addresses were generated in the same sequence, you would still need to check every possible address (because you can generate an address and not use it), so I don’t understand how it could help in any way.

As far as I know, when you want to recover a wallet, in any blockchain, you have to read every transaction and check whether it’s yours. It’s no different with ada than it is with Bitcoin.

You gave a link to the HD wallet docs, where I can read:

The resulting object is serialized and encrypted with symmetric scheme (ChaChaPoly1305 algorithm) with the passphrase computed as SHA-512 hash of the root public key. This will not allow an adversary to map all addresses on the chain to their root as long as we do not actually store any funds on the root key (which is not forced by consensus rules, rather by UI).

I’m not sure about everything, but using a symmetric scheme means that if I can generate new addresses from a root address, I can also recognize whether an address is mine if I have my root address.
So if I have the passphrase, I have my root address, meaning I just have to read every transaction and check whether it’s mine. I doubt the algorithm takes significant time compared with getting and reading the transactions.

If the sequence were deterministic, you could iterate just over your own sequence of addresses and ask, for each of them, whether there is some transaction related to it. If, for example, a thousand addresses in a row don’t have any transaction associated with them, you can quite safely assume that the following ones weren’t used either, so you stop searching (the number of consecutive unused addresses after which you stop searching is called the “gap limit”: https://bitcoin.stackexchange.com/questions/50538/how-does-the-client-know-the-number-of-keys-and-coins-when-recovering-from-a-see). Querying for a concrete address should take constant time, which is also the case for Cardano.

As it’s said in the Stack Exchange thread, if you have a big gap for any reason, your wallet will never find the transactions at position gap+1 and beyond until more addresses are used. So that’s not a perfect solution either.
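The gap-limit recovery discussed in the last two posts can be sketched as follows. This is an illustration under stated assumptions, not how Bitcoin wallets or Daedalus derive addresses: `derive_address` is a toy HMAC-based stand-in for real HD derivation, `recover_with_gap_limit` is a hypothetical name, and the Python set `used` stands in for an indexed database of addresses that have on-chain activity.

```python
import hashlib
import hmac

def derive_address(seed: bytes, index: int) -> str:
    # Toy deterministic derivation (assumption): same seed + index, same address.
    return hmac.new(seed, index.to_bytes(4, "big"), hashlib.sha256).hexdigest()

def recover_with_gap_limit(seed: bytes, used_addresses, gap_limit: int = 20):
    # Walk the sequence and stop after gap_limit consecutive unused addresses.
    found, misses, index = [], 0, 0
    while misses < gap_limit:
        if derive_address(seed, index) in used_addresses:  # indexed lookup
            found.append(index)
            misses = 0
        else:
            misses += 1
        index += 1
    return found

seed = b"toy seed from a mnemonic"
# Addresses 0, 1 and 5 were used; address 40 was used after a long gap.
used = {derive_address(seed, i) for i in (0, 1, 5, 40)}
# Unrelated addresses on the chain do not add any work for this wallet.
used |= {derive_address(b"another wallet", i) for i in range(10_000)}

print(recover_with_gap_limit(seed, used, gap_limit=20))
print(recover_with_gap_limit(seed, used, gap_limit=50))
```

With a gap limit of 20 the search stops before reaching index 40, which is the failure mode mentioned in the Stack Exchange thread; a gap limit of 50 finds it. In both cases the number of lookups depends only on the wallet's own address count plus the gap limit, not on how many addresses exist on the chain.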

I understand your logic, but I can’t agree because there are way too many assumptions. The only way to check would be to look at and analyse the Daedalus source code.

For example:

Mnemonics is generated on front end side and allows to deterministically generate secret key. Names will not be restored.

So even if Daedalus does not show you the same addresses as before you recovered, it doesn’t mean it’s not generating them deterministically. It may show you semi-random addresses on purpose.

to verify if an address is yours, you have to take the chain code (derivation path) from the address, try to derive an address with it from your root key and check for equality

As I said before, if the algorithm is symmetric, at no point do you try to generate an address and check for equality; you should be able to know whether an address is yours as long as you have the root key. And the docs seem to assume that, as I read them:

utxo is traversed to find all addresses with positive balance corresponding to this root key and add them to storage along with their parents (wallets).

There are a lot of complaints about the (very) slow download rate of the blockchain, but I don’t remember anyone talking about the recovery time of a wallet. Still, it would be very interesting to find the code that manages that (assuming it’s not too complicated).