Update on BIP-39 Entropy Distribution Analysis

As I requested in the original thread, "Potential Weakness in BIP-39 Mnemonic Entropy Distribution Across Multiple Languages", and as someone else asked in your GitHub issue of the same name (Issue #134, trezor/python-mnemonic), it would help a great deal if you explained exactly what you did, so that any misconceptions can be cleared up.

As far as I can tell from the little you have disclosed, you simply generated random phrases from the BIP-39 wordlist and then checked them for validity. In the issue on the Trezor GitHub you write:

And that is simply false! Wallet apps accept any valid BIP-39 seed phrase. Feeding valid seed phrases to tools for one blockchain or another gives you no additional information. The fact that these tools accept a phrase also does not mean that anyone has used it before, or that a wallet app has ever generated it. They accept all valid seed phrases. You just generated this wallet. Nobody else did.

This would only be different if this method turned up a wallet with a transaction history, maybe even a balance. It is highly unlikely that you would find even a single wallet used by someone else, let alone several thousand. If it were that easy, we would see wallets being emptied every day by people doing exactly what you did.

If your sample does show biases, the most plausible explanation is that they stem from the code you used to generate the random phrases. As long as you only check for checksum validity, no simulations or random experiments are needed.

You can simply read BIP-39 and conclude that at least the first n-1 words of an n-word seed phrase are exactly uniformly distributed. For each combination of 11 words, there are exactly 128 last words that form a valid 12-word seed phrase, because the last word carries 7 free entropy bits and 2^7 = 128. For each combination of 23 words, there are exactly 8 last words that form a valid 24-word seed phrase, because the last word carries 3 free entropy bits and 2^3 = 8.
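To make the counting argument concrete, here is a minimal sketch that enumerates the valid last words for a given prefix. It works on word indices (0 to 2047) rather than words so that it needs no wordlist file; the function name `valid_last_word_indices` is my own and not part of any library.

```python
import hashlib

def valid_last_word_indices(prefix):
    """Return every last-word index that completes `prefix` into a
    checksum-valid BIP-39 phrase. `prefix` holds n-1 word indices (0..2047)."""
    n_words = len(prefix) + 1
    if n_words % 3 != 0 or not 12 <= n_words <= 24:
        raise ValueError("BIP-39 phrases have 12, 15, 18, 21 or 24 words")
    total_bits = 11 * n_words          # 11 bits per word
    cs_bits = total_bits // 33         # checksum length: 4 for 12 words, 8 for 24
    ent_bits = total_bits - cs_bits    # entropy length: 128 for 12 words, 256 for 24
    free_bits = 11 - cs_bits           # entropy bits carried by the last word

    # Pack the prefix words into one big integer, 11 bits each.
    prefix_bits = 0
    for idx in prefix:
        if not 0 <= idx < 2048:
            raise ValueError("word index out of range")
        prefix_bits = (prefix_bits << 11) | idx

    result = []
    for tail in range(1 << free_bits):
        entropy = (prefix_bits << free_bits) | tail
        digest = hashlib.sha256(entropy.to_bytes(ent_bits // 8, "big")).digest()
        checksum = digest[0] >> (8 - cs_bits)   # first cs_bits bits of the hash
        result.append((tail << cs_bits) | checksum)
    return result

print(len(valid_last_word_indices([0] * 11)))  # 12-word phrase
print(len(valid_last_word_indices([0] * 23)))  # 24-word phrase
```

Every choice of the free entropy bits yields exactly one valid last word, so the count depends only on the phrase length and never on the prefix.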

The checksum contained in the last word (4 of its 11 bits for 12 words, 8 of its 11 bits for 24 words) could theoretically introduce some bias for that last word. But that is also rather theoretical. The checksum is simply the first bits of the SHA-256 hash of the entropy. A bias there would mean a bias in SHA-256, which is rather unlikely.
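If you want to see this for yourself, a quick tally like the following would do. It is only a sketch: the function name `checksum_counts` and the sample size are arbitrary choices of mine, and it assumes the entropy comes from the operating system's CSPRNG via `secrets`.

```python
import hashlib
import secrets
from collections import Counter

def checksum_counts(samples=16000, entropy_bytes=16):
    """Tally the BIP-39 checksum values over freshly drawn random entropy.
    entropy_bytes=16 is a 12-word phrase (4 checksum bits),
    entropy_bytes=32 is a 24-word phrase (8 checksum bits)."""
    cs_bits = entropy_bytes * 8 // 32
    counts = Counter()
    for _ in range(samples):
        entropy = secrets.token_bytes(entropy_bytes)
        first_byte = hashlib.sha256(entropy).digest()[0]
        counts[first_byte >> (8 - cs_bits)] += 1
    return counts

for value, count in sorted(checksum_counts().items()):
    print(f"{value:2d}: {count}")
```

With 16000 samples and 16 possible checksum values, each value should come up roughly 1000 times, with ordinary statistical fluctuation. A strong deviation would point to the random source, not to BIP-39.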

Possibility 1: That is simply a bug in whatever tool you were using. The number of words in a BIP-39 phrase has to be a multiple of 3. It cannot work with any other length. Hard to say more if you don't tell us which tool it was.
Possibility 2: The tool supports an additional passphrase as described in "From mnemonic to seed" and lets you enter that passphrase as a "13th word".
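For reference, this is roughly how the passphrase enters the seed derivation according to the "From mnemonic to seed" section of BIP-39. It is a sketch and says nothing about what the unnamed tool actually does. The helper name `mnemonic_to_seed` is mine.

```python
import hashlib
import unicodedata

def mnemonic_to_seed(mnemonic, passphrase=""):
    """Derive the 64-byte seed as described in BIP-39:
    PBKDF2-HMAC-SHA512, 2048 rounds, salt = "mnemonic" + passphrase."""
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)

phrase = " ".join(["abandon"] * 11 + ["about"])
print(mnemonic_to_seed(phrase).hex())
print(mnemonic_to_seed(phrase, "extra").hex())  # "extra" stands in for a "13th word"
```

Note that this step does not verify the checksum at all. Any string used as a passphrase yields a seed, and therefore a different, equally empty wallet.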
