Question about the structure of the passphrase

Babiba · August 15, 2023, 11:30pm

Do I understand correctly that the passphrase gives access to one of millions of possible indexes created on the seed phrase?
And if my statement is correct, what is the chance that the wallet is created not on, for example, the millionth index, but on index 50 or 100?
And then it turns out that a person knowing only my seed phrase can manually reach this index and withdraw funds without the passphrase.
Can I somehow check which index my passphrase belongs to?

matejcik · August 16, 2023, 10:42am

Not at all.

The root of your wallet isn’t actually your seed phrase, but a binary secret blob derived from your seed phrase through a one-way function (specifically PBKDF2 in case of BIP39). The passphrase is a second input to this one-way function.

The properties of PBKDF2 (and others from this family of functions) are such that if you change even a single bit in the input, the output comes out vastly different. The root secrets for your seed + no passphrase and your seed + passphrase "a" do not look at all alike. They are as different from each other as using a completely different seed would be.
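
As a rough illustration of the point above, here is a minimal sketch of the BIP-39 seed derivation using only Python's standard library. The helper names `bip39_seed` and `differing_bits` are made up for this example, and the mnemonic is a publicly known test phrase, so it must never hold real funds. It assumes the BIP-39 parameters: PBKDF2-HMAC-SHA512, 2048 iterations, salt `"mnemonic"` + passphrase, 64-byte output.

```python
import hashlib
import unicodedata


def bip39_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Derive the 64-byte root secret from a mnemonic and optional passphrase."""
    # BIP-39 normalizes both strings to NFKD; the passphrase goes into the salt.
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = ("mnemonic" + unicodedata.normalize("NFKD", passphrase)).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Publicly known test mnemonic (illustration only, never use it for real funds).
MNEMONIC = (
    "abandon abandon abandon abandon abandon abandon "
    "abandon abandon abandon abandon abandon about"
)

plain = bip39_seed(MNEMONIC)        # seed + no passphrase
with_a = bip39_seed(MNEMONIC, "a")  # same seed + passphrase "a"

print(plain.hex())
print(with_a.hex())
print(differing_bits(plain, with_a), "of", len(plain) * 8, "bits differ")
```

The two hex strings share no visible structure, and roughly half of the 512 bits should differ, which is what you would expect from two unrelated random values.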

There is no shortcut for the attacker. Either they guess the real passphrase (or try all options by brute force), or they get nothing.

Babiba · August 16, 2023, 1:54pm

It’s not about the attacker picking up my seed phrase.

I know that there are many other wallets on one seed phrase. The documentation says that there are 2^31 (two to the power of thirty-one), which is about 2 billion other wallets created on one seed phrase.
This is mentioned in several places in the BIP39 documentation.
For example, here I have generated a wallet and this seed phrase includes many other wallets (screenshot).

I’m not saying that the attacker will try to match my seed phrase, I’m talking about the fact that the passphrase selects a separate wallet on my seed phrase, which means that the hidden wallet is in one of 2 billion variants.

And purely theoretically if the attacker gets my correct seed phrase he will not get access to the wallet that works with the passphrase, but he can start searching other wallets that are created on this seed phrase.
And if my hidden wallet was created not on the billionth wallet, but on the tenth, it will be very easy for him to get to it.

Babiba · August 16, 2023, 2:22pm

I apologize, I forgot to attach the screenshot I wrote about in the previous post

matejcik · August 16, 2023, 2:38pm

This is incorrect.

True, the spec, very confusingly, occasionally uses the word “wallet” to mean “account”. This is just bad documentation – it doesn’t really work in the context of the document itself, and the other specs do not reflect this confusing usage.

In common usage, “BIP-39 wallet” is the same thing as what BIP-32 terms a deterministic wallet. A single “deterministic wallet” is derived from a particular seed combined with a particular passphrase. There are no “other wallets”.

(The number 2^31 is the number of subkeys of every key in the BIP-32 hierarchy. In plain language, you could call this a number of “accounts” – but at the same time, it’s the number of different kinds of cryptocurrencies (BIP44 purpose), different coins (BIP44/SLIP44 coin type), different kinds of addresses (BIP44 change), number of addresses of a kind in an account (BIP44 index) and any number of other meanings.)
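
To make the hierarchy a bit more concrete, here is a small sketch that splits a BIP-44 style path into its per-level indexes. The function `parse_path` and the `LEVELS` labels are hypothetical names for this example. It relies on the BIP-32 convention that child indexes are 32-bit numbers whose top bit marks hardened derivation (written with `'`), so each level offers 2^31 normal and 2^31 hardened children. The path `m/44'/60'/0'/0/0` is the usual first Ethereum address.

```python
HARDENED = 0x80000000  # 2**31, the offset BIP-32 adds for hardened children


def parse_path(path: str) -> list:
    """Convert a path such as m/44'/60'/0'/0/0 into raw BIP-32 child indexes."""
    parts = path.split("/")
    if parts[0] != "m":
        raise ValueError("path must start with 'm'")
    indexes = []
    for part in parts[1:]:
        hardened = part.endswith("'")
        number = int(part.rstrip("'"))
        if not 0 <= number < HARDENED:
            raise ValueError("index out of range: " + part)
        indexes.append(number + HARDENED if hardened else number)
    return indexes


# BIP-44 gives each level of the path its own meaning.
LEVELS = ["purpose", "coin type", "account", "change", "address index"]

for name, index in zip(LEVELS, parse_path("m/44'/60'/0'/0/0")):
    kind = "hardened" if index >= HARDENED else "normal"
    print(f"{name:13} = {index % HARDENED} ({kind}), one of {HARDENED} choices")
```

Every level is just another index in the same range, which is why the same number shows up as the count of accounts, coins, and addresses.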

So, again, no. An attacker with your seed phrase is not picking one of 2^31 wallets. Instead, an attacker with your seed phrase has exactly one wallet (with a large number of accounts, coins, addresses, etc.). If they want to look for a passphrase-protected wallet, they actually need to find the passphrase. The passphrase-derived wallet is not a member of the non-protected hierarchy, it is a completely separate thing.

For what it’s worth, if brute-forcing the passphrase was a matter of trying one of 2^31 combinations, it would not matter whether the right one is number 7 or number 2^31. For two reasons:

A smart attacker is not going to start at 1 and go up. Instead, they will try numbers at random. This will find the right result faster on average.
Brute-forcing 2^31 options is fast. There is a reason your seed phrase is 12 words (2^132) instead of just 3 words (2^33).
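
The arithmetic behind those numbers can be sketched as follows. The guess rate is a purely hypothetical figure chosen for illustration, and `combinations` and `average_seconds` are made-up helper names. Each BIP-39 word comes from a 2048-word list, so it contributes 11 bits. Note that in a 12-word phrase 4 of the 132 bits are a checksum, so the real entropy is 128 bits, which does not change the conclusion.

```python
BITS_PER_WORD = 11  # the BIP-39 wordlist has 2048 = 2**11 entries

# Hypothetical attacker speed; an assumption for illustration only.
GUESSES_PER_SECOND = 1_000_000


def combinations(words: int) -> int:
    """Number of raw word sequences of the given length."""
    return 2 ** (words * BITS_PER_WORD)


def average_seconds(search_space: int) -> float:
    """Expected time to hit the target: half the space at the assumed rate."""
    return search_space / 2 / GUESSES_PER_SECOND


SECONDS_PER_YEAR = 365 * 24 * 3600

for label, space in [
    ("2^31 indexes", 2 ** 31),
    ("3 words", combinations(3)),
    ("12 words", combinations(12)),
]:
    years = average_seconds(space) / SECONDS_PER_YEAR
    print(f"{label:12} -> {space:.3e} options, about {years:.3e} years on average")
```

At the assumed rate, the 2^31 case finishes in well under an hour, while the 12-word case stays far out of reach.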

Babiba · August 16, 2023, 2:54pm

Do I understand you correctly?

A wallet created with seed phrase + passphrase is not one of the 2^31 wallets created simply with seed phrase.
If I understood the above correctly, how is a wallet that is protected by a passphrase created? I would like to find out in order to understand how feasible it is to brute-force it at all.

matejcik · August 17, 2023, 9:02am

(sigh)
Ok let’s try one more angle.

Your seed, without passphrase, gives you a selection of 2^31 Ethereum addresses (aside from the trillions of trillions of other things). I’ll even call the addresses “wallets”.

Your seed, with a passphrase, gives you a selection of 2^31 completely different Ethereum addresses.

“A wallet created with seed phrase + passphrase”, in your terms, is not a thing. A passphrase will not give you one new address. Instead, it will give you a completely separate selection.

These are not the same addresses in different order. None of the addresses from the no-passphrase list is in the with-passphrase list.

To be super clear:

you connect your Trezor, open Suite, open a Hidden wallet, enter passphrase abcdef, add an Ethereum account
this gives you one receiving address (because it’s just one account, not because there wouldn’t be more of them)
someone who knows your seed enters it into the BIP39 tool from which you took the screenshot
Your new address is not in that list.
You can go on creating accounts in your Hidden wallet, or in Metamask (where you can pick which address to use.) None of these are in the original list. Adding a passphrase created a completely separate list to choose from.
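
For the curious, the reason the two lists cannot overlap can be sketched one step further down the chain. Under BIP-32, the whole tree of accounts and addresses grows from a master key and chain code, which are computed from the root secret with HMAC-SHA512 keyed by the string `"Bitcoin seed"`. The sketch below uses stdlib Python only, with the hypothetical helper names `bip39_seed` and `bip32_master` and a publicly known test mnemonic. It stops at the master key, because deriving actual addresses needs elliptic-curve math that is left out here.

```python
import hashlib
import hmac
import unicodedata


def bip39_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """BIP-39: PBKDF2-HMAC-SHA512 over the mnemonic, passphrase in the salt."""
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = ("mnemonic" + unicodedata.normalize("NFKD", passphrase)).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


def bip32_master(seed: bytes) -> tuple:
    """BIP-32: split HMAC-SHA512(seed) into master private key and chain code."""
    digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
    return digest[:32], digest[32:]


# Publicly known test mnemonic (illustration only, never use it for real funds).
MNEMONIC = (
    "abandon abandon abandon abandon abandon abandon "
    "abandon abandon abandon abandon abandon about"
)

standard_key, standard_chain = bip32_master(bip39_seed(MNEMONIC))
hidden_key, hidden_chain = bip32_master(bip39_seed(MNEMONIC, "abcdef"))

print("standard wallet master key:", standard_key.hex())
print("hidden wallet master key:  ", hidden_key.hex())
```

Because every child key is computed from its parent key and chain code, two unrelated master keys lead to two unrelated trees. Walking through indexes of the standard tree never lands in the hidden one.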

as I said earlier:

If you want to learn more, I suggest reading the BIP-39 spec.
In general, brute-forcing your passphrase is exactly as difficult as brute-forcing the passphrase. I am again repeating myself: there are no shortcuts for the attacker. They need to try the actual passphrases to get at your keys.

Babiba · August 17, 2023, 2:48pm

Hmm…

Then a wallet created with seed phrase + passphrase will have 2^31 completely different addresses than a wallet created with only the seed phrase? Right?

And I don’t understand this statement: “In general, brute-forcing your passphrase is exactly as difficult as brute-forcing the passphrase”.
I guess you mean seed phrase & passphrase?

matejcik · August 18, 2023, 10:11am

Correct.

This was meant as a response to this:

The answer is that in order to brute-force it, you need to actually attack the passphrase. Not some stand-in, index, intermediate result, etc., but the actual string that you type into Trezor.

In other words, exactly what you would normally expect: passphrase “a” is easier to brute-force than passphrase “D65FShdNiYf5csmcKBHKu2c9sT”. There is no trick that would allow the attacker to just side-step the work. If they don’t know the string “D65FShdNiYf5csmcKBHKu2c9sT”, they cannot get at your coins by another means.
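
To put rough numbers on that comparison, here is a small sketch that estimates the size of the search space for both passphrases. It assumes the attacker already knows the length and the character set (26 lowercase letters for the short one, 62 letters and digits for the long one), which is generous to the attacker. The helper name `search_space_bits` is made up for this example.

```python
import math


def search_space_bits(alphabet_size: int, length: int) -> float:
    """log2 of the number of strings of this length over this alphabet."""
    return length * math.log2(alphabet_size)


short = "a"
long_ = "D65FShdNiYf5csmcKBHKu2c9sT"

# Assumed alphabets: lowercase letters only vs. letters plus digits.
short_bits = search_space_bits(26, len(short))
long_bits = search_space_bits(62, len(long_))

print(f"{short!r}: about {short_bits:.1f} bits of work")
print(f"{long_!r}: about {long_bits:.1f} bits of work")
```

The short passphrase falls within a couple dozen guesses, while the long one is in the same territory as guessing the seed itself.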

Babiba · August 18, 2023, 11:11am

It all makes sense now, thanks