- Abi Raja
- Follow me on Twitter @_abi_
I've always been curious how an address is generated from a seed phrase. Let's dive into that journey today. Bitcoin, Ethereum and Solana are substantially similar in how they do this. But Bitcoin and Ethereum are slightly more complicated so we’ll just focus on Solana here.
Here's a sample seed phrase generated using this tool:
crush desk brain index action subject tackle idea trim unveil lawn live
If you import this wallet into Phantom or another Solana Wallet, the first address it will generate for you is
Let's start at the high level. In the diagram below, you can see the four steps in the journey that take you from seed phrase to address. Steps 1 and 2 are the most complex. Steps 3 and 4 are straightforward.
A few things to note before we get started: firstly, this process is deterministic. Every time you put in the same seed phrase, you always get out the same address. There's no randomness in this process.
Secondly, from one master private key, you can derive multiple wallets. The address we referenced earlier is that of the first wallet. But if you go into Phantom and generate a new wallet, you'll get another address which can also be generated deterministically: we just have to change one of the inputs from
1 in Step 2.
Finally, this process is well specified by 3 Bitcoin Improvement Proposals (BIPs for short): BIP-39, BIP-32 and BIP-44. Because these are Bitcoin Improvement Proposals, as you'd expect, Bitcoin wallets implement them as well. In fact, so do a lot of other wallets including Ethereum wallets like Metamask. Why re-invent the wheel when you have a method that works well? For our purposes, Solana too mostly follows the spec with small differences.
Step 1: Seed phrase → Master keys
As you can see, the process is relatively simple, making use of two cryptographic primitives: PBKDF2 and HMAC. First, the seed phrase is treated as the "password" input to PBKDF2. The salt input is set to the constant string "mnemonic". Optionally, you can associate a passphrase with your seed phrase. The passphrase, if it exists, simply gets appended to "mnemonic" before being used as the salt input. Both the seed phrase and the salt have to be normalized with Unicode NFKD and encoded as UTF-8 to ensure consistency among implementations.
The PBKDF2 algorithm runs for 2048 iterations using the SHA512 one-way hash function and generates a 512-bit seed.
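Here is a minimal sketch of this PBKDF2 step using only Python's standard library. The function name mnemonic_to_seed is my own choice for illustration, and the sketch skips the BIP-39 word list and checksum validation entirely.

```python
import hashlib
import unicodedata


def mnemonic_to_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Stretch a seed phrase into a 512-bit (64-byte) seed with PBKDF2."""
    # Both inputs are NFKD-normalized, then encoded as UTF-8.
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    # The salt is the constant "mnemonic" with the optional passphrase appended.
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    # 2048 iterations of HMAC-SHA512, producing 64 bytes of output.
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


phrase = "crush desk brain index action subject tackle idea trim unveil lawn live"
seed = mnemonic_to_seed(phrase)
print(len(seed) * 8, "bit seed:", seed.hex())
```

Because the passphrase is part of the salt, the same twelve words with a different passphrase give a completely different seed.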
Next, we compute the HMAC-SHA512 of the 512-bit seed with a key that's set to the fixed constant "ed25519 seed" (in the case of Bitcoin and Ethereum, the constant is "Bitcoin seed" per BIP-32). "ed25519" refers to the signature scheme Solana uses, which is built on the elliptic curve Curve25519. We'll use that curve in Step 3.
The resulting output of the HMAC is also 512 bits just like the input. 512 bits = 64 bytes. The first 32 bytes are the master private key and the latter 32 bytes are the master chain code.
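The split into master private key and master chain code can be sketched in a few lines of Python. The function name master_keys and the all-zero demo seed are placeholders of mine; in practice the input would be the 64-byte seed from the PBKDF2 step.

```python
import hashlib
import hmac
from typing import Tuple


def master_keys(seed: bytes) -> Tuple[bytes, bytes]:
    """Return (master private key, master chain code) for a 512-bit seed."""
    # The key of the HMAC is the fixed constant; the seed is the message.
    digest = hmac.new(b"ed25519 seed", seed, hashlib.sha512).digest()
    # 64 bytes of output: first half is the key, second half the chain code.
    return digest[:32], digest[32:]


demo_seed = bytes(64)  # placeholder seed, for illustration only
private_key, chain_code = master_keys(demo_seed)
print("private key:", private_key.hex())
print("chain code: ", chain_code.hex())
```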
Step 2: Master key → Wallet private key
BIP-32 describes how to construct a hierarchical wallet. So, you are able to derive several wallets from a single master private key.
But not just that, because the wallets are hierarchical: you can derive child wallets from any node in the tree like so:
You can imagine the utility of this: you can give a wallet to your Head of Marketing. They could then give each member of the marketing team their own wallet by generating child wallets from their team wallet. If one of the team members loses a private key, that wouldn't compromise the other wallets in the rest of the tree. A key property of this scheme is that given a child wallet, you can't derive a parent wallet or a sibling wallet. You can only derive values down the tree.
Given this pretty neat construction, BIP-44 proposes a specific hierarchy that is now widely adopted across the ecosystem.
In particular, the hierarchy, represented as a path with segments separated by "/" looks like:
m / purpose' / coin_type' / account' / change / address_index
The wallet address we're trying to derive is at the path m/44'/501'/0'/0'. One caveat here is that different wallet applications might use slightly different derivation paths to derive the "first" wallet. I believe that Sollet differs from Phantom.
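To make the path notation concrete, here is a small Python sketch that splits a path string into its integer segments. The helper name parse_path is hypothetical, and I'm assuming the usual convention that a trailing apostrophe marks a segment as "hardened".

```python
from typing import List, Tuple


def parse_path(path: str) -> List[Tuple[int, bool]]:
    """Split a derivation path into (index, hardened) pairs."""
    parts = path.replace(" ", "").split("/")
    if parts[0] != "m":
        raise ValueError("path must start at the master key 'm'")
    segments = []
    for part in parts[1:]:
        hardened = part.endswith("'")  # apostrophe = hardened segment
        index = int(part[:-1] if hardened else part)
        segments.append((index, hardened))
    return segments


print(parse_path("m/44'/501'/0'/0'"))
```

For the path above, every segment comes back flagged as hardened, with the indexes 44, 501, 0 and 0 in order.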
Let's take a look at each segment in the path:
m stands for master private key (I think) since that's the root key we're going to derive all the child keys from. Next up is purpose, set to 44, which simply indicates that this address conforms to the BIP-44 proposal. Then comes coin type: Solana is assigned 501 per SLIP-44, and Ethereum is 60. Then comes account, starting at index 0. Finally, change can be either 0 or 1. According to BIP-44:
Constant 0 is used for external chain and constant 1 for internal chain (also known as change addresses). External chain is used for addresses that are meant to be visible outside of the wallet (e.g. for receiving payments). Internal chain is used for addresses which are not meant to be visible outside of the wallet and is used for return transaction change.
Since the Phantom-generated address can be used for receiving payments, we chose 0, and that's how we end up with
m/44'/501'/0'/0' . Not entirely sure why we need the
Back to the derivation process. We already have m. To derive the second segment in this path, 44, we follow the algorithm illustrated here:
In the first iteration, we set the private key and chain code to be our master private key and master chain code, respectively. The chain code is used as the key to the HMAC-SHA512. Simple enough. The data input to the HMAC, on the other hand, is a little more complex: the first byte is fixed to 0, the next 32 bytes are our private key, and finally we append the segment plus a constant. What is the segment value? It is simply the integer value of the current segment in the derivation path. For the first iteration of the loop, it would be
44 . Then,
0 , and finally
After the first iteration, the output is once again split in half: first 32 bytes are the private key and second 32 bytes are the chain code for the next iteration of the loop. We keep looping until we get to the leaf node (
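The loop described in this step can be sketched in Python as follows. Two things here are my assumptions rather than something stated above: the constant added to each segment is the hardened offset 0x80000000, and the sum is serialized as a 4-byte big-endian integer (this is how SLIP-0010 specifies ed25519 derivation). The function names and the demo master values are placeholders.

```python
import hashlib
import hmac
import struct
from typing import Iterable, Tuple

HARDENED_OFFSET = 0x80000000  # assumed constant added to every segment


def derive_child(private_key: bytes, chain_code: bytes, segment: int) -> Tuple[bytes, bytes]:
    """One iteration: (key, chain code) of the parent -> those of the child."""
    # data = 0x00 || 32-byte private key || (segment + constant) as 4 bytes
    data = b"\x00" + private_key + struct.pack(">I", segment + HARDENED_OFFSET)
    digest = hmac.new(chain_code, data, hashlib.sha512).digest()
    return digest[:32], digest[32:]


def derive_path(private_key: bytes, chain_code: bytes, segments: Iterable[int]) -> Tuple[bytes, bytes]:
    """Walk down the tree, feeding each output into the next iteration."""
    for segment in segments:
        private_key, chain_code = derive_child(private_key, chain_code, segment)
    return private_key, chain_code


# Placeholder master values, for illustration only.
master = hmac.new(b"ed25519 seed", bytes(64), hashlib.sha512).digest()
wallet_key, wallet_chain = derive_path(master[:32], master[32:], [44, 501, 0, 0])
print("wallet private key:", wallet_key.hex())
```

Changing any single segment in the list, for example the account index, yields an unrelated private key, which is what lets one master key back many wallets.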
Step 3: Wallet private key → Solana key pair
We're almost there now. Next, we take the private key we generated and use it as the seed for the ed25519 algorithm to generate a key pair. This is Solana-specific. Ethereum and Bitcoin use a different elliptic curve, secp256k1.
The key pair returned consists of a 64-byte private key and a 32-byte public key.
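To show what happens inside this step, here is a pure-Python sketch of ed25519 key generation, adapted from the reference code in RFC 8032. It is slow and not constant-time, so treat it as an illustration only; a real wallet would call a vetted library. I'm also assuming the common NaCl convention in which the 64-byte private key is the 32-byte seed followed by the 32-byte public key. The demo seed is the first RFC 8032 test vector, not a Solana key.

```python
import hashlib
from typing import Tuple

P = 2**255 - 19                          # prime of the underlying field
D = -121665 * pow(121666, P - 2, P) % P  # Edwards curve constant d
SQRT_M1 = pow(2, (P - 1) // 4, P)        # a square root of -1 mod P


def recover_x(y: int, sign: int) -> int:
    """Solve the curve equation for x given y and the parity of x."""
    x2 = (y * y - 1) * pow(D * y * y + 1, P - 2, P) % P
    x = pow(x2, (P + 3) // 8, P)
    if (x * x - x2) % P != 0:
        x = x * SQRT_M1 % P
    if (x * x - x2) % P != 0:
        raise ValueError("y is not on the curve")
    return x if (x & 1) == sign else P - x


BASE_Y = 4 * pow(5, P - 2, P) % P
BASE_X = recover_x(BASE_Y, 0)
BASE = (BASE_X, BASE_Y, 1, BASE_X * BASE_Y % P)  # extended coords (X, Y, Z, T)


def point_add(p1, p2):
    a = (p1[1] - p1[0]) * (p2[1] - p2[0]) % P
    b = (p1[1] + p1[0]) * (p2[1] + p2[0]) % P
    c = 2 * p1[3] * p2[3] * D % P
    d = 2 * p1[2] * p2[2] % P
    e, f, g, h = b - a, d - c, d + c, b + a
    return (e * f % P, g * h % P, f * g % P, e * h % P)


def point_mul(scalar: int, point):
    result = (0, 1, 1, 0)  # neutral element
    while scalar > 0:
        if scalar & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        scalar >>= 1
    return result


def keypair_from_seed(seed: bytes) -> Tuple[bytes, bytes]:
    """Return (64-byte private key, 32-byte public key) for a 32-byte seed."""
    h = hashlib.sha512(seed).digest()
    a = int.from_bytes(h[:32], "little")
    a &= (1 << 254) - 8   # clear the lowest 3 bits and the top bit
    a |= 1 << 254         # set the second-highest bit
    x, y, z, _ = point_mul(a, BASE)
    z_inv = pow(z, P - 2, P)
    x, y = x * z_inv % P, y * z_inv % P
    # Compressed point: y in little-endian with the parity of x in the top bit.
    public_key = (y | ((x & 1) << 255)).to_bytes(32, "little")
    return seed + public_key, public_key


demo_seed = bytes.fromhex("9d61b19deffd5a60ba844af492ec2cc44449c5697b326919703bac031cae7f60")
private_key, public_key = keypair_from_seed(demo_seed)
print("public key:", public_key.hex())
```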
Step 4: Solana key pair → Address
Finally, a Solana address is simply the base58 representation of the public key returned in the previous step.
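Base58 encoding is simple enough to sketch without a library. The version below assumes the Bitcoin-style alphabet (no 0, O, I or l) and the usual rule that each leading zero byte becomes a leading "1"; the function name b58encode is my own.

```python
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Encode bytes as a base58 string."""
    # Treat the bytes as one big-endian integer and repeatedly divide by 58.
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded
    # Leading zero bytes carry no numeric value, so they are kept as "1"s.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


# A 32-byte public key made entirely of zero bytes encodes to 32 "1"s.
print(b58encode(bytes(32)))
```

Applied to the 32-byte public key from the previous step, this produces the address string that wallets display.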
If we put this all together into code and run it, from our seed phrase
crush desk brain index action subject tackle idea trim unveil lawn live, we'll get