Good blockchains, like Cardano, use a 24-word seed phrase for their wallets. That 24-word phrase is all-important: it’s literally the keys to the kingdom. You must never let anyone else discover it and you must never lose it yourself.
This is self-custody and decentralized web3 – it’s your responsibility to protect and keep safe your seed phrase. One clever method of securely backing it up offline is to write it down on 3 separate pieces of paper that you store in three separate secure locations. The clever part is that each of the three pieces of paper only has 16 of the seed phrase words. This is done so that a malicious person getting hold of one of them wouldn’t be able to steal the contents of your wallet. Getting hold of any two of the pieces of paper will get you the whole 24-word seed phrase, meaning that if you ever lose the ability to retrieve one (it gets destroyed in a fire, say) you still can get your seed phrase from the other two. This is how it works:
- Piece of paper 1 has words: 1-16
- Piece of paper 2 has words: 9-24
- Piece of paper 3 has words: 1-8 and 17-24
As you can see, it’s a really neat and simple way of providing security so that if one piece of paper is ever found by a baddie, they don’t get access to the whole seed phrase. You don’t have to use paper for this of course, you can use floodproof, fireproof, etc approaches too, such as steel seed phrase holders, to give you even more physical protection if you wish.
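For anyone who wants to convince themselves that any two pieces of paper really do cover all 24 words, here is a minimal Python sketch. The function names and the placeholder words are made up for illustration; never type a real seed phrase into a computer to try this.

```python
def split_seed_phrase(words):
    """Split 24 words into three overlapping 16-word backups (any 2 of 3 recover all)."""
    if len(words) != 24:
        raise ValueError("expected exactly 24 words")
    # Keep each word's 1-based position so the phrase can be rebuilt in order.
    numbered = {i + 1: w for i, w in enumerate(words)}
    first, middle, last = range(1, 9), range(9, 17), range(17, 25)
    layout = {
        "paper 1": (first, middle),   # words 1-16
        "paper 2": (middle, last),    # words 9-24
        "paper 3": (first, last),     # words 1-8 and 17-24
    }
    return {
        name: {i: numbered[i] for block in blocks for i in block}
        for name, blocks in layout.items()
    }


def recover_seed_phrase(paper_a, paper_b):
    """Merge two papers and return the 24 words in their original order."""
    merged = {**paper_a, **paper_b}
    if sorted(merged) != list(range(1, 25)):
        raise ValueError("these two papers do not cover all 24 positions")
    return [merged[i] for i in range(1, 25)]


# Placeholder words only, NOT a real seed phrase.
phrase = [f"word{i}" for i in range(1, 25)]
papers = split_seed_phrase(phrase)
print(recover_seed_phrase(papers["paper 1"], papers["paper 3"]) == phrase)
```

Each paper holds 16 numbered words, and a single paper on its own always leaves exactly 8 positions unknown.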
But this 3 sets of 16 words approach is really frowned upon by some, because what you are doing is saying that if a lapse in the physical security of storing these pieces of paper occurs, then there are only 8 seed phrase words remaining to be discovered and that’s easily crackable. But is it?
Use a Shamir backup you dumbass!
Before we get into this further, the more knowledgeable in the crowd may be screaming “just use a Shamir backup”! This is another split into three approach, a much better one. With a Shamir backup, you encrypt your seed phrase in a way that creates three completely new seed phrases, which can then be used in the same way, where any 2 of them can re-create the original seed phrase. It’s really neat, really secure, clever maths, but it has one key issue – it relies on using software to create the Shamir backup.
That means you have to trust the software you are going to use to enter your original seed phrase into, to create the three new backup seed phrases. You’re also going to have to totally trust the computer or device you are going to run that software on. How comfortable are you with that? If your hardware wallet itself offered this function for you then great, you could happily use it because it already knows your seed phrase of course – you have to inherently trust it! But for the popular hardware wallets I’ve come across that isn’t an option they provide. When using a new hardware wallet where the seed phrase is only ever presented on the hardware wallet itself, I want to keep the seed phrase perfectly safe by never ever entering it into any computer or other digital device. That’s the whole point of using a hardware wallet!
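To give a feel for the "clever maths", here is a toy 2-of-3 Shamir split in Python. It is only a sketch of the underlying idea (a random straight line over a prime field, with the secret as the value at x = 0). It is not the SLIP-39 scheme that real wallets use, the names are my own, and it is exactly the kind of software you should not feed a real seed phrase into.

```python
import secrets

# A prime larger than any 256-bit secret (2**521 - 1 is a Mersenne prime).
PRIME = 2**521 - 1


def make_shares(secret):
    """Create three shares, any two of which recover the secret."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    slope = secrets.randbelow(PRIME)  # random coefficient of the line
    # Each share is a point (x, f(x)) on the line f(x) = secret + slope * x.
    return [(x, (secret + slope * x) % PRIME) for x in (1, 2, 3)]


def recover_secret(share_a, share_b):
    """Interpolate the line through two shares and evaluate it at x = 0."""
    (x1, y1), (x2, y2) = share_a, share_b
    inverse = pow((x2 - x1) % PRIME, -1, PRIME)  # modular inverse, Python 3.8+
    return ((y1 * x2 - y2 * x1) * inverse) % PRIME


secret = secrets.randbits(256)          # stand-in for the 256 bits of a seed phrase
shares = make_shares(secret)
print(recover_secret(shares[0], shares[2]) == secret)
```

Unlike the 3 x 16 words approach, a single share here reveals nothing about the secret, which is why people recommend it. The catch is the one described above: something digital has to see the secret to do the split.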
I really like the 3 pieces of paper approach: it’s perfectly simple and anyone can easily do it. But I wanted to be sure it was also safe, otherwise there’s no point to it. I think it is – here’s why…
OK, time to get geeky, because I need to in order to present my argument that it’s safe enough.
The 24 randomly selected words produce a seed phrase which is 256 bits long. That gives 2^256 (2 to the power of 256) possible combinations, which is:
115,792,089,237,316,195,423,570,985,008,687,907,853,269,984,665,640,564,039,457,584,007,913,129,639,936
To give you an idea of how big a number that is, if you wanted to brute force crack someone’s whole 24-word seed phrase, you’d expect to find it after going through, on average, half of the possible seed phrase keys, so an average expected number of attempts would be 2^255. If every atom on earth (about 1.3 * 10^50 atoms) was a computer that could try ten billion seed phrase keys a second, it would take roughly 2.8 billion years to work through every possible key, and about half that on average. So, it’s nice and secure!
(This is also why any randomly generated seed phrase is basically guaranteed to be unique on a blockchain – the number range is just so massive).
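The arithmetic behind that claim is easy to check in a few lines of Python. This is just a back-of-the-envelope sketch using the figures quoted above; the length of a year (365.25 days) is my own assumption, and either way the answer lands in the low billions of years.

```python
total_keys = 2**256
average_attempts = total_keys // 2            # 2**255

atoms_on_earth = 1.3e50                       # figure quoted in the text
guesses_per_atom_per_second = 1e10            # ten billion
guesses_per_second = atoms_on_earth * guesses_per_atom_per_second

seconds_per_year = 365.25 * 24 * 60 * 60      # assumption: Julian year

years_for_all = total_keys / guesses_per_second / seconds_per_year
years_on_average = average_attempts / guesses_per_second / seconds_per_year

print(f"possible keys:         {total_keys:,}")
print(f"years to try them all: {years_for_all:.2e}")
print(f"years on average:      {years_on_average:.2e}")
```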
But in the event of one of our bits of paper being discovered we’re dropping the number of unknown words down to 8, much less. But actually, it’s worse than that, because a large portion of the last word of the 24 words contains a checksum that provides no security at all (it’s there to confirm a user entered their seed correctly). So in looking at a worst-case scenario, we need to assume it’s the last 8 words the finder of the piece of paper needs to crack.
It's all about the bits
The 24 words that make up a seed phrase are selected from the BIP39 word list. There are 2048 possible words, which means each word provides 11-bits of the binary seed phrase (words are used as they are easier on humans, but they translate into binary values which is what the computer / cryptography needs).
11 bits per word x 24 words = 264 bits. However, the final 8 bits in that 24th word are used as a checksum. So with the last 8 bits removed you get the 256-bit seed phrase. This means that whilst any other group of 8 words provides 11 bits per word x 8 words = 88 bits of the seed phrase, the last 8 words provide 88 bits minus 8 checksum bits, so only 80 bits of the seed phrase. That reduction of 8 bits is significant because we’re into quite short bit lengths already and every bit that is added to a seed phrase doubles the number of possible combinations a brute force attacker needs to try. So 8 bits lost means 8 doublings of the seed phrase space lost.
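Here is a small Python sketch of how 256 random bits become 24 word numbers under the BIP39 scheme, to show where the 8 checksum bits sit. It follows my reading of the BIP39 specification (the checksum is the first 8 bits of the SHA-256 hash of the 256 bits); the function name is made up, and it produces word-list positions rather than the words themselves.

```python
import hashlib
import secrets

ENTROPY_BITS = 256
CHECKSUM_BITS = ENTROPY_BITS // 32      # 8 for a 24-word phrase
BITS_PER_WORD = 11                      # 2048 = 2**11 words in the list


def entropy_to_word_indices(entropy):
    """Return the 24 word-list positions (0-2047) for 32 bytes of entropy."""
    if len(entropy) * 8 != ENTROPY_BITS:
        raise ValueError("expected 32 bytes of entropy")
    # Checksum = first 8 bits of SHA-256(entropy), appended after the entropy.
    checksum = hashlib.sha256(entropy).digest()[0]
    combined = (int.from_bytes(entropy, "big") << CHECKSUM_BITS) | checksum
    total_bits = ENTROPY_BITS + CHECKSUM_BITS            # 264
    word_count = total_bits // BITS_PER_WORD             # 24
    # Slice the 264 bits into 24 chunks of 11 bits, first word first.
    return [
        (combined >> (total_bits - BITS_PER_WORD * (i + 1))) & 0x7FF
        for i in range(word_count)
    ]


indices = entropy_to_word_indices(secrets.token_bytes(32))
# The 24th word carries 3 bits of the seed phrase plus the 8 checksum bits.
unknown_bits_in_last_8_words = 8 * BITS_PER_WORD - CHECKSUM_BITS
print(len(indices), unknown_bits_in_last_8_words)
```

Because the checksum is computed from the other 256 bits, an attacker who already holds words 1-16 only has to guess the 80 unknown bits; the last 8 bits follow automatically.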
OK, so assuming our worst-case bit of paper is found by a baddie (the one with words 1-16 on it), we know that they now only need to try every combination of the remaining 80 bits of the binary seed phrase to be able to access our wallet and steal everything in it.
So how secure is an 80-bit seed phrase?
This is where things get a bit tricky for me, because I’ve been unable to find a good bits-based online entropy calculator. Entropy, in a passphrase sense, is a measure of how unpredictable the passphrase is, and so of how long it would take to crack if sustained attempts to brute force it (try every possible combination) are carried out consistently over a period of time.
I need this to provide a demonstrably true argument on how secure 80 bits is. I can find countless people claiming 8 words of a seed phrase isn’t enough (claiming, not proving why), but also plenty of people in the more general password space saying 80 bits is actually pretty good protection against a brute force attack. I needed to find a way to properly prove it out to my own satisfaction.
I’m a software developer of many years and I have a good understanding of computer security, but at a level of understanding what’s needed, not at a level of being able to implement the uber-complex maths involved in encryption and cracking encryption myself. From the resources and calculators I have managed to find I think 80 bits is actually enough protection based on current day compute capabilities (and Moore’s law has ended remember). If you think I’m wrong, please comment below and tell me why!
Here's the working through my reasoning...
The best calculator I’ve come across on this topic is the Password Haystack calculator provided by security guru Steve Gibson at GRC. It’s not perfect for this use case because, like all the other password length calculators I’ve found, it is based on the number of characters used in a text password, but we need one based on the number of bits. However, it is still really useful due to the information it provides and because it’s created by someone expert, very experienced and well respected in the security community.
Step one – number of combinations
An 80-bit passphrase is 2^80 (2 to the power of 80) which means there are 1,208,925,819,614,629,174,706,176 potential combinations.
As you can see, that’s a hugely smaller number than the full 256-bit passphrase possible combinations we showed earlier! That’s because, for binary passphrases, every additional bit added doubles the number of values. So a 256-bit passphrase isn’t 176 times bigger than an 80-bit passphrase, it’s 176 doublings bigger! That’s how binary bits work.
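A couple of lines of Python make the "176 doublings" point concrete. Nothing here is assumed beyond the two bit lengths already discussed.

```python
full_space = 2**256       # whole 24-word seed phrase
reduced_space = 2**80     # what is left once words 1-16 are known

print(f"{reduced_space:,}")               # 1,208,925,819,614,629,174,706,176

# The full space is not 176 times bigger, it is 2**176 times bigger.
ratio = full_space // reduced_space
print(ratio == 2**176)
print(f"{ratio:.3e}")
```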
OK, with only 80 bits it’s still a big number, but is it big enough? Computer GPUs are incredibly good at brute forcing their way through all possible combinations of values, it’s why security uses so many bits in passphrases to protect things. We know that 80 bits isn’t considered best security, in that you’d choose more just to give yourself lots of nice comfy headroom and future proofing. But actually, is it enough?
Step 2 – how long to crack it?
Over to the GRC Password Haystack calculator. OK we’re going to bodge this slightly as it doesn’t let us enter a bit length, but we’re going to do it erring on the side of the worst case. If we enter the following string into its password box:
the calculator tells us that gives a search space of 546,108,599,233,516,079,517,120.
That’s less than half of the potential combinations our missing 80-bit passphrase comprises, so easier to crack. Half is a good approximation though when brute forcing, as on average you’d expect to try half the possible brute force combinations to hit the right one.
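The search space figure can be reproduced in Python under one assumption about how the calculator counts: that it adds up every possible password from 1 character up to the entered length, over the 95 printable ASCII characters. With that assumption, a 12-character password gives exactly the number quoted above, and it does sit just under half of 2^80.

```python
PRINTABLE_ASCII = 95  # 26 lower + 26 upper + 10 digits + 33 symbols


def haystack_search_space(length, alphabet=PRINTABLE_ASCII):
    """Count all passwords of 1..length characters (assumed counting rule)."""
    return sum(alphabet**k for k in range(1, length + 1))


space_12 = haystack_search_space(12)
half_of_80_bits = 2**79

print(f"{space_12:,}")                 # 546,108,599,233,516,079,517,120
print(f"{half_of_80_bits:,}")
print(space_12 < half_of_80_bits)      # True
```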
Now look at the “Time Required to Exhaustively Search this Password’s Space” results. It’s massive.
The “Offline Fast Attack Scenario” gives a result of 1.74 thousand centuries. That’s assuming one hundred billion guesses per second. Do you have any idea what it would cost to spin up and operate a compute resource capable of carrying out that attack? Per hour?! That is completely ignoring that it would still take far, far too long!
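The same sum can be done directly in bits rather than characters. This Python sketch divides a number of candidates by the one hundred billion guesses per second quoted above; the 365.25-day year is my own assumption, so expect small rounding differences from the calculator's figure.

```python
SECONDS_PER_CENTURY = 100 * 365.25 * 24 * 60 * 60   # assumption: Julian years


def centuries_to_search(candidates, guesses_per_second):
    """Time to try every candidate at a fixed guess rate, in centuries."""
    return candidates / guesses_per_second / SECONDS_PER_CENTURY


rate = 1e11                                          # one hundred billion per second
calculator_space = 546_108_599_233_516_079_517_120   # figure from the calculator

for label, candidates in [
    ("calculator search space", calculator_space),
    ("average for 80 bits (2^79)", 2**79),
    ("all of 80 bits (2^80)", 2**80),
]:
    print(f"{label:28s} {centuries_to_search(candidates, rate):,.0f} centuries")
```

The first row comes out in the region of 1.7 thousand centuries, in line with the calculator, and the 2^79 row is a little larger, which is what you would expect given the calculator's search space is slightly under half of 2^80.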
I think this means backing up a seed phrase using 3 sets of 16 words is safe.
Am I wrong? If you think I am, tell me why in the comments below, I’d love to hear if you think I’ve overlooked something important.
Unknowns / assumptions
- I don’t know what amount of compute resource is needed to turn a brute-force guessed seed phrase into the public address of a wallet, so it can then be looked up on the blockchain to see if it exists. The 256-bit seed phrase is the private key to the wallet; it has to be converted into the public key, which is the wallet public address, using a one-way cryptographic calculation. My anecdotal evidence tells me it’s a demanding computation that has to be performed for the cryptography used by the Cardano blockchain. However, I don’t have the expertise to give an accurate factual assurance on this – can anyone provide an answer to this in the comments? To rely on the result of 1.74 thousand centuries, I’m assuming one hundred billion guesses per second is the worst-case amount of compute power an attacker could realistically use.
Also bear in mind...
I’ll note here some additional security considerations that came up for this approach and my risk assessment of it:
- Using this backup method has allowed you to keep your seed phrase entirely offline whilst still giving you protection against loss. It’s beautifully simple and involves only pen and paper. In doing so you’ve avoided all the risks of malware on your computer, keystroke loggers, screen grabbers, faulty wallet software, etc. That doesn’t mean we should settle for an insecure solution, but it’s important to remember that this simple approach has kept your seed phrase away from the typical digital means of stealing it. You need a breach in the physical security of where you store your pieces of paper to occur for any attack to be possible in the first place.
- A person who finds one of the pieces of paper has a decision to make on the realistic reward of trying to brute force crack the missing words. They likely have no way of knowing how much value is stored in the wallet that the seed phrase controls. Even if they know you and know what wallet you use, so they can look it up in the blockchain, is there actually a really significant amount of value stored in it? Brute force cracking of seed phrases is very expensive in computing resources; it will cost them a lot of money to bring any sizeable compute resource to bear in an attempt to brute force it. How confident will they be that the seed phrase is protecting a wallet with enough value stored in it to warrant spending a large amount of money on trying to brute force their way into it? I’ve no idea how accurate this cost calculator is, but it produces a cost of $4,064,659,300 for cracking 79 bits (assuming needing to work through half the 80-bit passphrase space to get a match). This is all ignoring the massive cracking time the Password Haystack calculator has given us, which tells us it’s impossible anyway regardless of cost!
- The attacker hasn’t just got to try every combination of those 80 bits in their brute force attack, they also need to look up each one they try on the blockchain to see if that’s the combination of bits that has the wallet on the blockchain. Every lookup takes time and resource too, which adds to the time a brute force attack would take.
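Since the whole argument hangs on the guess rate, here is a small Python sketch showing how the expected cracking time for the 80 unknown bits moves as that rate changes. The rates in the list are purely hypothetical values picked for illustration, not measurements of any real hardware, and the per-guess cost of deriving an address and looking it up is not modelled separately; it would simply push an attacker towards the slower rows.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60    # assumption: Julian year


def expected_years(unknown_bits, guesses_per_second):
    """Average time to hit the right value: half the space at a fixed rate."""
    average_attempts = 2 ** (unknown_bits - 1)
    return average_attempts / guesses_per_second / SECONDS_PER_YEAR


# Hypothetical guess rates (per second), for illustration only.
hypothetical_rates = [1e7, 1e9, 1e11, 1e13]

for rate in hypothetical_rates:
    print(f"{rate:.0e} guesses/s -> {expected_years(80, rate):,.0f} years")
```

Every factor of ten in guess rate takes a factor of ten off the time, so the conclusion is only as good as the assumption that one hundred billion guesses per second is a realistic ceiling.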