This repository was archived by the owner on Aug 18, 2020. It is now read-only.

Commit d34e0dd

Denis Shevchenko authored and Michael Hueschen committed
[CSLD-163] Improve internal document about HD wallets
1 parent fc88565 commit d34e0dd

File tree

1 file changed

+194
-106
lines changed


docs/hd.md

+194-106
@@ -1,124 +1,212 @@
11
# Cardano SL HD Wallets
22

3-
## General scheme
4-
5-
HD wallet has tree-based structure.
6-
The root of the tree is user's public-secret key pair.
7-
Each node in the tree corresponds to public-secret key pair.
8-
9-
Each child node is identified by an index of this child,
10-
so child's public and secret keys can be _derived_ from parent
11-
by this index using parent's public-secret key pair.
12-
On the other hand, one can't learn anything about parent
13-
in reasonable time, unless one has some extra info.
3+
## The problem
144

15-
## Tree structure
16-
17-
In general case tree structure can be arbitrary but
18-
we use specific structure.
19-
We have two layered tree, nodes of the first layer are called "accounts"
20-
and nodes of the second layer (leaves) are called "addresses" and are leaves of a tree.
21-
There can be potentially 2³² accounts and each account can have 2³² addresses,
22-
so wallet can have 2⁶⁴ addresses in total.
23-
24-
The overall scheme is represented on the picture below.
5+
When the user `U` creates a new wallet `W`, he is able to receive money from other
6+
users. To do it, `U` has to generate a new address `A` for his wallet `W`; after
7+
that other users will be able to send money to `A`.
258

26-
(wallet's public key and secret key)
27-
/ | \
28-
(account[0] (pk, sk)) (account[1] (pk, sk)) ... (account[2³²-1] (pk, sk))
29-
/ \ \
30-
(address[0][0] (pk, sk)) ... (address[0][2³²-1] (pk, sk)) ... (address[2³²-1][2³²-1] (pk, sk))
9+
Suppose another user sent 100 ADA to address `A`. Technically it means that a new
10+
transaction was created, and this transaction will be a part of some block in the
11+
Cardano blockchain. Later, during synchronization with the blockchain, wallet `W`
12+
will find address `A` and since this address was generated by `W`, this money is
13+
owned by user `U`, so 100 ADA will be shown as a balance of wallet `W`.
3114

32-
So each address belongs to one account.
33-
Address can contain money, be used as change address and so on.
15+
The main problem is to _prove_ that address `A` was generated by wallet `W`.
3416

35-
Balance of each account can be computed as sum of balances of addresses which belong to it,
36-
you can have different accounts for different purposes and so on.
17+
## HD wallet
3718

38-
## Derivation process
19+
An HD (Hierarchical Deterministic) wallet has a hierarchical, tree-based structure.
20+
The root of this tree is a pair of the user's secret key `RootSK` and the corresponding
21+
public key `RootPK`. `RootSK` is a key we obtain during creation of the wallet:
22+
by default `RootSK` is generated from a 12-word mnemonic (or "wallet backup
23+
phrase").
3924

40-
To derive secret key, parent secret key and passphrase are necessary.
41-
Derivation is done using `deriveHDSecretKey`.
25+
`RootSK` and `RootPK` are crucially important keys:
4226

43-
The deriving of public key can be performed using just a public key of a parent node.
44-
Derivation is done using `deriveHDPublicKey`.
27+
* Wallet's identifier is based on `RootPK`.
28+
* `RootSK` is used to derive new keys for generating new addresses from them (see below).
29+
* `RootPK` is used during synchronization with the blockchain to find out
30+
the current balance of this wallet.
4531

46-
It's not allowed to derive a public key from parent public key for
47-
indices which are greater than or equal to 2³¹:
48-
for these indicies parent secret key is required.
49-
This feature is called "hardened derivation":
32+
So whoever owns the root keys owns the wallet and all its money.
5033

51-
You can get description of underlying cryptography [here](https://cardanolaunch.com/assets/Ed25519_BIP.pdf).
34+
## Tree structure
5235

53-
Note: all these functions don't require derivation level number,
54-
only credential information (public key or secret key with passphrase)
55-
and index of a child.
56-
Also we can derive public and secret keys for any arbitrary number of levels.
36+
In the general case the structure of the tree can be arbitrary, but currently we use
37+
a two-layered structure: nodes of the first layer are called "accounts", nodes
38+
of the second layer are called "addresses" (you can think of them as leaves
39+
of the tree). Sometimes a layer is called a "level". It can be represented like
40+
this:
41+
42+
                 first               second
43+
              layer/level          layer/level
44+
45+
    Root --
46+
          `- Account [0] --
47+
          `               `- Address [0]
48+
          `               `- Address [1]
49+
          `               `  ...
50+
          `               `- Address [M]
51+
          `
52+
          `- Account [1] --
53+
          `               `- ...
54+
          `  ...
55+
          `- Account [N] --
56+
                          `- ...
57+
58+
Each address belongs to one account. There can be potentially 2³² accounts (`N`
59+
is a `Word32` value) and each account can have 2³² addresses (`M` is a `Word32`
60+
value as well), so a wallet can have 2⁶⁴ addresses in total.
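
The total can be checked with a few lines of Haskell. This is only an arithmetic sketch; the names `indexSpace` and `maxAddresses` are made up for illustration and are not taken from the Cardano SL code base.

```haskell
import Data.Word (Word32)

-- Number of distinct values a Word32 index can take (hypothetical helper name).
indexSpace :: Integer
indexSpace = toInteger (maxBound :: Word32) + 1

-- Accounts multiplied by addresses per account (hypothetical helper name).
maxAddresses :: Integer
maxAddresses = indexSpace * indexSpace
```

Loaded in GHCi, `maxAddresses == 2 ^ 64` evaluates to `True`.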
61+
62+
_Intuitively_, the balance of each account can be computed as the sum of the balances of
63+
the addresses which belong to it, and therefore the balance of the wallet is the sum of
64+
the balances of its accounts. The details are, however, much more subtle; please see
65+
the [formal wallet specification](#formal-specification).
66+
67+
Please note that the current structure of the tree will change in the future:
68+
the number of layers will be increased.
69+
70+
## Index and path
71+
72+
As shown above, each node of the wallet tree has an index, which identifies
73+
it among its siblings:
74+
75+
    Root --
76+
          `- Account A [0] --
77+
          `                 `- Address B [0]
78+
          `                 `- Address C [1]
79+
          `
80+
          `- Account D [1] --
81+
                            `- Address E [0]
82+
                            `- Address F [1]
83+
                            `- Address G [2]
84+
85+
In this example address `B` has index `0`, but address `E` has index `0` too.
86+
To identify an address uniquely we specify the full path from the root. This path
87+
is a list of node indexes, and since our tree is 2-layered, the path contains
88+
the indexes of the account and the address. Thus, address `C` has the path `[0, 1]` and
89+
address `G` has the path `[1, 2]`.
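
The example tree can be written down as a small lookup table keyed by full path. This is a minimal sketch for illustration only: `Path`, `exampleTree` and `addressesWithIndex` are hypothetical names, not types or functions from the Cardano SL code base.

```haskell
import Data.Word (Word32)

-- Hypothetical aliases for the two index layers.
type AccountIndex = Word32
type AddressIndex = Word32

-- A full path in the current two-layered tree: [account index, address index].
data Path = Path AccountIndex AddressIndex
  deriving (Eq, Show)

-- The addresses of the example tree, keyed by their full path.
exampleTree :: [(Path, String)]
exampleTree =
  [ (Path 0 0, "B")
  , (Path 0 1, "C")
  , (Path 1 0, "E")
  , (Path 1 1, "F")
  , (Path 1 2, "G")
  ]

-- An address index alone is ambiguous: several addresses may share it.
addressesWithIndex :: AddressIndex -> [String]
addressesWithIndex i = [name | (Path _ j, name) <- exampleTree, i == j]
```

Here `addressesWithIndex 0` returns both `"B"` and `"E"`, while `lookup (Path 0 1) exampleTree` returns only `Just "C"`.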
90+
91+
## Derivation, parent and child
92+
93+
To generate a new address for the wallet (addresses in an HD wallet are called
94+
HD addresses), we have to obtain a new pair of secret/public keys and generate
95+
a new address from it. An HD wallet allows us to _derive_ a new pair of keys from the
96+
`RootSK`. In fact, all the key pairs we need to generate all possible
97+
wallet addresses are derived from the `RootSK`. Let's call such keys _derived_
98+
ones, `dSK` and `dPK`, so address `B` from the last example was generated from
99+
`dSK_B`, address `E` from `dSK_E`, and so on.
100+
101+
To specify _relations_ between nodes of the wallet tree we define _parents_
102+
and _children_. Keys of the child node can be derived from keys of the parent
103+
node. So a particular address is a child of a particular account, and all
104+
accounts are children of the root:
105+
106+
    parent ......... child
107+
108+
    Root keys --
109+
               `--> Account [N] keys --
110+
                                      `--> Address [M] keys
111+
112+
                    parent ................ child
113+
114+
That's why the full path from the tree root to the tree leaf (here it is `[N, M]`)
115+
is called the "derivation path": to derive a child's SK we need not
116+
only the parent's SK but also the child's index (part of the path):
117+
118+
deriveChildSK :: ChildIndex -> ParentExtSK -> ChildExtSK
119+
120+
Please note that `ParentExtSK` and `ChildExtSK` are _extended_ keys. An extended key
121+
is a pair of an ordinary key and a special 32-byte seed (called the "chain code").
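
Walking a derivation path is a left fold of single-step derivation over the indexes in the path. The sketch below only illustrates that shape: `ExtSK` is a toy record, the `mix` step is an arbitrary arithmetic placeholder with no cryptographic value, and `derivePath` is a hypothetical helper. None of this is the real Cardano SL derivation.

```haskell
import Data.Bits (xor)
import Data.List (foldl')
import Data.Word (Word32, Word64)

-- Toy extended secret key: an "ordinary key" plus a "chain code".
-- Both fields are placeholders, NOT real key material.
data ExtSK = ExtSK
  { skMaterial :: Word64
  , chainCode  :: Word64
  } deriving (Eq, Show)

type ChildIndex = Word32

-- Same argument order as the signature above; parent and child share one type here.
-- The child depends on the parent key, the parent chain code and the child index.
deriveChildSK :: ChildIndex -> ExtSK -> ExtSK
deriveChildSK i (ExtSK k c) = ExtSK (mix (k `xor` c)) (mix c)
  where
    -- Arbitrary toy mixing step (wraps around on Word64 overflow).
    mix x = (x `xor` fromIntegral i) * 6364136223846793005 + 1442695040888963407

-- Derive the key at the end of a path by applying one step per index,
-- starting from the root (hypothetical helper).
derivePath :: [ChildIndex] -> ExtSK -> ExtSK
derivePath path root = foldl' (flip deriveChildSK) root path
```

With this shape, `derivePath [n, m] root` is the same as first deriving the account key with index `n` and then the address key with index `m`.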
57122

58123
## HD address payload
59124

60-
"Account" and "address" entities were mentioned,
61-
but only addresses appear in the blockchain.
62-
When we create a transaction, we specify address and coins for each `TxOut`.
63-
So neither transaction or block knows anything about accounts and wallets.
64-
65-
We want to be able to track our addresses in the blockchain
66-
to compute balance of wallet and related accounts,
67-
but as it is said in [General scheme](#general-scheme) section we're not able
68-
to determine a parent without some particular info.
69-
So we can't determine account and wallet which address belongs to.
70-
71-
For each leaf let's store path from the root to this leaf along with an address.
72-
So path is a list of derivation indices on each level.
73-
To hide path from other parties we will encrypt it.
74-
This encrypted derivation path is called `HDAddressPayload`
75-
and stored in attributes of `Address` datatype.
76-
77-
Note: length of derivation path in our case is two: `[account index, address index]`.
78-
79-
## Payload encryption
80-
81-
To encrypt derivation path let's use AEAD scheme using ChaCha20 and Poly1305.
82-
More information is available [here](https://tools.ietf.org/html/rfc7539).
83-
Hash of root public key is used as a symmetric key for ChaChaPoly1305.
84-
This hash called `HDPassphrase` and function `deriveHDPassphrase` generates
85-
it by root public key.
86-
87-
Encryption of payload performs `packHDAddressAttr` which takes `HDPassphrase` and
88-
derivation path and returns `HDAddressPayload`.
89-
It serializes derivation path using `Bi` instance for it and then encrypts produced bytes sequence.
90-
91-
## Payload decryption
92-
93-
To decrypt encrypted derivation path we have to derive `HDPassphrase` again
94-
from root public key and then try to decrypt `HDAddressPayload`.
95-
If it's successfully decrypted then it implies address belongs to our tree
96-
and, vice versa, address doesn't belong to our tree if decryption is failed.
97-
98-
Note: publishing of root public key gives other parties opportunity to reveal
99-
all your addresses in the blockchain.
100-
So it is unsafe to share root pk (like it can be done with a usual pk)
101-
because it implies user deanonymization.
102-
103-
## Recovery process
104-
105-
So if we have a root secret key (or even root public key),
106-
we can iterate over the whole blockchain and try to decrypt all met
107-
addresses to determine which of them belong to our wallet.
108-
We retrieve addresses along with whole hierarhy of tree because
109-
decrypted derivation paths describe it.
110-
111-
Note: we are interested only in addresses mentioned in blockchain,
112-
and we consider not mentioned addresses having zero balance.
113-
So we don't store not mentioned addresses anywhere.
125+
"Account" and "address" entities were mentioned, but only addresses appear in
126+
the blockchain. When we create a transaction, we specify a destination address
127+
and coins for each output of the transaction. So neither a transaction nor a block knows
128+
anything about accounts and wallets.
129+
130+
We want to be able to track our addresses in the blockchain to compute our wallet
131+
balance (because money on our addresses is ours). To prove that a particular
132+
address was generated by our wallet we need some special information stored in
133+
the address.
134+
135+
The idea is this: let's store the derivation path in the address. For example, when we
136+
derive a new `dSK` for the node with the path `[12456, 10]`, let's generate a
137+
new address `A` from `dSK` and store this path in `A`. To hide this path from
138+
other parties we will encrypt it (see below). This serialized and
139+
encrypted derivation path, which we call `HDAddressPayload`, becomes a part of the address.
140+
You can think of this payload as a secret stamp we put on all our addresses:
141+
everyone can see it but only we can read it. This corresponds to the current
142+
implementation, but it is going to change in the future.
143+
144+
To encrypt the address payload we use a special passphrase we call `HDPassphrase`.
145+
We don't need to store this passphrase because it can be derived from the
146+
`RootPK` (technically `HDPassphrase` is an HMAC-SHA512 hash of the `RootPK`).
147+
148+
So, during synchronization with the blockchain we take some address, extract
149+
encrypted payload from it and then try to decrypt it using our `HDPassphrase`.
150+
If it's successfully decrypted then it proves that this address belongs to our
151+
wallet and, vice versa, the address doesn't belong to our wallet if decryption
152+
fails.
153+
154+
After successful decryption of all our addresses we get the whole hierarchy of the
155+
wallet tree because decrypted derivation paths describe it.
156+
157+
So now it's obvious that if you share your `RootPK`, you give anyone the
158+
ability to reveal all your addresses in the blockchain!
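
The ownership check during synchronization can be modelled as "keep every payload that opens with our passphrase". The sketch below is an idealised model with no real encryption: the payload simply remembers which passphrase sealed it, and its constructor should be treated as opaque. The bodies of `deriveHDPassphrase`, `packPayload`, `unpackPayload` and `ownPaths` are illustrative stand-ins, and the last three names are hypothetical.

```haskell
import Data.Maybe (mapMaybe)
import Data.Word (Word32)

type Path = [Word32]

newtype PublicKey    = PublicKey String    deriving (Eq, Show)
newtype HDPassphrase = HDPassphrase String deriving (Eq, Show)

-- Stand-in for the real derivation: the passphrase depends only on RootPK.
deriveHDPassphrase :: PublicKey -> HDPassphrase
deriveHDPassphrase (PublicKey pk) = HDPassphrase ("passphrase-of-" ++ pk)

-- Idealised payload: records the sealing passphrase instead of encrypting.
data HDAddressPayload = HDAddressPayload HDPassphrase Path
  deriving (Eq, Show)

-- "Encrypt" a derivation path under a passphrase.
packPayload :: HDPassphrase -> Path -> HDAddressPayload
packPayload = HDAddressPayload

-- "Decrypt": succeeds only with the passphrase that sealed the payload.
unpackPayload :: HDPassphrase -> HDAddressPayload -> Maybe Path
unpackPayload pass (HDAddressPayload sealedWith path)
  | pass == sealedWith = Just path
  | otherwise          = Nothing

-- Synchronization: collect the derivation paths of all addresses that are ours.
ownPaths :: PublicKey -> [HDAddressPayload] -> [Path]
ownPaths rootPK = mapMaybe (unpackPayload (deriveHDPassphrase rootPK))
```

Note that `ownPaths` takes only a `PublicKey`: in this model, as in the text, anyone who knows `RootPK` can list the wallet's addresses.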
114159

115160
## Transaction creation
116161

117-
To spend `TxIn` wallet application must sign them using secret key and provide
118-
public key as a witness.
119-
For HD wallets we do the same thing: we derive secret and public key of
120-
address corresponding to `TxIn` from root secret key using user's passphrase
121-
and sign `TxIn` by derived secret key and put public key as a witness.
122-
123-
Note: although we can reveal all addresses just knowing root public key,
124-
to spend `TxIn`s root secret key and user's passphrase are needed.
162+
Suppose that we derived `dSK_A` and generated a new address `A` from it. Some
163+
other user `OU` sent 100 ADA to the address `A`. Technically it means that `OU`
164+
created a new transaction `T1` and its output `O` contains the address `A` and
165+
this amount.
166+
167+
Now we want to send these 100 ADA to another user `AU`. Technically it means
168+
that we have to create a new transaction `T2` whose input `I` refers to
169+
the output `O` of `T1`:
170+
171+
        T1                   T2
172+
    +--------+           +--------+
173+
    |        |           |        |
174+
    |      +---+       +---+      |
175+
    |      | O |------>| I |      |
176+
    |      +---+       +---+      |
177+
    |        |           |        |
178+
    +--------+           +--------+
179+
180+
Now we must prove that we have the right to spend money from the address `A`
181+
(which is in `O`). To do it we must prove that address `A` was generated by
182+
our wallet (because if so, money on this address is ours and we obviously
183+
may spend it).
184+
185+
This is where the derivation path helps us: since address `A` contains this path,
186+
we can take it and derive `dSK_A` and the corresponding `dPK_A`. This is how we prove
187+
that address `A` is ours: only we could derive `dPK_A`, which means that only
188+
we could generate address `A`.
189+
190+
Now we have to create a _witness_ for each input of the transaction. A witness is
191+
our proof that we may spend money from the corresponding address. It includes
192+
a signature `S` and a public key `PK`:
193+
194+
1. `S` is a signature of the entire transaction (technically it's a signature of
195+
the hash of the transaction),
196+
2. `PK` is the derived public key that corresponds to the derived secret key which
197+
was used to generate our address.
198+
199+
In our example address `A` was generated from `dSK_A`, so witness for `I` will
200+
include:
201+
202+
1. A signature of the transaction `T2`,
203+
2. Public key `dPK_A` that corresponds to `dSK_A`.
204+
205+
Please note: although we can reveal all of a wallet's addresses just by knowing the `RootPK`
206+
of this wallet, to spend money `RootSK` is needed.
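
The two parts of a witness, and the check a validator would run against them, can be modelled as follows. This is an idealised sketch with no real signatures: a `Signature` simply records who signed what, an `Address` simply wraps the key it was generated from, and all names (`mkWitness`, `validWitness`, and so on) are hypothetical.

```haskell
newtype SecretKey = SecretKey Int   deriving (Eq, Show)
newtype PublicKey = PublicKey Int   deriving (Eq, Show)
newtype TxId      = TxId String     deriving (Eq, Show)
newtype Address   = Address Int     deriving (Eq, Show)

-- Toy key pair relation: the public key carries the same number as the secret key.
toPublic :: SecretKey -> PublicKey
toPublic (SecretKey n) = PublicKey n

-- In this model an address is fully determined by the public key behind it.
addressFromPK :: PublicKey -> Address
addressFromPK (PublicKey n) = Address n

-- Idealised signature: remembers the signer's public key and the signed transaction.
data Signature = Signature PublicKey TxId
  deriving (Eq, Show)

sign :: SecretKey -> TxId -> Signature
sign sk tx = Signature (toPublic sk) tx

-- A witness is a signature of the transaction plus the derived public key.
data Witness = Witness
  { witSignature :: Signature
  , witPublicKey :: PublicKey
  } deriving (Eq, Show)

-- Build the witness for an input from the derived secret key of its address.
mkWitness :: SecretKey -> TxId -> Witness
mkWitness dSK tx = Witness (sign dSK tx) (toPublic dSK)

-- Accept only if the public key matches the address being spent from and
-- the signature was made with the matching key over this very transaction.
validWitness :: Address -> TxId -> Witness -> Bool
validWitness addr tx (Witness (Signature signer signedTx) pk) =
  addressFromPK pk == addr && signer == pk && signedTx == tx
```

In terms of the example, `mkWitness dSK_A t2` passes `validWitness` for address `A` and transaction `T2`, while a witness built from any other secret key, or for another transaction, is rejected.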
207+
208+
## Formal specification
209+
210+
Please note that the purpose of this document is to provide a high-level overview
211+
of HD wallets. For technical details please read [Formal specification for a Cardano
212+
wallet](https://cardanodocs.com/files/formal-specification-of-the-cardano-wallet.pdf).
