author: Paul Miller <paul@paulmillr.com> 2023-12-20 13:22:02 +0100
committer: GitHub <noreply@github.com> 2023-12-20 09:22:02 -0300
commit: 822b70a565678222706dd6284eb7abcaadfc5115
tree: 4cd71d068dfcebc39d418fed61d9c38d0baf752c /44.md
parent: 09f8244e6fb5271a65a51cbbfe2b9503ee8783f3

NIP44 encryption standard, revision 3 (#746)

* Introduce NIP-44 encryption standard
* Finalize NIP-44
* Update spec.

Co-authored-by: Jonathan Staab <shtaab@gmail.com>

Diffstat (limited to '44.md')
-rw-r--r-- 44.md 296
1 files changed, 296 insertions, 0 deletions
diff --git a/44.md b/44.md
new file mode 100644
index 0000000..1282d26
--- /dev/null
+++ b/44.md
@@ -0,0 +1,296 @@
# NIP-44

## Encrypted Payloads (Versioned)

`optional` `author:paulmillr` `author:staab`

This NIP introduces a new data format for keypair-based encryption. It is versioned
to allow multiple algorithm choices to exist simultaneously.

Nostr is a key directory. Every nostr user has their own public key, which solves key
distribution problems present in other solutions. The goal of this NIP is to have a
simple way to send messages between nostr accounts that cannot be read by everyone.

The scheme has a number of important shortcomings:

- No deniability: it is possible to prove the event was signed by a particular key
- No forward secrecy: when a user key is compromised, it is possible to decrypt all previous conversations
- No post-compromise security: when a user key is compromised, it is possible to decrypt all future conversations
- No post-quantum security: a powerful quantum computer would be able to decrypt the messages
- IP address leak: the user's IP address may be seen by relays and all intermediaries between user and relay
- Date leak: the message date is public, since it is part of the NIP-01 event
- Limited message size leak: padding only partially obscures the true message length
- No attachments: they are not supported

Lack of forward secrecy is partially mitigated: 1) the messages
should only be stored on relays specified by the user, instead of on the set of
all public relays; 2) the relays are supposed to regularly delete older messages.

For risky situations, users should chat in specialized E2EE messaging software and limit use
of nostr to exchanging contacts.

## Dependence on NIP-01

It's not enough to use NIP-44 for encryption: the output must also be signed.

In nostr's case, the payload is serialized and signed as per NIP-01 rules.

The same event can be serialized in two different ways,
resulting in two distinct signatures. So, it's important
to ensure that the serialization rules, which are defined in NIP-01,
are the same across different NIP-44 implementations.

After serialization, the event is signed with a Schnorr signature over secp256k1,
as defined in BIP340. It's important to ensure key and signature validity as
per BIP340 rules.

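To make the serialization point concrete, here is a minimal sketch of how a NIP-01 event id could be computed: the event is serialized as the JSON array `[0, pubkey, created_at, kind, tags, content]` without extra whitespace, and the SHA256 of the UTF-8 bytes is the value that gets signed. The helper name, the kind number and the sample values are assumptions made for illustration; NIP-01 stays the authoritative reference, including its escaping rules, which `json.dumps` is only assumed to match for ordinary content.

```python
import hashlib
import json

def nip01_event_id(pubkey_hex, created_at, kind, tags, content):
    """Hash the compact NIP-01 serialization of an event (illustrative helper)."""
    serialized = json.dumps(
        [0, pubkey_hex, created_at, kind, tags, content],
        separators=(",", ":"),  # no whitespace between tokens
        ensure_ascii=False,     # keep non-ASCII characters as UTF-8
    )
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()

# Hypothetical values: `content` would hold the base64 NIP-44 payload.
event_id = nip01_event_id(
    "ab" * 32, 1700000000, 1, [["p", "cd" * 32]], "base64-payload-goes-here"
)
print(event_id)
```

Serializing the same fields with different whitespace yields a different hash, which is why every implementation has to follow the same rules.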
## Versions

Currently defined encryption algorithms:

- `0x00` - Reserved
- `0x01` - Deprecated and undefined
- `0x02` - secp256k1 ECDH, HKDF, padding, ChaCha20, HMAC-SHA256, base64

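Since the version is the first byte of the decoded payload, a client can inspect it before doing any other work. The sketch below is one possible way to do that; the function name and the `SUPPORTED_VERSIONS` set are assumptions, not part of the specification.

```python
import base64

SUPPORTED_VERSIONS = {2}  # only 0x02 is defined by this document

def payload_version(payload):
    """Return the version byte of a base64 payload without decoding all of it."""
    if len(payload) < 4 or payload[0] == "#":
        raise ValueError("unknown version")
    # The first 4 base64 characters decode to the first 3 bytes.
    version = base64.b64decode(payload[:4])[0]
    if version not in SUPPORTED_VERSIONS:
        raise ValueError("unknown version " + str(version))
    return version

print(payload_version("AgAAAAAA"))
```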
## Version 2

The algorithm choices are justified as follows:

- Encrypt-then-MAC-then-sign instead of encrypt-then-sign-then-MAC:
  only events wrapped in a NIP-01 signed envelope are currently accepted by nostr.
- ChaCha instead of AES: it's faster and has
  [better security against multi-key attacks](https://datatracker.ietf.org/doc/draft-irtf-cfrg-aead-limits/)
- ChaCha instead of XChaCha: XChaCha has not been standardized. Also, we don't need XChaCha's improved
  collision resistance of nonces: every message has a new (key, nonce) pair.
- HMAC-SHA256 instead of Poly1305: polynomial MACs are much easier to forge
- SHA256 instead of SHA3 or BLAKE: it is already used in nostr. Also, BLAKE's
  speed advantage is smaller in non-parallel environments
- Custom padding instead of padmé: better leakage reduction for small messages
- Base64 encoding instead of another encoding algorithm: it is widely available
  and is already used in nostr

### Functions and operations

- Cryptographic methods
  - `secure_random_bytes(length)` fetches randomness from a CSPRNG
  - `hkdf(IKM, salt, info, L)` represents HKDF [(RFC 5869)](https://datatracker.ietf.org/doc/html/rfc5869) with the SHA256 hash function,
    composed of the methods `hkdf_extract(IKM, salt)` and `hkdf_expand(OKM, info, L)`
  - `chacha20(key, nonce, data)` is ChaCha20 [(RFC 8439)](https://datatracker.ietf.org/doc/html/rfc8439), with the starting counter set to 0
  - `hmac_sha256(key, message)` is HMAC [(RFC 2104)](https://datatracker.ietf.org/doc/html/rfc2104)
  - `secp256k1_ecdh(priv_a, pub_b)` is multiplication of point B by
    scalar a (`a ⋅ B`), defined in
    [BIP340](https://github.com/bitcoin/bips/blob/e918b50731397872ad2922a1b08a5a4cd1d6d546/bip-0340.mediawiki).
    The operation produces a shared point, and we encode the shared point's 32-byte x coordinate,
    using the method `bytes(P)` from BIP340. Private and public keys must be validated
    as per BIP340: the pubkey must be a valid, on-curve point, and the private key must be a scalar in the range `[1, secp256k1_order - 1]`
- Operators
  - `x[i:j]`, where `x` is a byte array and `i, j >= 0`,
    returns a `(j - i)`-byte array with a copy of the `i`-th byte (inclusive) to the `j`-th byte (exclusive) of `x`
- Constants `c`:
  - `min_plaintext_size` is 1. A 1-byte message is padded to 32 bytes.
  - `max_plaintext_size` is 65535 (64 KiB - 1). It is padded to 65536 bytes.
- Functions
  - `base64_encode(bytes)` and `base64_decode(string)` are Base64 ([RFC 4648](https://datatracker.ietf.org/doc/html/rfc4648), with padding)
  - `concat` refers to byte array concatenation
  - `is_equal_ct(a, b)` is a constant-time equality check of 2 byte arrays
  - `utf8_encode(string)` and `utf8_decode(bytes)` transform a string to a byte array and back
  - `write_u8(number)` restricts number to values 0..255 and encodes it into a one-byte uint8 array
  - `write_u16_be(number)` restricts number to values 0..65535 and encodes it into a Big-Endian uint16 byte array
  - `zeros(length)` creates a byte array of length `length >= 0`, filled with zeros
  - `floor(number)` and `log2(number)` are well-known mathematical methods

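Several of the primitives listed above can be approximated with the Python standard library alone. The following is a sketch under that assumption rather than a reference implementation; the function names mirror the list, and `hkdf_expand` keeps this document's `OKM` parameter name even though RFC 5869 calls that input PRK.

```python
import hashlib
import hmac
import secrets

def secure_random_bytes(length):
    return secrets.token_bytes(length)  # backed by the OS CSPRNG

def hmac_sha256(key, message):
    return hmac.new(key, message, hashlib.sha256).digest()

def hkdf_extract(IKM, salt):
    # RFC 5869: PRK = HMAC-Hash(salt, IKM); the salt acts as the HMAC key.
    return hmac_sha256(salt, IKM)

def hkdf_expand(OKM, info, L):
    # T(i) = HMAC(PRK, T(i-1) || info || i), concatenated and cut to L bytes.
    if L > 255 * 32:
        raise ValueError("requested length too large")
    output, block, counter = b"", b"", 1
    while len(output) < L:
        block = hmac_sha256(OKM, block + info + bytes([counter]))
        output += block
        counter += 1
    return output[:L]

def is_equal_ct(a, b):
    return hmac.compare_digest(a, b)  # constant-time comparison

prk = hkdf_extract(b"input keying material", b"salt")
print(len(prk), len(hkdf_expand(prk, b"info", 76)))
```

ChaCha20 and secp256k1 ECDH are not in the standard library, so a real client would take them from a vetted cryptographic library.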
User-defined functions:

```py
# Calculates length of the padded byte array.
def calc_padded_len(unpadded_len):
  # Messages of up to 32 bytes always pad to 32. Checking this first
  # also avoids evaluating log2(0) when unpadded_len is 1.
  if unpadded_len <= 32:
    return 32
  next_power = 1 << (floor(log2(unpadded_len - 1)) + 1)
  if next_power <= 256:
    chunk = 32
  else:
    chunk = next_power // 8
  return chunk * ((unpadded_len - 1) // chunk + 1)

# Converts unpadded plaintext to padded bytearray
def pad(plaintext):
  unpadded = utf8_encode(plaintext)
  unpadded_len = len(unpadded)  # length in bytes, not in characters
  if (unpadded_len < c.min_plaintext_size or
      unpadded_len > c.max_plaintext_size): raise Exception('invalid plaintext length')
  prefix = write_u16_be(unpadded_len)
  suffix = zeros(calc_padded_len(unpadded_len) - unpadded_len)
  return concat(prefix, unpadded, suffix)

# Converts padded bytearray to unpadded plaintext
def unpad(padded):
  # read_uint16_be is the inverse of write_u16_be
  unpadded_len = read_uint16_be(padded[0:2])
  unpadded = padded[2:2+unpadded_len]
  if (unpadded_len == 0 or
      len(unpadded) != unpadded_len or
      len(padded) != 2 + calc_padded_len(unpadded_len)): raise Exception('invalid padding')
  return utf8_decode(unpadded)

# metadata: always 65b (version: 1b, nonce: 32b, mac: 32b)
# plaintext: 1b to 0xffff
# padded plaintext: 32b to 65536b
# ciphertext: 32b+2 to 65536b+2
# raw payload: 99 (65+32+2) to 65603 (65+65536+2)
# encoded payload (base64): 132b to 87472b
def decode_payload(payload):
  plen = len(payload)
  if plen == 0 or payload[0] == '#': raise Exception('unknown version')
  if plen < 132 or plen > 87472: raise Exception('invalid payload size')
  data = base64_decode(payload)
  dlen = len(data)
  if dlen < 99 or dlen > 65603: raise Exception('invalid data size')
  vers = data[0]
  if vers != 2: raise Exception('unknown version ' + str(vers))
  nonce = data[1:33]
  ciphertext = data[33:dlen - 32]
  mac = data[dlen - 32:dlen]
  return (nonce, ciphertext, mac)

# Calculates the MAC over concat(aad, message)
def hmac_aad(key, message, aad):
  if len(aad) != 32: raise Exception('AAD associated data must be 32 bytes')
  return hmac_sha256(key, concat(aad, message))

# Calculates long-term key between users A and B:
# `get_conversation_key(Apriv, Bpub) == get_conversation_key(Bpriv, Apub)`
def get_conversation_key(private_key_a, public_key_b):
  shared_x = secp256k1_ecdh(private_key_a, public_key_b)
  return hkdf_extract(IKM=shared_x, salt=utf8_encode('nip44-v2'))

# Calculates unique per-message key
def get_message_keys(conversation_key, nonce):
  if len(conversation_key) != 32: raise Exception('invalid conversation_key length')
  if len(nonce) != 32: raise Exception('invalid nonce length')
  keys = hkdf_expand(OKM=conversation_key, info=nonce, L=76)
  chacha_key = keys[0:32]
  chacha_nonce = keys[32:44]
  hmac_key = keys[44:76]
  return (chacha_key, chacha_nonce, hmac_key)

def encrypt(plaintext, conversation_key, nonce):
  (chacha_key, chacha_nonce, hmac_key) = get_message_keys(conversation_key, nonce)
  padded = pad(plaintext)
  ciphertext = chacha20(key=chacha_key, nonce=chacha_nonce, data=padded)
  mac = hmac_aad(key=hmac_key, message=ciphertext, aad=nonce)
  return base64_encode(concat(write_u8(2), nonce, ciphertext, mac))

def decrypt(payload, conversation_key):
  (nonce, ciphertext, mac) = decode_payload(payload)
  (chacha_key, chacha_nonce, hmac_key) = get_message_keys(conversation_key, nonce)
  calculated_mac = hmac_aad(key=hmac_key, message=ciphertext, aad=nonce)
  if not is_equal_ct(calculated_mac, mac): raise Exception('invalid MAC')
  padded_plaintext = chacha20(key=chacha_key, nonce=chacha_nonce, data=ciphertext)
  return unpad(padded_plaintext)

# Usage:
#   conversation_key = get_conversation_key(sender_privkey, recipient_pubkey)
#   nonce = secure_random_bytes(32)
#   payload = encrypt('hello world', conversation_key, nonce)
#   'hello world' == decrypt(payload, conversation_key)
```

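The padding rule is easier to follow with concrete numbers. Below is a runnable sketch of `calc_padded_len`, `pad` and `unpad` that uses integer arithmetic (`bit_length`) in place of `floor(log2(...))`; that substitution and the use of `struct` for the length prefix are choices made for this example, not something the pseudocode prescribes.

```python
import struct

def calc_padded_len(unpadded_len):
    if unpadded_len <= 32:
        return 32
    # Integer equivalent of 1 << (floor(log2(unpadded_len - 1)) + 1)
    next_power = 1 << (unpadded_len - 1).bit_length()
    chunk = 32 if next_power <= 256 else next_power // 8
    return chunk * ((unpadded_len - 1) // chunk + 1)

def pad(plaintext):
    unpadded = plaintext.encode("utf-8")
    if not 1 <= len(unpadded) <= 65535:
        raise ValueError("invalid plaintext length")
    prefix = struct.pack(">H", len(unpadded))  # big-endian u16 length
    suffix = bytes(calc_padded_len(len(unpadded)) - len(unpadded))
    return prefix + unpadded + suffix

def unpad(padded):
    (unpadded_len,) = struct.unpack(">H", padded[:2])
    unpadded = padded[2:2 + unpadded_len]
    if (unpadded_len == 0 or len(unpadded) != unpadded_len
            or len(padded) != 2 + calc_padded_len(unpadded_len)):
        raise ValueError("invalid padding")
    return unpadded.decode("utf-8")

for n in (1, 32, 33, 100, 256, 257, 1000, 65535):
    print(n, "->", calc_padded_len(n))
```

Up to 256 bytes the padded length moves in steps of 32; beyond that the step is one eighth of the next power of two, so a 257-byte message pads to 320 bytes and a 1000-byte message to 1024.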
#### Encryption

1. Calculate conversation key
   - Execute ECDH (scalar multiplication) of public key B by private key A.
     Output `shared_x` must be the unhashed, 32-byte encoded x coordinate of the shared point.
   - Use HKDF-extract with sha256, `IKM=shared_x` and `salt=utf8_encode('nip44-v2')`
   - The HKDF output is the `conversation_key` between the two users
   - It is always the same when key roles are swapped: `conv(a, B) == conv(b, A)`
2. Generate a random 32-byte nonce
   - Always use a [CSPRNG](https://en.wikipedia.org/wiki/Cryptographically_secure_pseudorandom_number_generator)
   - Don't generate the nonce from message content
   - Don't re-use the same nonce between messages: doing so would make them decryptable,
     but won't leak the long-term key
3. Calculate message keys
   - The keys are generated from `conversation_key` and `nonce`. Validate that both are 32 bytes
   - Use HKDF-expand, with sha256, `OKM=conversation_key`, `info=nonce` and `L=76`
   - Slice the 76-byte HKDF output into: `chacha_key` (bytes 0..32), `chacha_nonce` (bytes 32..44), `hmac_key` (bytes 44..76)
4. Add padding
   - Content must be encoded from a UTF-8 string into a byte array
   - Validate plaintext length. Minimum is 1 byte, maximum is 65535 bytes
   - Padding format is: `[plaintext_length: u16][plaintext][zero_bytes]`
   - Padding algorithm is related to powers-of-two, with a min padded msg size of 32
   - Plaintext length is encoded in big-endian as the first 2 bytes of the padded blob
5. Encrypt padded content
   - Use ChaCha20, with key and nonce from step 3
6. Calculate MAC (message authentication code) with AAD
   - AAD is used: instead of calculating the MAC over the ciphertext alone,
     it's calculated over a concatenation of `nonce` and `ciphertext`
   - Validate that AAD (nonce) is 32 bytes
7. Base64-encode (with padding) params: `concat(version, nonce, ciphertext, mac)`

After encryption, it's necessary to sign the result. Use NIP-01 to serialize the event,
with the resulting base64 assigned to the event's `content`. Then, use NIP-01 to sign
the event using the Schnorr signature scheme over secp256k1.

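Steps 3 to 7 can be sketched end to end in plain Python. This is an illustration only: the ChaCha20 routine is a compact pure-Python rendering of RFC 8439 written for this example (a production client should rely on a vetted library), the conversation key is assumed to be already derived, and the key and nonce in the usage lines are the values from the test vector in the "Tests and code" section. A real nonce must come from a CSPRNG, for example `secrets.token_bytes(32)`.

```python
import base64
import hashlib
import hmac
import struct

def hkdf_expand(prk, info, length):
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

def _rotl(v, n):
    return ((v << n) & 0xFFFFFFFF) | (v >> (32 - n))

def _quarter_round(s, a, b, c, d):
    s[a] = (s[a] + s[b]) & 0xFFFFFFFF
    s[d] = _rotl(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & 0xFFFFFFFF
    s[b] = _rotl(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & 0xFFFFFFFF
    s[d] = _rotl(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & 0xFFFFFFFF
    s[b] = _rotl(s[b] ^ s[c], 7)

# Four column rounds followed by four diagonal rounds (one double round).
_ROUNDS = [(0, 4, 8, 12), (1, 5, 9, 13), (2, 6, 10, 14), (3, 7, 11, 15),
           (0, 5, 10, 15), (1, 6, 11, 12), (2, 7, 8, 13), (3, 4, 9, 14)]

def chacha20(key, nonce, data):
    """XOR data with the ChaCha20 keystream, block counter starting at 0."""
    constants = struct.unpack("<4I", b"expand 32-byte k")
    key_words = struct.unpack("<8I", key)
    nonce_words = struct.unpack("<3I", nonce)
    stream = bytearray()
    for counter in range((len(data) + 63) // 64):
        init = list(constants + key_words + (counter,) + nonce_words)
        state = init[:]
        for _ in range(10):  # 10 double rounds = 20 rounds
            for indices in _ROUNDS:
                _quarter_round(state, *indices)
        words = [(x + y) & 0xFFFFFFFF for x, y in zip(state, init)]
        stream += struct.pack("<16I", *words)
    return bytes(d ^ k for d, k in zip(data, stream))

def calc_padded_len(n):
    if n <= 32:
        return 32
    next_power = 1 << (n - 1).bit_length()
    chunk = 32 if next_power <= 256 else next_power // 8
    return chunk * ((n - 1) // chunk + 1)

def pad(plaintext):
    raw = plaintext.encode("utf-8")
    if not 1 <= len(raw) <= 65535:
        raise ValueError("invalid plaintext length")
    return struct.pack(">H", len(raw)) + raw + bytes(calc_padded_len(len(raw)) - len(raw))

def encrypt(plaintext, conversation_key, nonce):
    if len(conversation_key) != 32 or len(nonce) != 32:
        raise ValueError("conversation_key and nonce must be 32 bytes")
    keys = hkdf_expand(conversation_key, nonce, 76)                   # step 3
    chacha_key, chacha_nonce, hmac_key = keys[:32], keys[32:44], keys[44:76]
    ciphertext = chacha20(chacha_key, chacha_nonce, pad(plaintext))   # steps 4-5
    mac = hmac.new(hmac_key, nonce + ciphertext, hashlib.sha256).digest()  # step 6
    return base64.b64encode(b"\x02" + nonce + ciphertext + mac).decode("ascii")  # step 7

conversation_key = bytes.fromhex("c41c775356fd92eadc63ff5a0dc1da211b268cbea22316767095b2871ea1412d")
nonce = bytes(31) + b"\x01"
payload = encrypt("a", conversation_key, nonce)
print(len(payload), payload)
```

The printed payload can be compared with the `payload` field of the published test vector as a sanity check.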
#### Decryption

Before decryption, it's necessary to validate the message's pubkey and signature.
The public key must be a valid non-zero secp256k1 curve point, and the signature must be a valid
secp256k1 Schnorr signature. For exact validation rules, refer to BIP-340.

1. Check if the payload's first character is `#`
   - `#` is an optional future-proof flag that means non-base64 encoding is used
   - The `#` is not present in the base64 alphabet, but, instead of throwing `base64 is invalid`,
     an app must say the encryption version is not yet supported
2. Decode base64
   - Base64 is decoded into `version, nonce, ciphertext, mac`
   - If the version is unknown, an app must say the encryption version is not yet supported
   - Validate the length of the base64 message to prevent DoS on the base64 decoder: it can be in the range from 132 to 87472 chars
   - Validate the length of the decoded message to verify the output of the decoder: it can be in the range from 99 to 65603 bytes
3. Calculate conversation key
   - See step 1 of Encryption
4. Calculate message keys
   - See step 3 of Encryption
5. Calculate MAC (message authentication code) with AAD and compare
   - Stop and throw an error if the MAC doesn't match the decoded one from step 2
   - Use a constant-time comparison algorithm
6. Decrypt ciphertext
   - Use ChaCha20 with key and nonce from step 4
7. Remove padding
   - Read the first two BE bytes of plaintext that correspond to plaintext length
   - Verify that the length of the sliced plaintext matches the value of the two BE bytes
   - Verify that the padding calculated as in step 4 of Encryption matches the actual padding

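The checks that happen before any ciphertext is decrypted (steps 1, 2, 4 and 5) can be sketched with the standard library. The function names are assumptions made for this example, the conversation key is taken as given, and the sketch deliberately stops before the ChaCha20 step.

```python
import base64
import hashlib
import hmac

def decode_payload(payload):
    plen = len(payload)
    if plen == 0 or payload[0] == "#":
        raise ValueError("unknown version")
    if plen < 132 or plen > 87472:
        raise ValueError("invalid payload size")
    data = base64.b64decode(payload, validate=True)
    dlen = len(data)
    if dlen < 99 or dlen > 65603:
        raise ValueError("invalid data size")
    if data[0] != 2:
        raise ValueError("unknown version " + str(data[0]))
    return data[1:33], data[33:dlen - 32], data[dlen - 32:]

def hkdf_expand(prk, info, length):
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

def check_mac(payload, conversation_key):
    """Return (nonce, ciphertext) if the MAC is valid, otherwise raise."""
    nonce, ciphertext, mac = decode_payload(payload)
    hmac_key = hkdf_expand(conversation_key, nonce, 76)[44:76]
    expected = hmac.new(hmac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, mac):  # constant-time comparison
        raise ValueError("invalid MAC")
    return nonce, ciphertext
```

Verifying the MAC before decrypting means a tampered payload is rejected without ever running ChaCha20 or the padding parser on attacker-controlled data.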
## Tests and code

A collection of implementations in different languages is
available [on GitHub](https://github.com/paulmillr/nip44).

We publish extensive test vectors. Instead of including them in the
document directly, a sha256 checksum of the vectors is provided:

    269ed0f69e4c192512cc779e78c555090cebc7c785b609e338a62afc3ce25040  nip44.vectors.json

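A downloaded copy of the vectors can be checked against this checksum before use. The sketch below assumes the file has been saved locally; the helper names are illustrative.

```python
import hashlib

EXPECTED_SHA256 = "269ed0f69e4c192512cc779e78c555090cebc7c785b609e338a62afc3ce25040"

def sha256_file(path, chunk_size=65536):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def vectors_file_matches(path):
    return sha256_file(path) == EXPECTED_SHA256

# Usage, assuming the file sits in the current directory:
#   vectors_file_matches("nip44.vectors.json")
```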
Example of test vector from the file:

```json
{
  "sec1": "0000000000000000000000000000000000000000000000000000000000000001",
  "sec2": "0000000000000000000000000000000000000000000000000000000000000002",
  "conversation_key": "c41c775356fd92eadc63ff5a0dc1da211b268cbea22316767095b2871ea1412d",
  "nonce": "0000000000000000000000000000000000000000000000000000000000000001",
  "plaintext": "a",
  "payload": "AgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABee0G5VSK0/9YypIObAtDKfYEAjD35uVkHyB0F4DwrcNaCXlCWZKaArsGrY6M9wnuTMxWfp1RTN9Xga8no+kF5Vsb"
}
```

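As an illustration of how the `conversation_key` in such a vector relates to `sec1` and `sec2`, here is a small pure-Python sketch of the derivation. The affine-coordinate secp256k1 arithmetic is written for readability only: it is not constant-time and is no substitute for a vetted library, and all helper names are assumptions made for this example.

```python
import hashlib
import hmac

P = 2**256 - 2**32 - 977  # secp256k1 field prime
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def point_add(p1, p2):
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # point at infinity
    if p1 == p2:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def point_mul(k, point):
    result = None
    while k:  # double-and-add
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result

def lift_x(x):
    """Recover the even-y point for an x-only public key, as in BIP340."""
    if not 0 <= x < P:
        raise ValueError("invalid public key")
    y_sq = (pow(x, 3, P) + 7) % P
    y = pow(y_sq, (P + 1) // 4, P)
    if y * y % P != y_sq:
        raise ValueError("public key is not on the curve")
    return (x, y if y % 2 == 0 else P - y)

def xonly_pubkey(private_key):
    return point_mul(int.from_bytes(private_key, "big"), G)[0].to_bytes(32, "big")

def get_conversation_key(private_key, public_key_x):
    d = int.from_bytes(private_key, "big")
    if not 1 <= d < N:
        raise ValueError("invalid private key")
    shared = point_mul(d, lift_x(int.from_bytes(public_key_x, "big")))
    shared_x = shared[0].to_bytes(32, "big")
    # hkdf_extract(IKM=shared_x, salt='nip44-v2') is HMAC-SHA256 keyed by the salt
    return hmac.new(b"nip44-v2", shared_x, hashlib.sha256).digest()

sec1 = (1).to_bytes(32, "big")
sec2 = (2).to_bytes(32, "big")
print(get_conversation_key(sec1, xonly_pubkey(sec2)).hex())
print(get_conversation_key(sec2, xonly_pubkey(sec1)).hex())
```

Both lines print the same key, since the x coordinate of the shared point does not depend on which side performs the multiplication; the value can be compared with the `conversation_key` field above.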
The file also contains intermediate values. Quick guidance on its usage:

- `valid.get_conversation_key`: calculate conversation_key from secret key sec1 and public key pub2
- `valid.get_message_keys`: calculate chacha_key, chacha_nonce, hmac_key from conversation_key and nonce
- `valid.calc_padded_len`: take unpadded length (first value), calculate padded length (second value)
- `valid.encrypt_decrypt`: emulate a real conversation. Calculate
  pub2 from sec2, verify conversation_key from (sec1, pub2), encrypt, verify payload,
  then calculate pub1 from sec1, verify conversation_key from (sec2, pub1), decrypt, verify plaintext.
- `valid.encrypt_decrypt_long_msg`: same as the previous step, but instead of a full plaintext and payload,
  their checksum is provided.
- `invalid.encrypt_msg_lengths`
- `invalid.get_conversation_key`: calculating conversation_key must throw an error
- `invalid.decrypt`: decrypting message content must throw an error