| author | Alex Gleason <alex@alexgleason.me> | 2026-04-10 13:31:37 -0500 |
|---|---|---|
| committer | Alex Gleason <alex@alexgleason.me> | 2026-04-10 13:31:37 -0500 |
| commit | 5e1e24766910fc07cb61a049aed2623987458ec2 (patch) | |
| tree | b7588f61fddf9374268d5cd6f4e3f2655d7c840a /44.md | |
| parent | b8782df594b4e7e8f088869134908eed58be6078 (diff) | |
| parent | 3465f540e3eaedccb5309711b502f0febf56b52f (diff) | |
Merge nip44-big-payloads into bigger-nip44
Diffstat (limited to '44.md')
| -rw-r--r-- | 44.md | 115 |
1 files changed, 78 insertions, 37 deletions
| @@ -84,12 +84,12 @@ NIP-44 version 2 has the following design characteristics: | |||
| 84 | - Slice 76-byte HKDF output into: `chacha_key` (bytes 0..32), `chacha_nonce` (bytes 32..44), `hmac_key` (bytes 44..76) | 84 | - Slice 76-byte HKDF output into: `chacha_key` (bytes 0..32), `chacha_nonce` (bytes 32..44), `hmac_key` (bytes 44..76) |
| 85 | 4. Add padding | 85 | 4. Add padding |
| 86 | - Content must be encoded from UTF-8 into byte array | 86 | - Content must be encoded from UTF-8 into byte array |
| 87 | - Validate plaintext length. Minimum is 1 byte, maximum is 4294967296 bytes | 87 | - Validate plaintext length. Minimum is 1 byte, maximum is 4,294,967,295 bytes |
| 88 | - Padding format is: `[plaintext_length: u16][plaintext][zero_bytes]` | ||
| 89 | - Padding algorithm is related to powers-of-two, with min padded msg size of 32 bytes | 88 | - Padding algorithm is related to powers-of-two, with min padded msg size of 32 bytes |
| 90 | - Plaintext length is encoded in big-endian: | 89 | - Plaintext length prefix is encoded in big-endian: |
| 91 | - if smaller than 65536, as a u16 in the first 2 bytes of the padded blob; | 90 | - If length is less than 65536: prefix is 2 bytes (`u16`), format is `[plaintext_length: u16][plaintext][zero_bytes]` |
| 92 | - if greater than 65536, the first 6 bytes of the padded blob, the first 2 being zero and the other 4 being the actual encoded length as u32 | 91 | - If length is 65536 or greater: prefix is 6 bytes (2 zero bytes + `u32`), format is `[0x00, 0x00][plaintext_length: u32][plaintext][zero_bytes]` |
| 92 | - A zero value in the first 2 bytes signals the extended format; since valid plaintext is at least 1 byte, a u16 length of 0 is otherwise invalid | ||
| 93 | 5. Encrypt padded content | 93 | 5. Encrypt padded content |
| 94 | - Use ChaCha20, with key and nonce from step 3 | 94 | - Use ChaCha20, with key and nonce from step 3 |
| 95 | 6. Calculate MAC (message authentication code) | 95 | 6. Calculate MAC (message authentication code) |
| @@ -114,8 +114,8 @@ validation rules, refer to BIP-340. | |||
| 114 | 2. Decode base64 | 114 | 2. Decode base64 |
| 115 | - Base64 is decoded into `version, nonce, ciphertext, mac` | 115 | - Base64 is decoded into `version, nonce, ciphertext, mac` |
| 116 | - If the version is unknown, implementations must indicate that the encryption version is not supported | 116 | - If the version is unknown, implementations must indicate that the encryption version is not supported |
| 117 | - Validate length of base64 message to prevent DoS on base64 decoder: it can be in range from 132 to 87472 chars | 117 | - Validate minimum length of base64 message to prevent DoS on base64 decoder: it must be at least 132 chars |
| 118 | - Validate length of decoded message to verify output of the decoder: it can be in range from 99 to 65603 bytes | 118 | - Validate minimum length of decoded message to verify output of the decoder: it must be at least 99 bytes |
| 119 | 3. Calculate conversation key | 119 | 3. Calculate conversation key |
| 120 | - See step 1 of [encryption](#Encryption) | 120 | - See step 1 of [encryption](#Encryption) |
| 121 | 4. Calculate message keys | 121 | 4. Calculate message keys |
| @@ -126,12 +126,23 @@ validation rules, refer to BIP-340. | |||
| 126 | 6. Decrypt ciphertext | 126 | 6. Decrypt ciphertext |
| 127 | - Use ChaCha20 with key and nonce from step 3 | 127 | - Use ChaCha20 with key and nonce from step 3 |
| 128 | 7. Remove padding | 128 | 7. Remove padding |
| 129 | - Read the first 2 bytes, | 129 | - Read the first 2 bytes as a big-endian u16 |
| 130 | - if they're zero, read the next 4 bytes as the u32 big-endian plaintext length; | 130 | - If zero, read the next 4 bytes as a big-endian u32 plaintext length (6-byte prefix total) |
| 131 | - otherwise interpret those 2 bytes as the u16 plaintext length | 131 | - Otherwise, use those 2 bytes as the u16 plaintext length (2-byte prefix total) |
| 132 | - Verify that the length of sliced plaintext matches the value of the two BE bytes | 132 | - Verify that the length of sliced plaintext matches the decoded length |
| 133 | - Verify that calculated padding from step 3 of the [encryption](#Encryption) process matches the actual padding | 133 | - Verify that calculated padding from step 3 of the [encryption](#Encryption) process matches the actual padding |
| 134 | 134 | ||
| 135 | ### Implementation considerations | ||
| 136 | |||
| 137 | The theoretical maximum plaintext size is 2^32 - 1 bytes (~4 GB). Implementations SHOULD enforce | ||
| 138 | their own maximum payload size based on platform and resource constraints, rejecting oversized payloads | ||
| 139 | early in `decode_payload` (before base64 decoding) to prevent denial-of-service. Decryption may require | ||
| 140 | several times the payload size in working memory due to base64 decoding, byte array slicing, and | ||
| 141 | padding operations. For reference, JVM-based systems are limited to ~2 GB contiguous arrays, and mobile | ||
| 142 | devices may have significantly less available memory. Note that `calc_padded_len` can return values up | ||
| 143 | to 2^32, which exceeds the range of unsigned 32-bit integers; implementations must use 64-bit (or | ||
| 144 | larger) arithmetic for padding calculations. | ||
| 145 | |||
| 135 | ### Details | 146 | ### Details |
| 136 | 147 | ||
| 137 | - Cryptographic methods | 148 | - Cryptographic methods |
| @@ -152,6 +163,9 @@ validation rules, refer to BIP-340. | |||
| 152 | - `x[i:j]`, where `x` is a byte array and `i, j <= 0` returns a `(j - i)`-byte array with a copy of the | 163 | - `x[i:j]`, where `x` is a byte array and `i, j <= 0` returns a `(j - i)`-byte array with a copy of the |
| 153 | `i`-th byte (inclusive) to the `j`-th byte (exclusive) of `x`. | 164 | `i`-th byte (inclusive) to the `j`-th byte (exclusive) of `x`. |
| 154 | - Constants `c`: | 165 | - Constants `c`: |
| 166 | - `min_plaintext_size` is 1. 1 byte msg is padded to 32 bytes. | ||
| 167 | - `max_plaintext_size` is 4294967295 (2^32 - 1). | ||
| 168 | - `extended_prefix_threshold` is 65536. Lengths below this use a 2-byte u16 prefix; lengths at or above use a 6-byte prefix (2 zero bytes + u32). | ||
| 155 | - Functions | 169 | - Functions |
| 156 | - `base64_encode(string)` and `base64_decode(bytes)` are Base64 ([RFC 4648](https://datatracker.ietf.org/doc/html/rfc4648), with padding) | 170 | - `base64_encode(string)` and `base64_decode(bytes)` are Base64 ([RFC 4648](https://datatracker.ietf.org/doc/html/rfc4648), with padding) |
| 157 | - `concat` refers to byte array concatenation | 171 | - `concat` refers to byte array concatenation |
| @@ -159,6 +173,9 @@ validation rules, refer to BIP-340. | |||
| 159 | - `utf8_encode(string)` and `utf8_decode(bytes)` transform string to byte array and back | 173 | - `utf8_encode(string)` and `utf8_decode(bytes)` transform string to byte array and back |
| 160 | - `write_u8(number)` restricts number to values 0..255 and encodes into Big-Endian uint8 byte array | 174 | - `write_u8(number)` restricts number to values 0..255 and encodes into Big-Endian uint8 byte array |
| 161 | - `write_u16_be(number)` restricts number to values 0..65535 and encodes into Big-Endian uint16 byte array | 175 | - `write_u16_be(number)` restricts number to values 0..65535 and encodes into Big-Endian uint16 byte array |
| 176 | - `write_u32_be(number)` restricts number to values 0..4294967295 and encodes into Big-Endian uint32 byte array | ||
| 177 | - `read_uint16_be(bytes)` reads 2 bytes as a Big-Endian unsigned 16-bit integer | ||
| 178 | - `read_uint32_be(bytes)` reads 4 bytes as a Big-Endian unsigned 32-bit integer | ||
| 162 | - `zeros(length)` creates byte array of length `length >= 0`, filled with zeros | 179 | - `zeros(length)` creates byte array of length `length >= 0`, filled with zeros |
| 163 | - `floor(number)` and `log2(number)` are well-known mathematical methods | 180 | - `floor(number)` and `log2(number)` are well-known mathematical methods |
| 164 | 181 | ||
| @@ -178,51 +195,51 @@ def calc_padded_len(unpadded_len): | |||
| 178 | if unpadded_len <= 32: | 195 | if unpadded_len <= 32: |
| 179 | return 32 | 196 | return 32 |
| 180 | else: | 197 | else: |
| 181 | return chunk * (floor((len - 1) / chunk) + 1) | 198 | return chunk * (floor((unpadded_len - 1) / chunk) + 1) |
| 182 | 199 | ||
| 183 | # Converts unpadded plaintext to padded bytearray | 200 | # Converts unpadded plaintext to padded bytearray |
| 184 | def pad(plaintext): | 201 | def pad(plaintext): |
| 185 | unpadded = utf8_encode(plaintext) | 202 | unpadded = utf8_encode(plaintext) |
| 186 | unpadded_len = len(plaintext) | 203 | unpadded_len = len(unpadded) |
| 187 | if (unpadded_len < 1 or | 204 | if (unpadded_len < c.min_plaintext_size or |
| 188 | unpadded_len > 4294967295): raise Exception('invalid plaintext length') | 205 | unpadded_len > c.max_plaintext_size): raise Exception('invalid plaintext length') |
| 189 | if unpadded_len > 65536: | 206 | if unpadded_len >= c.extended_prefix_threshold: |
| 190 | prefix = concat( | 207 | prefix = concat([0, 0], write_u32_be(unpadded_len)) # 6 bytes |
| 191 | [0, 0], | ||
| 192 | write_u32_be(unpadded_len), | ||
| 193 | ) | ||
| 194 | else: | 208 | else: |
| 195 | prefix = write_u16_be(unpadded_len) | 209 | prefix = write_u16_be(unpadded_len) # 2 bytes |
| 196 | suffix = zeros(calc_padded_len(unpadded_len) - unpadded_len) | 210 | suffix = zeros(calc_padded_len(unpadded_len) - unpadded_len) |
| 197 | return concat(prefix, unpadded, suffix) | 211 | return concat(prefix, unpadded, suffix) |
| 198 | 212 | ||
| 199 | # Converts padded bytearray to unpadded plaintext | 213 | # Converts padded bytearray to unpadded plaintext |
| 200 | def unpad(padded): | 214 | def unpad(padded): |
| 201 | unpadded_len = read_uint16_be(padded[0:2]) | 215 | first_two = read_uint16_be(padded[0:2]) |
| 202 | if unpadded_len == 0: | 216 | if first_two == 0: |
| 203 | unpadded_len = read_uint32_be(padded[2:6]) | 217 | unpadded_len = read_uint32_be(padded[2:6]) |
| 204 | unpadded = padded[6:6+unpadded_len] | 218 | if unpadded_len < c.extended_prefix_threshold: raise Exception('invalid padding') |
| 219 | prefix_len = 6 | ||
| 205 | else: | 220 | else: |
| 206 | unpadded = padded[2:2+unpadded_len] | 221 | unpadded_len = first_two |
| 207 | 222 | prefix_len = 2 | |
| 223 | unpadded = padded[prefix_len:prefix_len+unpadded_len] | ||
| 208 | if (unpadded_len == 0 or | 224 | if (unpadded_len == 0 or |
| 209 | len(unpadded) != unpadded_len or | 225 | len(unpadded) != unpadded_len or |
| 210 | len(padded) != 2 + calc_padded_len(unpadded_len)): raise Exception('invalid padding') | 226 | len(padded) != prefix_len + calc_padded_len(unpadded_len)): raise Exception('invalid padding') |
| 211 | return utf8_decode(unpadded) | 227 | return utf8_decode(unpadded) |
| 212 | 228 | ||
| 213 | # metadata: always 65b (version: 1b, nonce: 32b, max: 32b) | 229 | # metadata: always 65b (version: 1b, nonce: 32b, mac: 32b) |
| 214 | # plaintext: 1b to 0xffff | 230 | # plaintext: 1b to 0xffffffff |
| 215 | # padded plaintext: 32b to 0xffff | 231 | # padded plaintext (small, <65536): 32b to 0x10000, with 2b prefix -> 34b to 0x10000+2 |
| 216 | # ciphertext: 32b+2 to 0xffff+2 | 232 | # padded plaintext (large, >=65536): 0x10000 to 0x100000000, with 6b prefix -> 0x10006 to 0x100000000+6 |
| 217 | # raw payload: 99 (65+32+2) to 65603 (65+0xffff+2) | 233 | # ciphertext: same as padded plaintext (chacha20 doesn't change length) |
| 218 | # compressed payload (base64): 132b to 87472b | 234 | # raw payload (small): 99 (65+34) to 65603 (65+0x10000+2) |
| 235 | # raw payload (large): 65607 (65+0x10006) to 4294967367 (65+0x100000000+6) | ||
| 219 | def decode_payload(payload): | 236 | def decode_payload(payload): |
| 220 | plen = len(payload) | 237 | plen = len(payload) |
| 221 | if plen == 0 or payload[0] == '#': raise Exception('unknown version') | 238 | if plen == 0 or payload[0] == '#': raise Exception('unknown version') |
| 222 | if plen < 132 or plen > 87472: raise Exception('invalid payload size') | 239 | if plen < 132: raise Exception('invalid payload size') |
| 223 | data = base64_decode(payload) | 240 | data = base64_decode(payload) |
| 224 | dlen = len(d) | 241 | dlen = len(data) |
| 225 | if dlen < 99 or dlen > 65603: raise Exception('invalid data size'); | 242 | if dlen < 99: raise Exception('invalid data size'); |
| 226 | vers = data[0] | 243 | vers = data[0] |
| 227 | if vers != 2: raise Exception('unknown version ' + vers) | 244 | if vers != 2: raise Exception('unknown version ' + vers) |
| 228 | nonce = data[1:33] | 245 | nonce = data[1:33] |
| @@ -308,3 +325,27 @@ The file also contains intermediate values. A quick guidance with regards to its | |||
| 308 | - `invalid.encrypt_msg_lengths` | 325 | - `invalid.encrypt_msg_lengths` |
| 309 | - `invalid.get_conversation_key`: calculating conversation_key must throw an error | 326 | - `invalid.get_conversation_key`: calculating conversation_key must throw an error |
| 310 | - `invalid.decrypt`: decrypting message content must throw an error | 327 | - `invalid.decrypt`: decrypting message content must throw an error |
| 328 | |||
| 329 | #### Extended length prefix test vectors | ||
| 330 | |||
| 331 | The following test vectors exercise the boundary between the 2-byte u16 prefix and the 6-byte | ||
| 332 | extended prefix. Since the payloads are too large to include inline, SHA-256 checksums of the | ||
| 333 | plaintext and base64-encoded payload are provided (following the `encrypt_decrypt_long_msg` pattern). | ||
| 334 | |||
| 335 | All vectors use the same `conversation_key` and `nonce` as above. Plaintext is the byte `0x61` | ||
| 336 | (`'a'`) repeated to the specified length. | ||
| 337 | |||
| 338 | ``` | ||
| 339 | conversation_key: c41c775356fd92eadc63ff5a0dc1da211b268cbea22316767095b2871ea1412d | ||
| 340 | nonce: 0000000000000000000000000000000000000000000000000000000000000001 | ||
| 341 | ``` | ||
| 342 | |||
| 343 | | plaintext_len | prefix | padded_len | plaintext_sha256 | payload_sha256 | | ||
| 344 | |---|---|---|---|---| | ||
| 345 | | 65535 | u16 (2 bytes) | 65536 | `6e1bebca6a8229364a162a72ef064826c4cd7457bf54f190ef782bd9deff3e42` | `6d8c2810d1e870fbaa1f0a0937126cca837a15f9260e27060c331d70a3c0bc84` | | ||
| 346 | | 65536 | extended (6 bytes) | 65536 | `bf718b6f653bebc184e1479f1935b8da974d701b893afcf49e701f3e2f9f9c5a` | `b7b4edb36ba92e267d322d56d9aebc22e7fa96ff52e3c12adc07f07a43cbc616` | | ||
| 347 | | 65537 | extended (6 bytes) | 81920 | `008ffc88d3c96a9f307524eb361e47c5222a887fc45fa0c1fb8d429c5c23b430` | `eeb7c7c5373894ea2c1547cfd3ccb15d5a0b2d619da852e5c79df792dcc9e435` | | ||
| 348 | |||
| 349 | Note that 65535 and 65536 both have a `padded_len` of 65536, but the total padded-with-prefix | ||
| 350 | sizes differ: 65538 (2 + 65536) vs 65542 (6 + 65536). The jump to 65537 triggers the next | ||
| 351 | padding bucket at 81920. | ||
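
The padding rules on the new side of this diff can be sketched as runnable Python. This is a minimal illustration, not the reference implementation: the constant names mirror `c.min_plaintext_size`, `c.max_plaintext_size`, and `c.extended_prefix_threshold` from the diff, while the `next_power`/`chunk` part of `calc_padded_len` is assumed from the unchanged portion of the spec that this diff does not show. The explicit short-input checks in `unpad` are also an addition for safety in Python, not something the pseudocode states.

```python
import struct

# Constants mirroring the `c.*` names introduced in the diff.
MIN_PLAINTEXT_SIZE = 1
MAX_PLAINTEXT_SIZE = 2**32 - 1
EXTENDED_PREFIX_THRESHOLD = 65536


def calc_padded_len(unpadded_len: int) -> int:
    """Power-of-two-related padding; Python ints avoid the 32-bit overflow."""
    if unpadded_len <= 32:
        return 32
    # Assumed from the unchanged spec text: 1 << (floor(log2(n - 1)) + 1).
    next_power = 1 << (unpadded_len - 1).bit_length()
    chunk = 32 if next_power <= 256 else next_power // 8
    return chunk * ((unpadded_len - 1) // chunk + 1)


def pad(plaintext: str) -> bytes:
    unpadded = plaintext.encode("utf-8")
    n = len(unpadded)  # byte length, not character count
    if n < MIN_PLAINTEXT_SIZE or n > MAX_PLAINTEXT_SIZE:
        raise ValueError("invalid plaintext length")
    if n >= EXTENDED_PREFIX_THRESHOLD:
        prefix = b"\x00\x00" + struct.pack(">I", n)  # 6-byte extended prefix
    else:
        prefix = struct.pack(">H", n)  # 2-byte u16 prefix
    return prefix + unpadded + bytes(calc_padded_len(n) - n)


def unpad(padded: bytes) -> str:
    if len(padded) < 2:
        raise ValueError("invalid padding")
    (first_two,) = struct.unpack(">H", padded[0:2])
    if first_two == 0:
        if len(padded) < 6:
            raise ValueError("invalid padding")
        (n,) = struct.unpack(">I", padded[2:6])
        # Lengths below the threshold must use the short prefix.
        if n < EXTENDED_PREFIX_THRESHOLD:
            raise ValueError("invalid padding")
        prefix_len = 6
    else:
        n = first_two
        prefix_len = 2
    unpadded = padded[prefix_len:prefix_len + n]
    if len(unpadded) != n or len(padded) != prefix_len + calc_padded_len(n):
        raise ValueError("invalid padding")
    return unpadded.decode("utf-8")
```

Under these assumptions, padding `'a'` repeated 65535, 65536, and 65537 times yields prefix-plus-padded sizes of 65538, 65542, and 81926 bytes, which agrees with the `padded_len` column in the test vector table above. The sketch covers padding only; it does not reproduce the `payload_sha256` values, which also depend on encryption and MAC calculation.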