-
Notifications
You must be signed in to change notification settings - Fork 84
/
270-newhope-hybrid-handshake.txt
764 lines (595 loc) · 30.9 KB
/
270-newhope-hybrid-handshake.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
Filename: 270-newhope-hybrid-handshake.txt
Title: RebelAlliance: A Post-Quantum Secure Hybrid Handshake Based on NewHope
Author: Isis Lovecruft, Peter Schwabe
Created: 16 Apr 2016
Updated: 22 Jul 2016
Status: Obsolete
Depends: prop#220 prop#249 prop#264 prop#270
§0. Introduction
RebelAlliance is a post-quantum secure hybrid handshake, comprised of an
alliance between the X25519 and NewHope key exchanges.
NewHope is a post-quantum-secure lattice-based key-exchange protocol based
on the ring-learning-with-errors (Ring-LWE) problem. We propose a hybrid
handshake for Tor, based on a combination of Tor's current NTor handshake
and a shared key derived through a NewHope ephemeral key exchange.
For further details on the NewHope key exchange, the reader is referred to
"Post-quantum key exchange - a new hope" by Alkim, Ducas, Pöppelmann, and
Schwabe [0][1].
For the purposes of brevity, we consider that NTor is currently the only
handshake protocol in Tor; the older TAP protocol is ignored completely, due
to the fact that it is currently deprecated and nearly entirely unused.
§1. Motivation
An attacker currently monitoring and storing circuit-layer NTor handshakes
who later has the ability to run Shor's algorithm on a quantum computer will
be able to break Tor's current handshake protocol and decrypt previous
communications.
It is unclear if and when such attackers equipped with large quantum
computers will exist, but various estimates by researchers in quantum
physics and quantum engineering give estimates of only 1 to 2 decades.
Clearly, the security requirements of many Tor users include secrecy of
their messages beyond this time span, which means that Tor needs to update
the key exchange to protect against such attackers as soon as possible.
§2. Design
An initiator and responder, in parallel, conduct two handshakes:
- An X25519 key exchange, as described in the description of the NTor
handshake in Tor proposal #216.
- A NewHope key exchange.
The shared keys derived from these two handshakes are then concatenated and
used as input to the SHAKE-256 extendable output function (XOF), as described
in FIPS-PUB-202 [2], in order to produce a shared key of the desired length.
The testvectors in §C assume that this key has a length of 32 bytes, but the
use of a XOF allows arbitrary lengths to easily support future updates of
the symmetric primitives using the key. See also §3.3.1.
§3. Specification
§3.1. Notation
Let `a || b` be the concatenation of a with b.
Let `a^b` denote the exponentiation of a to the bth power.
Let `a == b` denote the equality of a with b, and vice versa.
Let `a := b` be the assignment of the value of b to the variable a.
Let `H(x)` be 32-bytes of output of the SHAKE-256 XOF (as described in
FIPS-PUB-202) applied to message x.
Let X25519 refer to the curve25519-based key agreement protocol described
in RFC7748 §6.1. [3]
Let `EXP(a, b) == X25519(., b, a)` with `g == 9`. Let X25519_KEYGEN() do
the appropriate manipulations when generating the secret key (clearing the
low bits, twidding the high bits). Additionally, EXP() MUST include the
check for all-zero output due to the input point being of small
order (cf. RFC7748 §6).
Let `X25519_KEYID(B) == B` where B is a valid X25519 public key.
When representing an element of the Curve25519 subgroup as a byte string,
use the standard (32-byte, little-endian, x-coordinate-only) representation
for Curve25519 points.
Let `ID` be a router's identity key taken from the router microdescriptor.
In the case for relays possessing Ed25519 identity keys (cf. Tor proposal
#220), this is a 32-byte string representing the public Ed25519 identity key.
For backwards and forwards compatibility with routers which do not possess
Ed25519 identity keys, this is a 32-byte string created via the output of
H(ID).
We refer to the router as the handshake "responder", and the client (which
may be an OR or an OP) as the "initiator".
ID_LENGTH [32 bytes]
H_LENGTH [32 bytes]
G_LENGTH [32 bytes]
PROTOID := "pqtor-x25519-newhope-shake256-1"
T_MAC := PROTOID || ":mac"
T_KEY := PROTOID || ":key_extract"
T_VERIFY := PROTOID || ":verify"
(X25519_SK, X25519_PK) := X25519_KEYGEN()
§3.2. Protocol
========================================================================================
| |
| Fig. 1: The NewHope-X25519 Hybrid Handshake. |
| |
| Before the handshake the Initiator is assumed to know Z, a public X25519 key for |
| the Responder, as well as the Responder's ID. |
----------------------------------------------------------------------------------------
| |
| Initiator Responder |
| |
| SEED := H(randombytes(32)) |
| x, X := X25519_KEYGEN() |
| a, A := NEWHOPE_KEYGEN(SEED) |
| CLIENT_HDATA := ID || Z || X || A |
| |
| --- CLIENT_HDATA ---> |
| |
| y, Y := X25519_KEYGEN() |
| NTOR_KEY, AUTH := NTOR_SHAREDB(X,y,Y,z,Z,ID,B) |
| M, NEWHOPE_KEY := NEWHOPE_SHAREDB(A) |
| SERVER_HDATA := Y || AUTH || M |
| sk := SHAKE-256(NTOR_KEY || NEWHOPE_KEY) |
| |
| <-- SERVER_HDATA ---- |
| |
| NTOR_KEY := NTOR_SHAREDA(x, X, Y, Z, ID, AUTH) |
| NEWHOPE_KEY := NEWHOPE_SHAREDA(M, a) |
| sk := SHAKE-256(NTOR_KEY || NEWHOPE_KEY) |
| |
========================================================================================
§3.2.1. The NTor Handshake
§3.2.1.1. Prologue
Take a router with identity ID. As setup, the router generates a secret key z,
and a public onion key Z with:
z, Z := X25519_KEYGEN()
The router publishes Z in its server descriptor in the "ntor-onion-key" entry.
Henceforward, we refer to this router as the "responder".
§3.2.1.2. Initiator
To send a create cell, the initiator generates a keypair:
x, X := X25519_KEYGEN()
and creates the NTor portion of a CREATE2V cell's HDATA section:
CLIENT_NTOR := ID || Z || X [96 bytes]
The initiator includes the responder's ID and Z in the CLIENT_NTOR so that, in
the event the responder OR has recently rotated keys, the responder can
determine which keypair to use.
The initiator then concatenates CLIENT_NTOR with CLIENT_NEWHOPE (see §3.2.2),
to create CLIENT_HDATA, and creates and sends a CREATE2V cell (see §A.1)
to the responder.
CLIENT_NEWHOPE [1824 bytes] (see §3.2.2)
CLIENT_HDATA := CLIENT_NTOR || CLIENT_NEWHOPE [1920 bytes]
If the responder does not respond with a CREATED2V cell, the initiator SHOULD
NOT attempt to extend the circuit through the responder by sending fragmented
EXTEND2 cells, since the responder's lack of support for CREATE2V cells is
assumed to imply the responder also lacks support for fragmented EXTEND2
cells. Alternatively, for initiators with a sufficiently late consensus
method, the initiator MUST check that "proto" line in the responder's
descriptor (cf. Tor proposal #264) advertises support for the "Relay"
subprotocol version 3 (see §5).
§3.2.1.3. Responder
The responder generates a keypair of y, Y = X25519_KEYGEN(), and does
NTOR_SHAREDB() as follows:
(NTOR_KEY, AUTH) ← NTOR_SHAREDB(X, y, Y, z, Z, ID, B):
secret_input := EXP(X, y) || EXP(X, z) || ID || B || Z || Y || PROTOID
NTOR_KEY := H(secret_input, T_KEY)
verify := H(secret_input, T_VERIFY)
auth_input := verify || ID || Z || Y || X || PROTOID || "Server"
AUTH := H(auth_input, T_MAC)
The responder sends a CREATED2V cell containing:
SERVER_NTOR := Y || AUTH [64 bytes]
SERVER_NEWHOPE [2048 bytes] (see §3.2.2)
SERVER_HDATA := SERVER_NTOR || SERVER_NEWHOPE [2112 bytes]
and sends this to the initiator.
§3.2.1.4. Finalisation
The initiator then checks Y is in G^* [see NOTE below], and does
NTOR_SHAREDA() as follows:
(NTOR_KEY) ← NTOR_SHAREDA(x, X, Y, Z, ID, AUTH)
secret_input := EXP(Y, x) || EXP(Z, x) || ID || Z || X || Y || PROTOID
NTOR_KEY := H(secret_input, T_KEY)
verify := H(secret_input, T_VERIFY)
auth_input := verify || ID || Z || Y || X || PROTOID || "Server"
if AUTH == H(auth_input, T_MAC)
return NTOR_KEY
Both parties now have a shared value for NTOR_KEY. They expand this into
the keys needed for the Tor relay protocol.
[XXX We think we want to omit the final hashing in the production of NTOR_KEY
here, and instead put all the inputs through SHAKE-256. --isis, peter]
[XXX We probably want to remove ID and B from the input to the shared key
material, since they serve for authentication but, as pre-established
"prologue" material to the handshake, they should not be used in attempts to
strengthen the cryptographic suitability of the shared key. Also, their
inclusion is implicit in the DH exponentiations. I should probably ask Ian
about the reasoning for the original design choice. --isis]
§3.2.2. The NewHope Handshake
§3.2.2.1. Parameters & Mathematical Structures
Let ℤ be the ring of rational integers. Let ℤq, for q ≥ 1, denote the quotient
ring ℤ/qℤ. We define R = ℤ[X]/((X^n)+1) as the ring of integer polynomials
modulo ((X^n)+1), and Rq = ℤq[X]/((X^n)+1) as the ring of integer polynomials
modulo ((X^n)+1) where each coefficient is reduced modulo q. When we refer to
a polynomial, we mean an element of Rq.
n := 1024
q := 12289
SEED [32 Bytes]
NEWHOPE_POLY [1792 Bytes]
NEWHOPE_REC [256 Bytes]
NEWHOPE_KEY [32 Bytes]
NEWHOPE_MSGA := (NEWHOPE_POLY || SEED)
NEWHOPE_MSGB := (NEWHOPE_POLY || NEWHOPE_REC)
§3.2.2.2. High-level Description of Newhope API Functions
For a description of internal functions, see §B.
(NEWHOPE_POLY, NEWHOPE_MSGA) ← NEWHOPE_KEYGEN(SEED):
â := gen_a(seed)
s := poly_getnoise()
e := poly_getnoise()
ŝ := poly_ntt(s)
ê := poly_ntt(e)
b̂ := pointwise(â, ŝ) + ê
sp := poly_tobytes(ŝ)
bp := poly_tobytes(b̂)
return (sp, (bp || seed))
(NEWHOPE_MSGB, NEWHOPE_KEY) ← NEWHOPE_SHAREDB(NEWHOPE_MSGA):
s' := poly_getnoise()
e' := poly_getnoise()
e" := poly_getnoise()
b̂ := poly_frombytes(bp)
â := gen_a(seed)
ŝ' := poly_ntt(s')
ê' := poly_ntt(e')
û := poly_pointwise(â, ŝ') + ê'
v := poly_invntt(poly_pointwise(b̂,ŝ')) + e"
r := helprec(v)
up := poly_tobytes(û)
k := rec(v, r)
return ((up || r), k)
NEWHOPE_KEY ← NEWHOPE_SHAREDA(NEWHOPE_MSGB, NEWHOPE_POLY):
û := poly_frombytes(up)
ŝ := poly_frombytes(sp)
v' := poly_invntt(poly_pointwise(û, ŝ))
k := rec(v', r)
return k
When a client uses a SEED within a CREATE2V cell, the client SHOULD NOT use
that SEED in any other CREATE2V or EXTEND2 cells. See §4 for further
discussion.
§3.3. Key Expansion
The client and server derive a shared key, SHARED, by:
HKDFID := "THESE ARENT THE DROIDS YOURE LOOKING FOR"
SHARED := SHAKE_256(HKDFID || NTorKey || NewHopeKey)
§3.3.1. Note on the Design Choice
The reader may wonder why one would use SHAKE-256 to produce a 256-bit
output, since the security strength in bits for SHAKE-256 is min(d/2,256)
for collision resistance and min(d,256) for first- and second-order
preimages, where d is the output length.
The reasoning is that we should be aiming for 256-bit security for all of
our symmetric cryptography. One could then argue that we should just use
SHA3-256 for the KDF. We choose SHAKE-256 instead in order to provide an
easy way to derive longer shared secrets in the future without requiring a
new handshake. The construction is odd, but the future is bright.
As we are already using SHAKE-256 for the 32-byte output hash, we are also
using it for all other 32-byte hashes involved in the protocol. Note that
the only difference between SHA3-256 and SHAKE-256 with 32-byte output is
one domain-separation byte.
[XXX why would you want 256-bit security for the symmetric side? Are you
talking pre- or post-quantum security? --peter]
§4. Security & Anonymity Implications
This handshake protocol is one-way authenticated. That is, the server is
authenticated, while the client remains anonymous.
The client MUST NOT cache and reuse SEED. Doing so gives non-trivial
adversarial advantages w.r.t. all-for-the-price-of-one attacks during the
caching period. More importantly, if the SEED used to generate NEWHOPE_MSGA
is reused for handshakes along the same circuit or multiple different
circuits, an adversary conducting a sybil attack somewhere along the path(s)
will be able to correlate the identity of the client across circuits or
hops.
§5. Compatibility
Because our proposal requires both the client and server to send more than
the 505 bytes possible within a CREATE2 cell's HDATA section, it depends
upon the implementation of a mechanism for allowing larger CREATE cells
(cf. Tor proposal #249).
We reserve the following handshake type for use in CREATE2V/CREATED2V and
EXTEND2V/EXTENDED2V cells:
0x0003 [NEWHOPE + X25519 HYBRID HANDSHAKE]
We introduce a new sub-protocol number, "Relay=3", (cf. Tor proposal #264
§5.3) to signify support this handshake, and hence for the CREATE2V and
fragmented EXTEND2 cells which it requires.
There are no additional entries or changes required within either router
descriptors or microdescriptors to support this handshake method, due to the
NewHope keys being ephemeral and derived on-the-fly, and due to the NTor X25519
public keys already being included within the "ntor-onion-key" entry.
Add a "UseNewHopeKEX" configuration option and a corresponding consensus
parameter to control whether clients prefer using this NewHope hybrid
handshake or some previous handshake protocol. If the configuration option
is "auto", clients SHOULD obey the consensus parameter. The default
configuration SHOULD be "auto" and the consensus value SHOULD initially be "0".
§6. Implementation
The paper by Alkim, Ducas, Pöppelmann and Schwabe describes two software
implementations of NewHope, one C reference implementation and an optimized
implementation using AVX2 vector instructions. Those implementations are
available at [1].
Additionally, there are implementations in Go by Yawning Angel, available
from [4] and in Rust by Isis Lovecruft, available from [5].
The software used to generate the test vectors in §C is based on the C
reference implementation and available from:
https://code.ciph.re/isis/newhope-tor-testvectors
https://github.com/isislovecruft/newhope-tor-testvectors
§7. Performance & Scalability
The computationally expensive part in the current NTor handshake is the
X25519 key-pair generation and the X25519 shared-key computation. The
current implementation in Tor is a wrapper to support various highly optimized
implementations on different architectures. On Intel Haswell processors, the
fastest implementation of X25519, as reported by the eBACS benchmarking
project [6], takes 169920 cycles for key-pair generation and 161648 cycles
for shared-key computation; these add up to a total of 331568 cycles on each
side (initiator and responder).
The C reference implementation of NewHope, also benchmarked on Intel
Haswell, takes 358234 cycles for the initiator and 402058 cycles for the
Responder. The core computation of the proposed combination of NewHope and
X25519 will thus mean a slowdown of about a factor of 2.1 for the Initiator
and a slowdown by a factor of 2.2 for the Responder compared to the current
NTor handshake. These numbers assume a fully optimized implementation of the
NTor handshake and a C reference implementation of NewHope. With optimized
implementations of NewHope, such as the one for Intel Haswell described in
[0], the computational slowdown will be considerably smaller than a factor
of 2.
§8. References
[0]: https://cryptojedi.org/papers/newhope-20160328.pdf
[1]: https://cryptojedi.org/crypto/#newhope
[2]: http://www.nist.gov/customcf/get_pdf.cfm?pub_id=919061
[3]: https://tools.ietf.org/html/rfc7748#section-6.1
[4]: https://github.com/Yawning/newhope
[5]: https://code.ciph.re/isis/newhopers
[6]: http://bench.cr.yp.to
§A. Cell Formats
§A.1. CREATE2V Cells
The client portion of the handshake should send CLIENT_HDATA, formatted
into a CREATE2V cell as follows:
CREATE2V { [2114 bytes]
HTYPE := 0x0003 [2 bytes]
HLEN := 0x0780 [2 bytes]
HDATA := CLIENT_HDATA [1920 bytes]
IGNORED := 0x00 [194 bytes]
}
[XXX do we really want to pad with IGNORED to make CLIENT_HDATA the
same number of bytes as SERVER_HDATA? --isis]
§A.2. CREATED2V Cells
The server responds to the client's CREATE2V cell with SERVER_HDATA,
formatted into a CREATED2V cell as follows:
CREATED2V { [2114 bytes]
HLEN := 0x0800 [2 bytes]
HDATA := SERVER_HDATA [2112 bytes]
IGNORED := 0x00 [0 bytes]
}
§A.3. Fragmented EXTEND2 Cells
When the client wishes to extend a circuit, the client should fragment
CLIENT_HDATA into four EXTEND2 cells:
EXTEND2 {
NSPEC := 0x02 { [1 byte]
LINK_ID_SERVER [22 bytes] XXX
LINK_ADDRESS_SERVER [8 bytes] XXX
}
HTYPE := 0x0003 [2 bytes]
HLEN := 0x0780 [2 bytes]
HDATA := CLIENT_HDATA[0,461] [462 bytes]
}
EXTEND2 {
NSPEC := 0x00 [1 byte]
HTYPE := 0xFFFF [2 bytes]
HLEN := 0x0000 [2 bytes]
HDATA := CLIENT_HDATA[462,954] [492 bytes]
}
EXTEND2 {
NSPEC := 0x00 [1 byte]
HTYPE := 0xFFFF [2 bytes]
HLEN := 0x0000 [2 bytes]
HDATA := CLIENT_HDATA[955,1447] [492 bytes]
}
EXTEND2 {
NSPEC := 0x00 [1 byte]
HTYPE := 0xFFFF [2 bytes]
HLEN := 0x0000 [2 bytes]
HDATA := CLIENT_HDATA[1448,1919] || 0x00[20] [492 bytes]
}
EXTEND2 {
NSPEC := 0x00 [1 byte]
HTYPE := 0xFFFF [2 bytes]
HLEN := 0x0000 [2 bytes]
HDATA := 0x00[172] [172 bytes]
}
The client sends this to the server to extend the circuit from, and that
server should format the fragmented EXTEND2 cells into a CREATE2V cell, as
described in §A.1.
§A.4. Fragmented EXTENDED2 Cells
EXTENDED2 {
NSPEC := 0x02 { [1 byte]
LINK_ID_SERVER [22 bytes] XXX
LINK_ADDRESS_SERVER [8 bytes] XXX
}
HTYPE := 0x0003 [2 bytes]
HLEN := 0x0800 [2 bytes]
HDATA := SERVER_HDATA[0,461] [462 bytes]
}
EXTENDED2 {
NSPEC := 0x00 [1 byte]
HTYPE := 0xFFFF [2 bytes]
HLEN := 0x0000 [2 bytes]
HDATA := SERVER_HDATA[462,954] [492 bytes]
}
EXTEND2 {
NSPEC := 0x00 [1 byte]
HTYPE := 0xFFFF [2 bytes]
HLEN := 0x0000 [2 bytes]
HDATA := SERVER_HDATA[955,1447] [492 bytes]
}
EXTEND2 {
NSPEC := 0x00 [1 byte]
HTYPE := 0xFFFF [2 bytes]
HLEN := 0x0000 [2 bytes]
HDATA := SERVER_HDATA[1448,1939] [492 bytes]
}
EXTEND2 {
NSPEC := 0x00 [1 byte]
HTYPE := 0xFFFF [2 bytes]
HLEN := 0x0000 [2 bytes]
HDATA := SERVER_HDATA[1940,2112] [172 bytes]
}
§B. NewHope Internal Functions
gen_a(SEED): returns a uniformly random poly
poly_getnoise(): returns a poly sampled from a centered binomial
poly_ntt(poly): number-theoretic transform; returns a poly
poly_invntt(poly): inverse number-theoretic transform; returns a poly
poly_pointwise(poly, poly): pointwise multiplication; returns a poly
poly_tobytes(poly): packs a poly to a NEWHOPE_POLY byte array
poly_frombytes(NEWHOPE_POLY): unpacks a NEWHOPE_POLY byte array to a poly
helprec(poly): returns a NEWHOPE_REC byte array
rec(poly, NEWHOPE_REC): returns a NEWHOPE_KEY
--- Description of the Newhope internal functions ---
gen_a(SEED seed) receives as input a 32-byte (public) seed. It expands
this seed through SHAKE-128 from the FIPS202 standard. The output of SHAKE-128
is considered a sequence of 16-bit little-endian integers. This sequence is
used to initialize the coefficients of the returned polynomial from the least
significant (coefficient of X^0) to the most significant (coefficient of
X^1023) coefficient. For each of the 16-bit integers first eliminate the
highest two bits (to make it a 14-bit integer) and then use it as the next
coefficient if it is smaller than q=12289.
Note that the amount of output required from SHAKE to initialize all 1024
coefficients of the polynomial varies depending on the input seed.
Note further that this function does not process any secret data and thus does
not need any timing-attack protection.
poly_getnoise() first generates 4096 bytes of uniformly random data. This can
be done by reading these bytes from the system's RNG; efficient
implementations will typically only read a 32-byte seed from the system's RNG
and expand it through some fast PRG (for example, ChaCha20 or AES-256 in CTR
mode). The output of the PRG is considered an array of 2048 16-bit integers
r[0],...,r[2047]. The coefficients of the output polynomial are computed as
HW(r[0])-HW(r[1]), HW(r[2])-HW(r[3]),...,HW(r[2046])-HW(r[2047]), where HW
stands for Hamming weight.
Note that the choice of RNG is a local decision; different implementations are
free to use different RNGs.
Note further that the output of this function is secret; the PRG (and the
computation of HW) need to be protected against timing attacks.
poly_ntt(poly f): For a mathematical description of poly_ntt see the [0]; a
pseudocode description of a very naive in-place transformation of an input
polynomial f = f[0] + f[1]*X + f[2]*X^2 + ... + f[1023]*X^1023 is the
following code (all arithmetic on coefficients performed modulo q):
psi = 7
omega = 49
for i in range(0,n):
t[i] = f[i] * psi^i
for i in range(0,n):
f[i] = 0
for j in range(0,n):
f[i] += t[j] * omega^((i*j)%n)
Note that this is not how poly_ntt should be implemented if performance is
an issue; in particular, efficient algorithms for the number-theoretic
transform take time O(n*log(n)) and not O(n^2)
Note further that all arithmetic in poly_ntt has to be protected against
timing attacks.
poly_invntt(poly f): For a mathematical description of poly_invntt see the
[0]; a pseudocode description of a very naive in-place transformation of an
input polynomial f = f[0] + f[1]*X + f[2]*X^2 + ... + f[1023]*X^1023 is the
following code (all arithmetic on coefficients performed modulo q):
invpsi = 8778;
invomega = 1254;
invn = 12277;
for i in range(0,n):
t[i] = f[i];
for i in range(0,n):
f[i]=0;
for j in range(0,n):
f[i] += t[j] * invomega^((i*j)%n)
f[i] *= invpsi^i
f[i] *= invn
Note that this is not how poly_invntt should be implemented if performance
is an issue; in particular, efficient algorithms for the inverse
number-theoretic transform take time O(n*log(n)) and not O(n^2)
Note further that all arithmetic in poly_invntt has to be protected against
timing attacks.
poly_pointwise(poly f, poly g) performs pointwise multiplication of the two
polynomials. This means that for f = (f0 + f1*X + f2*X^2 + ... +
f1023*X^1023) and g = (g0 + g1*X + g2*X^2 + ... + g1023*X^1023) it computes
and returns h = (h0 + h1*X + h2*X^2 + ... + h1023*X^1023) with h0 = f0*g0,
h1 = f1*g1,..., h1023 = f1023*g1023.
poly_tobytes(poly f) first reduces all coefficents of f modulo q, i.e.,
brings them to the interval [0,q-1]. Denote these reduced coefficients as
f0,..., f1023; note that they all fit into 14 bits. The function then packs
those coefficients into an array of 1792 bytes r[0],..., r[1792] in "packed
little-endian representation", i.e.,
r[0] = f[0] & 0xff;
r[1] = (f[0] >> 8) & ((f[1] & 0x03) << 6)
r[2] = (f[1] >> 2) & 0xff;
r[3] = (f[1] >> 10) & ((f[2] & 0x0f) << 4)
.
.
.
r[1790] = (f[1022]) >> 12) & ((f[1023] & 0x3f) << 2)
r[1791] = f[1023] >> 6
Note that this function needs to be protected against timing attacks. In
particular, avoid non-constant-time conditional subtractions (or other
non-constant-time expressions) in the reduction modulo q of the coefficients.
poly_frombytes(NEWHOPE_POLY b) is the inverse of poly_tobytes; it receives
as input an array of 1792 bytes and coverts it into the internal
representation of a poly. Note that poly_frombytes does not need to check
whether the coefficients are reduced modulo q or reduce coefficients modulo
q. Note further that the function must not leak any information about its
inputs through timing information, as it is also applied to the secret key
of the initiator.
helprec(poly f) computes 256 bytes of reconciliation information from the
input poly f. Internally, one byte of reconciliation information is computed
from four coefficients of f by a function helprec4. Let the input polynomial f
= (f0 + f1*X + f2*X^2 + ... + f1023*X^1023); let the output byte array be
r[0],...r[256]. This output byte array is computed as
r[0] = helprec4(f0,f256,f512,f768)
r[1] = helprec4(f1,f257,f513,f769)
r[2] = helprec4(f2,f258,f514,f770)
.
.
.
r[255] = helprec4(f255,f511,f767,f1023), where helprec4 does the following:
helprec4(x0,x1,x2,x3):
b = randombit()
r0,r1,r2,r3 = CVPD4(8*x0+4*b,8*x1+4*b,8*x2+4*b,8*x3+4*b)
r = (r0 & 0x03) | ((r1 & 0x03) << 2) | ((r2 & 0x03) << 4) | ((r3 & 0x03) << 6)
return r
The function CVPD4 does the following:
CVPD4(y0,y1,y2,y3):
v00 = round(y0/2q)
v01 = round(y1/2q)
v02 = round(y2/2q)
v03 = round(y3/2q)
v10 = round((y0-1)/2q)
v11 = round((y1-1)/2q)
v12 = round((y2-1)/2q)
v13 = round((y3-1)/2q)
t = abs(y0 - 2q*v00)
t += abs(y1 - 2q*v01)
t += abs(y2 - 2q*v02)
t += abs(y3 - 2q*v03)
if(t < 2q):
v0 = v00
v1 = v01
v2 = v02
v3 = v03
k = 0
else
v0 = v10
v1 = v11
v2 = v12
v3 = v13
r = 1
return (v0-v3,v1-v3,v2-v3,k+2*v3)
In this description, round(x) is defined as ⌊x + 0.5⌋, where ⌊x⌋ rounds to
the largest integer that does not exceed x; abs() returns the absolute
value.
Note that all computations involved in helprec operate on secret data and must
be protected against timing attacks.
rec(poly f, NEWHOPE_REC r) computes the pre-hash (see paper) Newhope key from
f and r. Specifically, it computes one bit of key from 4 coefficients of f and
one byte of r. Let f = f0 + f1*X + f2*X^2 + ... + f1023*X^1023 and let r =
r[0],r[1],...,r[255]. Let the bytes of the output by k[0],...,k[31] and let
the bits of the output by k0,...,k255, where
k0 = k[0] & 0x01
k1 = (k[0] >> 1) & 0x01
k2 = (k[0] >> 2) & 0x01
.
.
.
k8 = k[1] & 0x01
k9 = (k[1] >> 1) & 0x01
.
.
.
k255 = (k[32] >> 7)
The function rec computes k0,...,k255 as
k0 = rec4(f0,f256,f512,f768,r[0])
k1 = rec4(f1,f257,f513,f769,r[1])
.
.
.
k255 = rec4(f255,f511,f767,f1023,r[255])
The function rec4 does the following:
rec4(y0,y1,y2,y3,r):
r0 = r & 0x03
r1 = (r >> 2) & 0x03
r2 = (r >> 4) & 0x03
r3 = (r >> 6) & 0x03
Decode(8*y0-2q*r0, 8*y1-2q*r1, 8*y2-2q*r2, 8*y3-q*r3)
The function Decode does the following:
Decode(v0,v1,v2,v3):
t0 = round(v0/8q)
t1 = round(v1/8q)
t2 = round(v2/8q)
t3 = round(v3/8q)
t = abs(v0 - 8q*t0)
t += abs(v0 - 8q*t0)
t += abs(v0 - 8q*t0)
t += abs(v0 - 8q*t0)
if(t > 1) return 1
else return 0
§C. Test Vectors