From: Loup Vaillant
Date: Mon, 12 Dec 2022 14:31:04 +0000 (+0100)
Subject: More portable/consistent EdDSA verification
X-Git-Url: https://git.codecow.com/?a=commitdiff_plain;h=325da52cdec24fe1e6b316544792383d21353c32;p=Monocypher.git

More portable/consistent EdDSA verification

EdDSA has more corner cases than we would like. Up until now we didn't
pay much attention:

- The first version of Monocypher didn't check the range of S, allowing
  attackers to generate valid variants of existing signatures. While
  this doesn't affect the core properties of signatures, some systems
  rely on a stricter security guarantee: generating a new, distinct
  signature must require the private key.

- When the public key has a low-order component, there can be an
  inconsistency between various verification methods. Detecting such
  keys is prohibitively expensive (a full scalar multiplication), and
  some systems nevertheless require that everyone agrees whether a
  signature is valid or not (if they don't, we risk various failures
  such as network partitions).

- Further disagreement can occur if A and R use a non-canonical
  encoding, though in practice this only happens when the public key
  has low order (and detecting _that_ is not expensive).

There is a wide consensus that the range of S should be checked, and we
do. Where consensus is lacking is with respect to the verification
method (batch or strict equation), checking for non-canonical
encodings, and checking that A has low order. The current version is as
permissive as the consensus allows:

- It checks the range of S.
- It uses the batch equation.
- It allows non-canonical encodings for A and R.
- It allows A to have low order.

The previous version on the other hand used the strict equation, and
did not allow non-canonical encodings for R.

The reasons for the current policy are as follows:

- Everyone checks the range of S, it provides an additional security
  guarantee, and it makes verification slightly faster.
- The batch equation is the only one that is consistent with batched
  verification. Batch verification is important because it allows up to
  2x performance gains, precisely in settings where it might be the
  bottleneck (performing many verifications).

- Allowing non-canonical encodings and a low-order A makes the code
  simpler, and makes sure we do not start rejecting signatures that
  were previously accepted.

- Though these choices aren't completely RFC 8032 compliant, they _are_
  consistent with at least one library out there (Zebra). Note that if
  we forbade a low-order A, we would be consistent with Libsodium
  instead. Which library we chose to be consistent with is somewhat
  arbitrary.

The main downside for now is an 8% drop in performance. 1% can be
recovered by replacing the 3 final doublings with comparisons, but 7%
comes from R decompression, which is a necessary cost of the batch
equation. I hope to overcome this loss with a lattice-based
optimisation [Thomas Pornin 2020].

Should mostly fix #248
---

diff --git a/src/monocypher.c b/src/monocypher.c
index 035cb21..2864f1f 100644
--- a/src/monocypher.c
+++ b/src/monocypher.c
@@ -2002,38 +2002,41 @@ static int slide_step(slide_ctx *ctx, int width, int i, const u8 scalar[32])
 int crypto_eddsa_check_equation(const u8 signature[64], const u8 public_key[32],
                                 const u8 h[32])
 {
-    ge A; // -public_key
+    ge minus_A; // -public_key
+    ge minus_R; // -first_half_of_signature
     const u8 *s = signature + 32;
 
-    // Check that public_key is on the curve
-    // Compute A = -public_key
-    // Prevent s malleability
+    // Check that A and R are on the curve
+    // Check that 0 <= S < L (prevents malleability)
+    // *Allow* non-canonical encoding for A and R
     {
         u32 s32[8];
         load32_le_buf(s32, s, 8);
-        if (ge_frombytes_neg_vartime(&A, public_key) || is_above_l(s32)) {
+        if (ge_frombytes_neg_vartime(&minus_A, public_key) ||
+            ge_frombytes_neg_vartime(&minus_R, signature)  ||
+            is_above_l(s32)) {
             return -1;
         }
     }
 
-    // look-up table for A
+    // look-up table for minus_A
     ge_cached lutA[P_W_SIZE];
     {
-        ge A2, tmp;
-        ge_double(&A2, &A, &tmp);
-        ge_cache(&lutA[0], &A);
+        ge minus_A2, tmp;
+        ge_double(&minus_A2, &minus_A, &tmp);
+        ge_cache(&lutA[0], &minus_A);
         FOR (i, 1, P_W_SIZE) {
-            ge_add(&tmp, &A2, &lutA[i-1]);
+            ge_add(&tmp, &minus_A2, &lutA[i-1]);
             ge_cache(&lutA[i], &tmp);
         }
     }
 
-    // A = [s]B - [h]A
+    // sum = [s]B - [h]A
     // Merged double and add ladder, fused with sliding
     slide_ctx h_slide;  slide_init(&h_slide, h);
     slide_ctx s_slide;  slide_init(&s_slide, s);
     int i = MAX(h_slide.next_check, s_slide.next_check);
-    ge *sum = &A;
+    ge *sum = &minus_A; // reuse minus_A for the sum
     ge_zero(sum);
     while (i >= 0) {
         ge tmp;
@@ -2048,10 +2051,19 @@ int crypto_eddsa_check_equation(const u8 signature[64], const u8 public_key[32],
         i--;
     }
 
-    // Compare R and A (originally [s]B - [h]A)
-    u8 r_check[32];
-    ge_tobytes(r_check, &A); // r_check = A
-    return crypto_verify32(r_check, signature); // R == R_check ? OK : fail
+    // Compare [8](sum-R) and the zero point
+    // The multiplication by 8 eliminates any low-order component
+    // and ensures consistency with batched verification.
+    ge_cached cached;
+    u8 check[32];
+    static const u8 zero_point[32] = {1}; // Point of order 1
+    ge_cache(&cached, &minus_R);
+    ge_add(sum, sum, &cached);
+    ge_double(sum, sum, &minus_R); // reuse minus_R as temporary
+    ge_double(sum, sum, &minus_R); // reuse minus_R as temporary
+    ge_double(sum, sum, &minus_R); // reuse minus_R as temporary
+    ge_tobytes(check, sum);
+    return crypto_verify32(check, zero_point);
 }
 
 // 5-bit signed comb in cached format (Niels coordinates, Z=1)
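For reference, the strict and batch equations the message contrasts can
be written out explicitly. Here B is the base point, A the public key,
R the first half of the signature, S the second half, and h the hash
reduced mod L; notation is mine, the equations themselves match the
comments in the diff:

```latex
% Strict (cofactorless) equation, used by the previous version:
R \stackrel{?}{=} [S]B - [h]A

% Batch (cofactored) equation, used by this commit; multiplying by 8
% clears any low-order component contributed by A or R, which is why
% it stays consistent with batched verification:
[8]\bigl([S]B - [h]A - R\bigr) \stackrel{?}{=} 0
```

The three `ge_double` calls at the end of the new code are exactly this
multiplication by 8, and `zero_point` is the encoding of the identity.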