Loup Vaillant [Fri, 2 Dec 2022 22:45:45 +0000 (23:45 +0100)]
Safer interface for EdDSA
Now the private key is 64 bytes, and is the concatenation of the seed
and the public key just like Libsodium. The idea is to make sure users
never sign messages with the wrong public key, which can leak the secret
scalar and allow forgeries.
Users who can't afford the overhead of storing 32 additional bytes for
the secret key (say they need to burn the key into expensive fuses),
they can always only store the first 32 bytes, and re-derive the entire
key pair when they need it.
Loup Vaillant [Thu, 1 Dec 2022 15:27:08 +0000 (16:27 +0100)]
Remove EdDSA incremental & custom hash API
The incremental and custom hash API was too complex and too niche to
justify itself. I'm removing them in favour of a more flexible
approach: giving the basic building blocks necessary to implement EdDSA
manually.
Those building blocks comprise 5 specialised functions:
- crypto_eddsa_trim_scalar: turn 32 random bytes into a proper scalar.
- crypto_eddsa_reduce : reduces a 64 bytes number modulo L.
- crypto_eddsa_mul_add : like MUL_ADD, except modulo L.
- crypto_eddsa_scalarbase : multiplies a scalar by the base point.
- crypto_eddsa_r_check : generates R independently for verification.
These make it fairly easy to implement EdDSA (including Ed25519) in
various ways, including the streaming and custom hash functions I just
removed, replacing the deterministic nonce by a random one, or adding a
random prefix to mitigate the energy side channel in some settings.
I believe only minimal tweaks are required to implement the Edwards25519
half of RFC 8032 entirely (including the context and pre-hash variants),
as well as XEdDSA (which should only require a single Montgomery to
Edwards conversion function).
This is a prototype, and the extensibility promises remain to be tested.
Ideally that means implementing all the fancy extensions in a separate
project, and _maybe_ include some of them in the optional files.
Loup Vaillant [Tue, 29 Nov 2022 23:49:15 +0000 (00:49 +0100)]
Switch indentation from spaces to tabs.
For the longest time I had a fairly strong personal preference for
spaces. Then it came to my attention that using tabs meaningfully
increases accessibility.
As a cryptography library, Monocypher is supposed to ultimately help,
among other people, the most vulnerable among us. It would be a shame
to potentially exclude disabled contributors or auditors.
Note that this patches sometimes changes a little more than just
spacing. A few pieces of code in particular relied on indentation
width, and had to be reworked a little bit to make them tab width
agnostic.
Loup Vaillant [Mon, 28 Nov 2022 22:58:53 +0000 (23:58 +0100)]
Remove deprecated functions.
Deprecated functions are redundant with previous major branches of
Monocypher, and as such don't serve any meaningful purpose. Maintaining
old branches is cheaper anyway.
Note: this will also remove them from the manual, including on the
website when updated. We could compensate by publishing older versions
of the manual, or we could punt on it and rely on the fact that the
tarball contain the associated manual.
- Global constant should have been `static`
- Reserved identifier (double underscores)
- Loss of precision in implicit conversions
- Implicit change of sign
Users may wonder why we didn't provide the safer API from the outset.
We could explain this is for backwards compatibility, but this man page
is quite cluttered already.
TODO: the next major version of Monocypher should definitely adopt this
safer API.
- clarify NULL goes in public_key (not secret_key)
- add parenthetical note to define the term "fat public key" inline
- fix commas (I actually had to look up the rules for comma-before-but)
- avoid colloquial "we"
Though I think the option of choosing how many bits we mask off at the
end is useful, it made the code more confusing. Better leave it for the
one case where we really need it: parsing Elligator representatives,
which are represented with one less bit than usual.
This does cost us 4 lines of code, but I think they're worth it. The
previous code was too much like code golf for my comfort.
Loup Vaillant [Sun, 13 Mar 2022 09:57:52 +0000 (10:57 +0100)]
Removed redundant Valgrind tests
I think we don't need to test both -O2 and -O3. We're checking memory
error here, they should not be much different under different
optimisation settings (though maybe -O3 can trigger more bugs).
Also did not add myself to the list of authors, I think this trivial
change is not copyrightable.
Chris Fogelklou [Sat, 5 Mar 2022 06:33:16 +0000 (07:33 +0100)]
Run basic tests taken from the readme.md file using Github Actions.
* Add a github workflow which follows the basic install flow in the readme.md.
* Add installation of libsodium.
* Add valgrind as a dependency.
* Add llvm to list of dependencies.
Fabio Scotoni [Sun, 13 Feb 2022 05:44:40 +0000 (06:44 +0100)]
Deprecate crypto_key_exchange
See #230.
In summary:
The way crypto_key_exchange() works was not optimal.
Several combinations of (private, public) key pairs produce the same
shared secret (see 54a7ef551078dda001f94f1045e142af0fc1008a);
crypto_key_exchange() did nothing to mitigate that.
Since this is a high-level function that cannot be saved even with
documentation because it is so fundamentally misguided,
it is now deprecated (see #232).
Users are encouraged to either adopt the function into their own code
if and only if they require it for backwards compatibility,
starting with the release that removes it,
or preferably migrate to implementing an existing protocol,
such as instantiating a Noise protocol pattern or a protocol like X3DH.
Loup Vaillant [Sat, 12 Feb 2022 10:13:22 +0000 (11:13 +0100)]
Fix invsqrt() boolean return
The invsqrt() function used to return true when its operand was a
non-zero square. Thing is, it was used to determine whether it was a
square, *period*. If the operand was zero, we had the wrong answer.
This introduced two "errors" visible by the end user:
* The all-zero X25519 point could not be mapped back to an Elligator 2
representative. In practice that point is never used, but that does
not comply with the original paper.
* The (0, 1) EdDSa public key could not be serialised, causing any
signature verification associated with it to fail. This one we really
don't care, because even though it is on the curve, that point has low
order, and legitimate public keys all have prime order.
Jens Alfke [Wed, 12 Jan 2022 19:36:44 +0000 (11:36 -0800)]
Allow headers to declare a C++ namespace
If the preprocessor symbol `MONOCYPHER_CPP_NAMESPACE` is defined,
the Monocypher headers will be wrapped in a C++ namespace instead of
in `extern "C"`. The name of the namespace is the value of the symbol.
This requires also compiling Monocypher's .c files as C++, either by:
- Using a compiler option like `--x c++`;
- Using an IDE setting, like the Identity & Type inspector in Xcode;
- Putting `#include "monocypher.c" in a C++ file
Loup Vaillant [Wed, 18 Aug 2021 19:09:11 +0000 (21:09 +0200)]
Inlined ge_r_check()
This saves 9 lines of code.
ge_r_check() was once separated to enhance readability, but it turns out
that inlining it saves more lines of code than expected. The code is
almost more readable now, though the comments are still needed.
MSVC doesn't like when we expand unsigned integers after a bitwise
negation. A fair distaste, except this time it's wrong. Writing
`(~x & 0xffffffff)` instead of just `~x` shows MSVC the error of its ways.
Also made a potentially lossy conversion to `int` explicit (and
explained in a comment why there is no actual loss).
The trick is to parse the input bytes into words at the last moment.
This simplifies incremental loading, and saves around 10 lines of code.
The performance impact is barely measurable.
Caugth by TIS-CI and the latest Clang's UBSan.
Incrementing a NULL pointer, even by a NULL offset, is not permitted.
This was caused by the removal of a conditional that exited early if the
message was empty.
The fix was to move the increment inside the alignment loop.
It may be tiny bit slower, but this was the slow path already.
Users can avoid it by aligning their increments to block boundaries.
Previous changes caused ge_msub() to only be used for signature
verification (it was previously used for signature generation as well,
but this hurted readability). It thus became reasonable to use
temporary buffers, since we no longer have to wipe them (at a sizeable
performance cost).
The trick is the same as how ge_sub() is defined in terms of ge_add().
This saves 9 lines of code, and the performance cost is negligible.
This saved 16 lines of code. It is not clear how readability is impacted
(the calling code is a bit more complex, but it is also more local).
Overall, I believe it's a win.
Fabio Scotoni [Fri, 11 Jun 2021 18:37:46 +0000 (20:37 +0200)]
doc: Use canonical spelling of BLAKE2(b)
Spelling according to Jean-Philippe Aumasson, Samuel Neves, Zooko
Wilcox-O’Hearn, Christian Winnerlein. "BLAKE2: Simpler, Smaller, Fast as
MD5." ACNS 2013. Lecture Notes in Computer Science, vol 7954,
and also https://twitter.com/veorq/status/1396728032883322884
Fabio Scotoni [Fri, 11 Jun 2021 18:33:40 +0000 (20:33 +0200)]
doc: Use canonical spelling of ChaCha20
Spelled with two capital Cs according to Daniel J. Bernstein. "ChaCha,
a variant of Salsa20." Workshop Record of SASC 2008: The State of the
Art of Stream Ciphers.
Loup Vaillant [Sun, 6 Jun 2021 21:47:11 +0000 (23:47 +0200)]
Advertise public key cryptography in the manual
Followup on the modifications of the README: if users can't make the
connection between public key cryptography and key exchange there, they
won't make it either when reading the manual (which by the way is
automatically copied to the website).
Loup Vaillant [Fri, 4 Jun 2021 21:51:20 +0000 (23:51 +0200)]
Fixed local variable shadowing
In crypto_x25519_inverse(), line 2949 and 2953, we use the ZERO macro in
a context where the enclosing scope already defines the varible `i`.
Turns out ZERO defines `i` for internal use in an enclosed scope. This
trigger a warning in some compilers about variable shadowing: declaring
a local variable with the same name as an enclosing local variable. This
warning is especially annoying when combined with -Werror.
To prevent this, the macros COPY and ZERO have been modified so they use
a variable name that is unlikely to be used anywhere else (`i__`).
Loup Vaillant [Fri, 4 Jun 2021 21:21:23 +0000 (23:21 +0200)]
Advertise Monocypher's features more clearly
For the second time, I've stumbled upon a user that believed Monocypher
did not support public key cryptography, despite the presence of a
fairly high-level key exchange API.
Looking at the README and website, I noticed that while key exchange is
fairly well publicised, "public key cryptography" is not. I suspect
many users simply don't know that key exchange is a valid way to do
public key cryptography. In hindsihght, we should not expected them to.
Let's not dwell on how many potential users we may have lost to this
oversight. Hopefully, writing "Public key cryptography" directly will
help people notice.
Loup Vaillant [Sat, 22 May 2021 13:30:02 +0000 (15:30 +0200)]
Define fe_invert() in terms of invsqrt()
Pros:
- The code is a couple lines shorter.
- Stack usage may be lowered a tiny little bit (we saved two temporary
field elements).
Cons:
- Inversion is one field multiplication slower.
Note that fully optimised inversion can be 6 field multiplications
faster than the current version. It would cost quite a bit of code
though (we'd need a dedicated addition chain).
Loup Vaillant [Tue, 25 May 2021 18:11:47 +0000 (20:11 +0200)]
Install advanced man pages
Note that deprecated man pages are omitted.
I believe we should not clutter man pages with them.
Advanced functions however should have their place in the user manual.
Fabio Scotoni [Tue, 25 May 2021 14:51:41 +0000 (16:51 +0200)]
doc: Compute secret/public key in some examples
crypto_sign_init_first_pass(3monocypher) gets two initializations of sk
and pk.
They're long-lived keys, but at least now it's possible to run the
example without wondering why nothing works.
Loup Vaillant [Sat, 22 May 2021 13:46:44 +0000 (15:46 +0200)]
Simplified crypto_x25519_dirty_small() a tiny bit
To give the same results as crypto_x25519_dirty_fast(), we originally
multiplied the cofactor by 5 before we multiplied it by L. I noticed
however that this multiplication by 5 could be baked in the base point
itself, and simplifies the computation a little bit.
Should have done this years ago. The test vectors we had were pretty
good, but the official ones are, well, official.
Note that we kept a good deal of our own vectors: while the official
test vectors tested all possible key sizes, they did not test all
possible hash sizes or all possible key sizes.
Loup Vaillant [Wed, 31 Mar 2021 19:42:13 +0000 (21:42 +0200)]
Added tests vectors for Blake2b
- We tested with no key, and an empty message.
- We tested with no key, and a non-empty message.
- We tested with a key, and a non-empty message.
- We did *not* test with a key and an empty message.
Well, now we do. Libsodium seems to agree with us. Phew.
IETF ChaCha20 has a 32-bit counter.
This means a practical limit of 256 GiB of data for each nonce.
Additionally, IETF QUIC seems to require being able to handle
0xffffffff (I-D.draft-ietf-quic-tls-33 § 5.4.4),
thus getting very close to the overflow,
though not triggering it.
Unlike libsodium and other libraries, we do not have the option to
panic and take down whatever process is running the code and triggering
the overflow condition because Monocypher is neither allowed to use the C
standard library nor allowed to invoke undefined behavior to cause a
crash;
the applicable RFC provides no guidance what to do in this case, either.
Therefore, staying within the (nonce, counter) limits is necessarily
application responsibility;
it is an invariant that, when voided, Monocypher is allowed to do
anything,
similar to the non-guarantee we make for the crypto_blake2b family
and the crypto_argon2i family.
While already here, fix the wrong function prototype in the synopsis.
Loup Vaillant [Fri, 4 Dec 2020 23:19:40 +0000 (00:19 +0100)]
Restricted TIS-CI to the most different platforms
We don't care about floating points & such, so we only need to select
a few platforms among all proposed. This change cuts the number of
platforms by half.
Note: I hoped for a 6% speedup, barely observed 3.5%. I suspect this is
because additions, even from pre-computed tables, cost a little more
than doublings.
Note: we could save an additional addition by assuming scalars modulo L
all fit in 252 bits. They do not, but if we pick one at random, they
will in practice (with 2^-128 probability of being wrong, i.e. never).
This would work well in EdDSA, where the scalar is a hash of the private
key. Finding a private key that makes the scalar overflow 252 bits is
about as hard as breaking a 128-bit key, which we can safely assume will
never happen by accident.
However, this scalar multiplication is also used in the dirty public key
generation, which we use to create hidden X25519 keys with Elligator
(from Edwards, because it's twice as fast as the Montgomery ladder).
Here, users provide the scalar directly. Overflowing 252 bits *can*
happen by accident, if users shoot themselves in the foot with a
non-random scalar.
The risk is small, but a measly 1% performance is not worth leaving
ourselves open to subtle corner cases like that.
Loup Vaillant [Sun, 29 Nov 2020 20:27:25 +0000 (21:27 +0100)]
Faster reduction modulo L
Replaced TweetNaCl's code by Barrett reduction.
- mod_l(), reduce(), and mul_add() use less stack.
- mod_l(), reduce(), and mul_add() are now much faster.
- Signature generation is noticeably faster (~7% on my Skylake laptop).
- Now I understand all the code. No more black boxes.