Loup Vaillant [Wed, 4 Dec 2019 18:24:59 +0000 (19:24 +0100)]
Renamed "crypto_hmac_*" to "crypto_hmac_sha512_*"
There are several types of HMAC, and users may want to use other
versions of HMAC as well. (For instance, they could code their own
Blake2b HMAC to implement Noise). Plus, most primitives are named by
their technical name. "hmac" alone is not enough.
The names are longer, but this is the optional part, after all.
Loup Vaillant [Tue, 3 Dec 2019 07:52:01 +0000 (08:52 +0100)]
Moved SHA 512 work area to local stack
While some users could perhaps benefit from saving 640 bytes of stack
space by allocating the context statically, or in the heap, in practice
it's not the bottleneck. Besides, putting the work area there actually
*increases* stack usage on signatures and signature verification, which
are the most stack hungry parts of Monocypher to begin with.
Loup Vaillant [Mon, 2 Dec 2019 22:49:25 +0000 (23:49 +0100)]
Fixed HMAC SHA-512 (and added tests)
Test vectors were generated with Libsodium, with various key sizes (both
shorter and longer than the message), and every message size from 0 to
256 (twice the SHA 512 block size).
Also added test vectors from RFC 4231, except the one with truncated
output (we don't support truncated outputs, users will have to do that
manually).
Loup Vaillant [Sun, 1 Dec 2019 21:36:00 +0000 (22:36 +0100)]
Added HMAC SHA512
EXPERIMENTAL. MAY BE REMOVED.
Monocypher is supposed to be small. This is why we use Blake2b for both
Argon2 and EdDSA signatures. Some users however need Ed25519 for
compatibility with other tools. This means using SHA 512.
We could hide SHA 512 from the public interface entirely, but this seems
like a waste: it could replace Blake2b to make the library smaller. It
will come at a performance loss, but when you verify signatures on a
small device, the hash is rarely the bottleneck.
The main problem with SHA 512 is length extension attacks. It just
cannot be used as a prefix MAC like Blake2b can. We need HMAC if we
want SHA 512 to entirely displace Blake2b, so the Monocypher binary
stays small.
Users could use Poly1305 and our version of RFC 8439 of course, but if
they're so tight on space, they're likely to get rid of Poly1305 as
well. When we have SHA 512 already, HMAC requires much less code.
This is kind of a special corner case. But it could come in handy.
Loup Vaillant [Sun, 1 Dec 2019 12:57:17 +0000 (13:57 +0100)]
Renamed crypto_hash_vtable into crypto_sign_vtable
The vtable holds hash functions, but it's really a vtable for
crypto_sign_ctx_abstract (and its check typedef). It's more tied to
EdDSA than to the hash itself.
Loup Vaillant [Sun, 1 Dec 2019 11:01:15 +0000 (12:01 +0100)]
Renamed crypto_sign_blake2b_ctx back to crypto_sign_ctx
Also renamed crypto_check_blake2b_ctx back to crypto_check_ctx.
This serves two purposes: avoid breaking the API when users upgrade from
Monocypher 2.x, and keep the idea that Blake2b is the default hash (the
default settings are implied and need not be named).
Note that although old code is not broken, it will still have warnings.
Those are easily silenced by casting to (void*).
Loup Vaillant [Sun, 1 Dec 2019 10:42:30 +0000 (11:42 +0100)]
Fixed undefined function pointer conversion
The TIS interpreter is not happy when we call a function from an
incompatible pointer type. GCC and Clang don't seem to mind as long as
we explicitly convert the pointer, but apparently that's undefined
behaviour, even though the only incompatibility is transforming a
pointer argument into a void* argument.
I don't know if it's a false positive, but better safe than sorry. The
conversion now uses explicit wrappers instead of a brutal type cast.
I've taken the opportunity to remove the offset. The wrappers now
perform the offset themselves, by accessing the member field the normal
way (after converting from void*, but that can't be avoided).
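
A minimal sketch of the wrapper approach described above, using toy
types. Every name here (toy_hash_ctx, toy_sign_ctx, update_fn,
toy_update_wrapper, toy_run) is hypothetical and only illustrates the
idea; it is not Monocypher's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hash context: a running byte sum stands in for a hash. */
typedef struct { uint32_t sum; } toy_hash_ctx;

static void toy_hash_update(toy_hash_ctx *ctx, const uint8_t *msg, size_t len)
{
    for (size_t i = 0; i < len; i++) { ctx->sum += msg[i]; }
}

/* Hypothetical signing context that embeds the hash context. */
typedef struct {
    uint8_t      buf[32];
    toy_hash_ctx hash;
} toy_sign_ctx;

/* Generic callback type: the context is passed as void*. */
typedef void (*update_fn)(void *ctx, const uint8_t *msg, size_t len);

/* Calling toy_hash_update through a cast pointer such as
 *     update_fn f = (update_fn)toy_hash_update;
 * is undefined: the two function types are not compatible.
 * This wrapper has exactly the type update_fn, so calling it is fine.
 * It also finds the hash member by name, so no offset is needed. */
static void toy_update_wrapper(void *ctx, const uint8_t *msg, size_t len)
{
    assert(ctx != NULL);
    toy_sign_ctx *sign_ctx = (toy_sign_ctx *)ctx;
    toy_hash_update(&sign_ctx->hash, msg, len);
}

/* Feeds msg through the callback and returns the toy checksum. */
static uint32_t toy_run(update_fn update, const uint8_t *msg, size_t len)
{
    toy_sign_ctx ctx = { {0}, {0} };
    update(&ctx, msg, len);
    return ctx.hash.sum;
}
```

The cost is one extra (usually inlined) call per hash function; the
benefit is that no function is ever called through a pointer of a
different type.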
Loup Vaillant [Sat, 30 Nov 2019 23:08:08 +0000 (00:08 +0100)]
chacha20_*_ctr functions now return the new ctr
This should facilitate building piecemeal streams. Normally you'd just
increment the nonce, but in some (admittedly rare) cases we may want to
increment the counter instead.
Incrementing the counter is fairly dangerous, because we may overlap the
streams, thus revealing the XOR of two pieces of plain text. Using the
new return value makes sure this doesn't happen.
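
As an illustration only: ChaCha20 produces its stream in 64-byte
blocks, so a "new ctr" can be thought of as the first block index not
yet consumed. The helper below is hypothetical (next_ctr is not a
Monocypher function) and assumes a partially used block counts as
consumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ChaCha20 generates its key stream in 64-byte blocks. */
#define CHACHA20_BLOCK_SIZE 64

/* Hypothetical helper: first unused block counter after processing
 * text_size bytes starting at block ctr. A partially used last block
 * is counted as used, so the next piece never overlaps it. */
static uint64_t next_ctr(uint64_t ctr, size_t text_size)
{
    uint64_t full    = text_size / CHACHA20_BLOCK_SIZE;
    uint64_t partial = (text_size % CHACHA20_BLOCK_SIZE) != 0;
    /* A wrapping counter would reuse earlier blocks: refuse it. */
    assert(ctr <= UINT64_MAX - full - partial);
    return ctr + full + partial;
}
```

Feeding the returned value back as the counter of the next piece keeps
the pieces on disjoint blocks, at the price of discarding the tail of a
partially used block.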
Loup Vaillant [Sat, 30 Nov 2019 19:36:28 +0000 (20:36 +0100)]
Enabled cohabitation of several EdDSA instances
EdDSA can now use a custom hash! And that hash is not set in stone at
compile time, it can be decided at runtime! It was done with inheritance and
subtype polymorphism. Don't worry, we are still using pure C.
Custom hashes are defined through vtables. The vtable contains function
pointers, an offset, and a size. (We need the size to wipe the context,
and the offset to find the location of the hash context inside the
signing context.)
An abstract signing context is defined. It is not instantiated
directly. It is instead the first member of the specialised signing
context. The incremental interface takes pointers to abstract contexts,
but actually requires specialised contexts.
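
A rough sketch of this layout in plain C, with a toy byte sum standing
in for the hash. All names (toy_vtable, toy_ctx_abstract, toy_ctx_sum,
and so on) are made up for the example and do not match Monocypher's
identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical vtable: function pointers, an offset, and a size. */
typedef struct {
    void (*init  )(void *hash_ctx);
    void (*update)(void *hash_ctx, const uint8_t *msg, size_t len);
    size_t hash_offset; /* where the hash context sits in the full context */
    size_t ctx_size;    /* size of the full specialised context            */
} toy_vtable;

/* Abstract context: never instantiated on its own. */
typedef struct { const toy_vtable *vtable; } toy_ctx_abstract;

/* Toy "hash": a running byte sum. */
typedef struct { uint32_t sum; } toy_hash;
static void toy_init(void *h) { ((toy_hash *)h)->sum = 0; }
static void toy_update(void *h, const uint8_t *msg, size_t len)
{
    toy_hash *hash = (toy_hash *)h;
    for (size_t i = 0; i < len; i++) { hash->sum += msg[i]; }
}

/* Specialised context: the abstract context is its first member. */
typedef struct {
    toy_ctx_abstract base;
    toy_hash         hash;
} toy_ctx_sum;

static const toy_vtable toy_sum_vtable = {
    toy_init, toy_update, offsetof(toy_ctx_sum, hash), sizeof(toy_ctx_sum)
};

/* Generic code only sees the abstract context. */
static void *hash_of(toy_ctx_abstract *ctx)
{
    return (uint8_t *)ctx + ctx->vtable->hash_offset;
}
static void toy_ctx_init(toy_ctx_abstract *ctx, const toy_vtable *vt)
{
    assert(ctx != NULL && vt != NULL);
    ctx->vtable = vt;
    vt->init(hash_of(ctx));
}
static void toy_ctx_update(toy_ctx_abstract *ctx, const uint8_t *msg, size_t len)
{
    ctx->vtable->update(hash_of(ctx), msg, len);
}
/* Wipes the whole specialised context. Real code needs a wipe the
 * compiler cannot optimise away; memset is only a placeholder here. */
static void toy_ctx_wipe(toy_ctx_abstract *ctx)
{
    memset(ctx, 0, ctx->vtable->ctx_size);
}
```

Because the abstract context is the first member, a pointer to the
specialised context can be converted to a pointer to the abstract one,
which is why the generic functions can accept either.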
By default, we use the Blake2b specialised context. The incremental
interface doesn't change, except for the need to give it a specialised
context instead of the old crypto_sign_ctx. To enable the use of
different contexts, 3 "custom_hash" functions have been added:
This lets us preserve the old function names (making it easier to update
user code), and maybe conveys that Blake2b remains the default hash.
---
Overall, I think we did pretty good: only 3 additional functions in the
main library (and a fourth exported symbol), and we spare the user the
pain of juggling with two contexts instead of just one. The only
drawbacks are slightly breaking compatibility in the incremental
interface, and requiring an explicit cast to avoid compiler warnings.
The streaming interface for AEAD was a bad idea: it's harder to test and
encourages unsafe protocol design (unsafe handling of unauthenticated
data, denial of service amplification...).
Michael Forney [Tue, 19 Nov 2019 20:15:16 +0000 (12:15 -0800)]
Remove unnecessary dependency on 2's complement
Although the bit-representation of signed integer types in C99 is
implementation-defined and can be sign-magnitude, one's complement, or
two's complement[0], the conversion of negative values to an unsigned
integer type is defined to be adding 1 plus the maximum value of the
unsigned type[1].
Since -1 + 0xffffffff + 1 == 0xffffffff, just using u32 here has the
right behavior without relying on the representation of signed integers.
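
A small illustration of the rule, under the assumption that the code in
question builds an all-ones or all-zeros mask from a 0/1 flag (the
commit message does not say which code is affected). The helper names
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t u32;
typedef int32_t  i32;

/* Converting a negative value to u32 is defined as adding 2^32 to it,
 * whatever the representation of signed integers is. */
static u32 to_u32(i32 x) { return (u32)x; }

/* Returns 0xffffffff when flag is 1, and 0 when flag is 0.
 * The subtraction is done on unsigned values, so it wraps modulo 2^32
 * and never looks at the bit pattern of a signed integer. */
static u32 mask_from_flag(int flag)
{
    assert(flag == 0 || flag == 1);
    return (u32)0 - (u32)flag;
}
```

Both functions give the same result on sign-magnitude, one's
complement, and two's complement machines, because only the value of
the signed operand matters, not its bits.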
Loup Vaillant [Mon, 18 Nov 2019 21:41:41 +0000 (22:41 +0100)]
Leveraged fe_pow22523 to simplify fe_invert
The multiplication chain used in those two functions is probably optimal,
but it is also kind of black magic, and takes quite a bit of code.
TweetNaCl has a much shorter, much easier to read, much slower addition
chain. I figured maybe a middle ground was possible.
Turns out it's difficult. I couldn't come up with a nice multiplication
chain on my own. But I did notice a relationship between 2^252 - 3 and
2^255 - 21 (the latter is used to invert): they start with the same bit
pattern. More specifically:
2^255 - 21 = (2^252 - 3) * 8 + 3
I can use the same multiplication chain for both functions, and just
finish the job for the inversion.
The cost of this patch compared to the ref10 multiplication chain is
five field multiplications, three of which are squarings. The effect on
the benchmark is so small that we don't even notice the difference.
The benefit is 10 meaty lines of code, and a corresponding decrease in
binary size.
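
A toy check of the idea, assuming inversion is done with Fermat's
little theorem (z^(p-2) mod p). With p = 2^255 - 19, p - 2 equals
8 * (2^252 - 3) + 3, so the inverse can be finished from z^(2^252 - 3)
with three squarings and two multiplications. The sketch below uses the
small prime 61, which is also 5 modulo 8, so that plain integers
suffice. toy_pow_e stands in for fe_pow22523; the names and the exact
order of operations are assumptions, not Monocypher's code.

```c
#include <assert.h>
#include <stdint.h>

/* Toy prime with P % 8 == 5, like 2^255 - 19. Here (P - 5) / 8 == 7. */
#define P 61u

static uint32_t toy_mul(uint32_t a, uint32_t b) { return (a * b) % P; }

/* z^((P-5)/8) by plain square and multiply (stand-in for fe_pow22523). */
static uint32_t toy_pow_e(uint32_t z)
{
    uint32_t e    = (P - 5) / 8;
    uint32_t base = z % P;
    uint32_t r    = 1;
    while (e > 0) {
        if (e & 1) { r = toy_mul(r, base); }
        base = toy_mul(base, base);
        e >>= 1;
    }
    return r;
}

/* z^(P-2) = z^(8e + 3): finish the job with 3 squarings and 2 mults. */
static uint32_t toy_invert(uint32_t z)
{
    assert(z % P != 0);           /* zero has no inverse     */
    uint32_t t = toy_pow_e(z);    /* z^e                     */
    t = toy_mul(t, t);            /* z^(2e)                  */
    t = toy_mul(t, t);            /* z^(4e)                  */
    t = toy_mul(t, z);            /* z^(4e + 1)              */
    t = toy_mul(t, t);            /* z^(8e + 2)              */
    t = toy_mul(t, z);            /* z^(8e + 3) = z^(P - 2)  */
    return t;
}
```

Multiplying any non-zero z by toy_invert(z) modulo 61 gives 1, which is
the property the real field inversion relies on.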
Loup Vaillant [Tue, 22 Oct 2019 21:38:15 +0000 (23:38 +0200)]
Fixed Clang warning about Doxygen comments
Comments that begin with //< can be Doxygen comments, and Clang with all
warnings doesn't like that.
I originally packed the comment to satisfy my 80 column OCD. By
sacrificing space around the + operator however, we can reclaim that
space and please Clang.
Loup Vaillant [Mon, 21 Oct 2019 12:57:03 +0000 (14:57 +0200)]
Cleaned up the tests/ folder
Just moving files around so it's better organised.
Also changed the vectors.h header a little:
- It now includes inttypes.h and stddef.h only once.
- There's a note at the top saying where it comes from.
Loup Vaillant [Sat, 19 Oct 2019 23:01:43 +0000 (01:01 +0200)]
Tightened up the release script
- Run tests/test.sh prior to release
- Removed the dist target from the shipped makefile
- Removed the contributor notes from the shipped README
- Don't include files that only serve to generate vectors.h
- Reworded some of README a little bit.
Exposing version numbers in the binary can expose them to attackers.
Without the version number, they have to try the exploit and hope. With
the version number, they may perform a cheap check before they proceed
any further. Better not take the risk.
Furthermore, changing the length of the string may break ABI. This will
happen if a version number (major, minor, or patch) ever reaches 10.
That patch was nice, but it potentially impacts security and stability.
Not worth it in the end.
Loup Vaillant [Sat, 19 Oct 2019 13:14:48 +0000 (15:14 +0200)]
Added version number to binaries
Sometimes, we don't have the sources, and we want to check the version
number of the binaries themselves. (For instance when distributing
Monocypher as a library.)
To that end, I've added the global string constant "monocypher_version".
It can be used from the calling program, or scanned directly by tools.
Loup Vaillant [Sat, 19 Oct 2019 12:44:28 +0000 (14:44 +0200)]
Include version in released source files
I realised that determining which Monocypher version was used in a
project was not trivial. We could look at Monocypher's code and deduce
the release, but that's tedious and error prone. So I've made those
versions more explicit:
- Source and header files begin with a comment describing the version.
- The pkg-config file created by `make install` includes that version.
- The version number of unreleased code (under git) is "__git__"
- The version number of released code is whatever `git describe --tags`
tells us.
- the "tarball" target in the makefile was changed to the more standard
"dist".
To release a new version, we just add a tag, then call `make dist`.
The version of the released source file will appear at a glance, right
there on the first line.
Note: the release process blindly replaces all instances of "__git__" by
the suitable version number. This could be used to version things other
than comments, like string constants.
Loup Vaillant [Wed, 16 Oct 2019 22:14:11 +0000 (00:14 +0200)]
Tidied up sliding windows, minor cosmetic nitpicks
Added `static` to the sliding window functions, reworked those functions
a bit to improve the (internal) API. Simplified the double scalarmult
accordingly.
Added FOR_T macro, for when the index should be a type other than
size_t. Helped remove explicit conversions in Argon2i and sliding
windows. Hopefully this new macro will be obvious to reviewers. I
could have used the regular `for` loop, but it took too much horizontal
space in Argon2i (we use long names there).
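
For readers unfamiliar with this style, here is a plausible shape for
such macros. The definitions below are an assumption based on the
description above and should be checked against the actual source; the
two helper functions are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed definitions: FOR_T takes the index type, FOR defaults to
 * size_t. The index runs from start (inclusive) to end (exclusive). */
#define FOR_T(type, i, start, end) for (type i = (start); i < (end); i++)
#define FOR(i, start, end)         FOR_T(size_t, i, start, end)

/* Index of a type other than size_t: no conversion needed in the body. */
static uint32_t sum_below(uint32_t n)
{
    uint32_t sum = 0;
    FOR_T (uint32_t, i, 0, n) { sum += i; }
    return sum;
}

/* The usual case: a size_t index over a buffer. */
static size_t count_zero_bytes(const uint8_t *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    size_t zeros = 0;
    FOR (i, 0, len) { if (buf[i] == 0) { zeros++; } }
    return zeros;
}
```

Declaring the index with the type of the values it is mixed with is
what removes the explicit conversions mentioned above.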
Loup Vaillant [Sun, 13 Oct 2019 23:06:43 +0000 (01:06 +0200)]
Updated AUTHORS.md for EdDSA
The EdDSA code is now unrecognisable from what we saw in either SUPERCOP
or
TweetNaCl. Some significant pieces are still from ref10 or
TweetNaCl, but the overall structure is different enough that I should
consider myself the primary author...
Loup Vaillant [Sun, 6 Oct 2019 22:45:05 +0000 (00:45 +0200)]
Fused sliding windows and scalar multiplication
At last, we saved some stack. 320 bytes on my machine, which is a bit
disappointing. We may be able to shave off a couple more, but we're
reaching the limit.
Loup Vaillant [Sun, 6 Oct 2019 21:58:38 +0000 (23:58 +0200)]
Incremental left to right sliding windows
The main loop of the scalar multiplication goes one by one, so we can't
have the sliding loop skip indices. By adding a context that keeps
track of the next needed addition (as well as its value), we'll be able
to fuse the two slides and the scalar multiplication together.
Loup Vaillant [Sun, 6 Oct 2019 20:12:42 +0000 (22:12 +0200)]
Slide from left to right
Scalar multiplication goes from left to right (from MSB to
LSB). Computing the sliding windows used to go from *right to left*.
This direction mismatch forced us to keep all the signed digits in
memory, which currently incurs a little over 500 bytes of stack overhead.
That overhead is avoidable. Avoiding it will allow Monocypher to fit in
smaller embedded devices.
Right now we just change the direction of the sliding. Interleaving will
come later.
Those are easily visible through the QtCreator IDE intellisense, but
somehow never showed up when compiling at the command line. This should
help silence MSVC warnings as well.