Seth Archambault [Thu, 19 Oct 2023 03:20:16 +0000 (23:20 -0400)]
Update crypto_argon2.3monocypher - fixes errors and warnings that can lead to incorrect solution
Closes: #264
Fixes these issues, by allowing password array to autosize, and then making sure to drop the \0 character when determining the size of the string, and also reorders the crypto_argon2_inputs field designators to remove a warning.
```
main.cpp:83:24: error: initializer-string for char array is too long, array size is 14 but initializer has size 15 (including the null terminating character)
uint8_t password[14] = "Okay Password!";
^~~~~~~~~~~~~~~~
main.cpp:87:5: warning: ISO C++ requires field designators to be specified in declaration order; field 'pass_size' will be initialized after field 'salt' [-Wreorder-init-list]
.salt = salt, /* Salt for the password */
^~~~~~~~~~~~~~~~~
```
Loup Vaillant [Tue, 17 Oct 2023 04:35:39 +0000 (06:35 +0200)]
Use NULL instead of 0 for null pointers
Turns out the standard guarantees that NULL is defined in `stddef.h`.
Contrary to what I used to believe, using it doesn't induce any further
dependency. `0` is also guaranteed by the standard to work, but it's
less explicit and in some cases more error prone.
Apparently the convention for the $(PREFIX) variable is to include the
leading '/'. Leaving it out and adding it manually confuses some
package systems, and force package maintainers to patch it when their
own $(PREFIX) already starts with '/'.
TIS-CI tests take forever. It's annoying and cost them compute power
for little benefit. A quick assesment of Monocypher reveals that the
only things we really care about are endianness and main word size.
Things like the sign of `char`, `sizeof(int)`, or `long double` are
mostly (or entirely) irrelevant.
So, all platforms supported by TIS-CI, only 5 are relevant:
- 16-bits little endian (I chose x86_16)
- 32-bits little endian (I chose x86_32)
- 32-bits big endian (I chose sparc_32)
- 32-bits little endian (I chose rv64ifdq)
- 32-bits big endian (I chose mips_64)
With the help of a (now updated) `doc_extract_examples.sh` script.
Note: We may want to integrate this script in the test suite, if we end
up writing more documentation.
Argon2 failed to conform to the reference implementation when used with
multiple lanes, rendering it useless for this compatibility use case.
The error came from the way we select the reference set:
- On the first slice of the first pass, only the current lane is valid.
- When selecting other lanes, only fully completed segments are valid.
- The previous block of *all* lanes must be excluded.
Compilers aren't magical. They need help to generate the best code.
Here we want to compute the following expression:
mask = 0xffffffff;
2 * (a & mask) * (b & mask)
The most efficient way to do this looks like this:
u64 al = (u32)a; // Truncate
u64 bl = (u32)b; // Truncate
u64 x = al * bl; // 32->64 bits multiply
u64 2x = x << 1; // shift
return 2x;
My compiler doesn't pick up on this, and perform a slower alternative
instead. Either the multiply by two uses an actual multiply instead of a
shift, or the shift is done first, forcing a more expensive 64->64
multiply. More naive compilers may even do both.
Whatever the cause, I got 5% faster code on GCC 11.3.
Loup Vaillant [Wed, 22 Mar 2023 22:49:16 +0000 (23:49 +0100)]
Modify Blake2b context input to byte buffer
Though it requires a (safe because it's all aligned) cast at one point,
it makes the code simpler and significantly speeds up non-aligned
incremental hashes.
Surprisingly, foregoing word-by-word loading at the begining of the
update doesn't slow anything down, but forgoing it at the end *does*.
So while we align with block boundaries directly, we end up copying the
remaining words first, then the remaining bytes.
Loup Vaillant [Wed, 22 Mar 2023 21:51:22 +0000 (22:51 +0100)]
Rename align() to gap() to avoid confusion
The name "align" made readers believe it returns the next multiple,
while in fact it's returning how much we need to get there.
The name "gap" was suggested to me, and I haven't found better. A fully
descriptive name would likely be quite long, and wouldn't preclude the
need to look the definition up anyway. (And I suspect even now one could
guess from context.)
Loup Vaillant [Sun, 5 Mar 2023 10:25:07 +0000 (11:25 +0100)]
Split installation into 3 sub targets
Namely:
- `install-lib`: library files and headers.
- `install-doc`: man pages.
- `install-pc` : pkg-config file.
Some users don't want to install everything. Those who pull Monocypher
directly from git, despite strong suggestion to use a tarball instead,
can be especially annoyed at the mandoc dependency.
We could also split the library and includes, but this feels overkill.
Loup Vaillant [Mon, 27 Feb 2023 13:24:05 +0000 (14:24 +0100)]
Move relevant parts of tests/utils.c to tests/gen
The main point is to remove dead code from tests/utils.c, and have the
tarball be as clean as possible. Incidentally this also cleans up the
dependencies for test vector generation as well.
Loup Vaillant [Sun, 26 Feb 2023 10:10:48 +0000 (11:10 +0100)]
Generate docks with `make all`
That way the docs no longer have to belong to root when we generate
them. Some more dist.sh shenanigans were required to spare tarball
users the mandoc dependency. Mostly making sure their doc is not
deleted when they `make clean`.
Loup Vaillant [Sat, 25 Feb 2023 23:36:32 +0000 (00:36 +0100)]
Give myself some copyright info
Now I'm not exactly sure where the CAVEATS section comes from. It dates
back to 2017, but it uses wording that was previously in the manual
(which was originally all mine). It is possible, though I'm not sure,
that I'm misappropriating the work of @CuleX and/or @FScoto here.
Hopefully I'm not. Or maybe future historians will judge me.
Fabio Scotoni [Sat, 25 Feb 2023 10:13:24 +0000 (11:13 +0100)]
Resolve mandoc -Tlint nits
Several of them I have ignored intentionally:
1. unknown manual section comes with the definition;
2. unusual Xr order seems to have been an intentional decision
grouping by topic rather than alphabetically,
even if admittedly unusual;
3. no blank before trailing delimiter: Fa extern const
crypto_argon2_extras crypto_argon2_no_extras;
is another intentional choice,
with no reasonable markup existing for a global variable
declaration other than Bd.
Fabio Scotoni [Sat, 25 Feb 2023 09:59:41 +0000 (10:59 +0100)]
makefile: fix install-doc
During the documentation overhaul in 85a7c3742f06ab55fdf523a7a6a9cfe5cda09837,
generating the man page symlinks was automated
(introduction of doc/gen_doc.sh, now called doc/doc_gen.sh).
However, the install-doc target of the makefile fails to account for
the source folder not necessarily existing.
This change runs doc/doc_gen.sh before attempting to install
the man pages.
This has the questionable side effect of creating a folder as root and
creating a bunch of files as root (including the HTML files,
i.e. running mandoc as root) when doing sudo make install,
but the average user will "just" install and forget about it anyway.
Loup Vaillant [Sun, 19 Feb 2023 23:31:32 +0000 (00:31 +0100)]
Fixed ctgrind
Note that Argon2 reports a "use" of uninitialised value. It does not
appear to be an secret dependent branch, or even index, but I didn't
expect the warning.
Loup Vaillant [Fri, 10 Feb 2023 15:56:12 +0000 (16:56 +0100)]
Better doc integration
Now some checks must pass before we generate the docs:
- .Nm and .Fo names are identical.
- All .Fn names are referenced in .Nm and .Fo.
- All functions from the headers have an .Nm reference.
- No .Nm reference documents a non-existent function.
- No .Xr reference is dead.
In practice this allowed me to catch many stale references very easily.
Since those were basicaly mechanically caugth typos, I did not update
the date nor copyright information in the affected documents.
Loup Vaillant [Fri, 3 Feb 2023 17:18:58 +0000 (18:18 +0100)]
Improved SHA-512 speed on small inputs.
Not as much as BLAKE2b because SHA-512 core is slower, but still.
The cost here is about 10 lines of code, and an additional couple
dozen bytes in the binary.
Loup Vaillant [Thu, 2 Feb 2023 23:14:15 +0000 (00:14 +0100)]
Document BLAKE2b KDF, change BLAKE2b API *again*
And optimised `crypto_blake2b_update()` on short or unaligned inputs.
This also clarifies the source code somewhat, though it's now a bit more
verbose. That verbosity does give us over 15-30% speed on small updates
& small inputs typical of key derivation schemes.
---
The reason for the rework of the API is that the struct argument simply
does not work in practice. See, BLAKE2b has not one, but *two* typical
usages: regular hashing, and keyed hashing. There are many situations
where controlling the size of the hash is useful even when we don't use
the key, and when we do keyed hashing (for MAC or KDF), having to use
the struct is just *so* verbose and tedious.
I briefly considered having just one hash function to rule them all with
6 arguments, but regular hashing still is the main use case. So instead
of a struct or the original split, I simply have:
* `crypto_blake2b()` with a variable sized hash.
* `crypto_blake2b_keyed()` with every option.
and in practice, the 6 arguments of the keyed version are quite
manageable: output first as always, then the key, then the message.
And if arguments get too long the pointer/size pairs give us natural
line breaks.
I had to work on KDFs to realise this so thanks @samuel-lucas6 for
bringing the whole issue up.
---
Unlike SHA512, I did *not* add an explicit KDF interface for BLAKE2b.
My reasons being:
* There is no clear standard KDF for BLAKE2b except HKDF.
* HKDF is build on top of HMAC, and HMAC is stupid when BLAKE2b already
has a keyed mode that does the exact same thing with less code and
fewer CPU cycles.
* HKDF is kind of *a little* stupid itself.
* `crypto_blake2b_keyed()` already implements KDF extract.
* `crypto_chacha20_*()` already implement KDF expand.
* I hesitate to implement a non-standard convenience function users are
unlikely to get catastrophically wrong.
So instead I updated the manual, including 3 full KDF functions lazy
users can just blindly copy & paste.
Loup Vaillant [Wed, 1 Feb 2023 22:03:06 +0000 (23:03 +0100)]
Add HKDF SHA-512
In principle, we can imitate HKDF quite easily with SHA-512 alone.
Being fully RFC compliant however is fiddly and tedious, so for users
who want to do key derivation with SHA-512 in the most standard way
possible need dedicated functions.
Note the absence of a `crypto_sha512_hkdf_extract()` function, which
would be nothing more than an alias of `crypto_sha512_hmac()`. Aliases
to the equivalent incremental interface are also absent. There are pros
and cons to both the presence and absence of those aliases, I personally
prefer to leave them out.