From: Loup Vaillant
Date: Fri, 22 Mar 2019 20:52:30 +0000 (+0100)
Subject: Optimised Poly1305 loading code
X-Git-Url: https://git.codecow.com/?a=commitdiff_plain;h=4635859c4c75fcfdf652491375cee96216df7170;p=Monocypher.git

Optimised Poly1305 loading code

By actually *rolling* the loading code. I haven't looked at the assembly,
but I suspect the loop is easier for the compiler to vectorise. This results
in a 5% speed increase on my machine (Intel i5 Skylake laptop, gcc 7.3.0).

This fix was made possible by @Sadoon-AlBader on GitHub, who submitted
pull request #118
---

diff --git a/src/monocypher.c b/src/monocypher.c
index 55a0d4c..efa04f3 100644
--- a/src/monocypher.c
+++ b/src/monocypher.c
@@ -388,10 +388,9 @@ void crypto_poly1305_update(crypto_poly1305_ctx *ctx,
     // Process the message block by block
     size_t nb_blocks = message_size >> 4;
     FOR (i, 0, nb_blocks) {
-        ctx->c[0] = load32_le(message + 0);
-        ctx->c[1] = load32_le(message + 4);
-        ctx->c[2] = load32_le(message + 8);
-        ctx->c[3] = load32_le(message + 12);
+        FOR (i, 0, 4) {
+            ctx->c[i] = load32_le(message + i*4);
+        }
         poly_block(ctx);
         message += 16;
     }
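
The change above swaps four explicit loads for a four-iteration loop. As a rough, standalone sketch of why the two forms are interchangeable, the snippet below loads one 16-byte block both ways. The `load32_le` here is a stand-in written for illustration (an assumed little-endian 32-bit load, not copied from monocypher.c), and `load_block_unrolled` / `load_block_rolled` are hypothetical helper names that do not exist in the library.

```c
#include <stddef.h>
#include <stdint.h>

// Illustrative stand-in: read 4 bytes as a little-endian 32-bit word.
// Assumed behaviour; not the library's own definition.
uint32_t load32_le(const uint8_t s[4])
{
    return  (uint32_t)s[0]
        | ((uint32_t)s[1] <<  8)
        | ((uint32_t)s[2] << 16)
        | ((uint32_t)s[3] << 24);
}

// Hypothetical helper: the unrolled form, as on the removed lines.
void load_block_unrolled(uint32_t c[4], const uint8_t block[16])
{
    c[0] = load32_le(block +  0);
    c[1] = load32_le(block +  4);
    c[2] = load32_le(block +  8);
    c[3] = load32_le(block + 12);
}

// Hypothetical helper: the rolled form, as on the added lines,
// written with a plain for loop instead of the FOR macro.
void load_block_rolled(uint32_t c[4], const uint8_t block[16])
{
    for (size_t i = 0; i < 4; i++) {
        c[i] = load32_le(block + i*4);
    }
}
```

Both helpers fill `c` with the same four words for any 16-byte input; whether the rolled form is actually faster depends on the compiler and target, as the commit message itself hedges.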