Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

An implementation of the Chinese SM3 hash algorithm #4616

Closed
wants to merge 2 commits into from

Conversation

ronaldtse
Copy link
Contributor

@ronaldtse ronaldtse commented Oct 30, 2017

This PR introduces the Chinese SM3 hash algorithm, which produces a 256-bit digest.

This was ported from Ribose's contribution to Botan, and contains some minor changes to make its performance on par with Botan's SM3 implementation.

Currently openssl speed -evp sm3 reports:

~270 MB/s on @randombit 's machine
~250 MB/s on @ronaldtse 's macOS Macbook Pro 3.1 GHz
This is a contribution by Ribose Inc. and was done with knowledge of @richsalz, @levitte, @InfoHunter.

Questions:

  • Currently there is a doc/man3/SM3.pod, but the non-EVP interface is discouraged. Should it be moved to doc/man3/EVP_sm3.pod (and hence the breakup of the digests pod files)?
  • Most hash algorithms don't seem to have a test/recipe/ except HMAC. Do we need one?

cc: @randombit @dewyatt @erikbor

This is a contribution from Ribose Inc..

Checklist
  • documentation is added or updated
  • tests are added or updated

@ronaldtse ronaldtse changed the title Add sm3 hash An implementation of the Chinese SM3 hash algorithm Oct 30, 2017
@paulidale
Copy link
Contributor

For the first question, I suspect the consensus will be that the low level calls SM3_Init, SM3_Update, SM3_Final and SM3 shouldn't be exposed. This means no need for the SM3.pod file, the sm3.h should be internal and the functions renamed sm3_init, sm3_update and sm3_final and the SM3 function removed.

For the second question, there isn't a need for a standalone test for SM3. The tests in evpdigest.txt should cover it properly (& look like they do).

@@ -0,0 +1,55 @@
/*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file could be placed in crypto/sm3 just as well. As a matter of fact, I'd like to encourage that. Have a look at crypto/blake2/ for inspiration.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree.

@levitte
Copy link
Member

levitte commented Oct 30, 2017

I agree with @paulidale, making an EVP only interface has been the move for some time now.

@levitte
Copy link
Member

levitte commented Oct 30, 2017

Regarding lowercasing the function names, I'm not sure that's an absolute necessity. We haven't been picky about that before (see SM4... or Blake2, for that matter)...

@paulidale
Copy link
Contributor

paulidale commented Oct 30, 2017

I'm for lower casing them moving forwards. Upper case lead ins to names are generally exported.

We possibly should rework the non-exported upper case lead in names to be lowercase. I might get some time for this later in the week.

I've also a scheme to move to an EVP only interface -- this means replacing the historic direct calls with wrappers that use the EVP interface. It also renaming internal uses of these functions and (perhaps) the structures that are used. I.e. somewhat invasive :(

@paulidale
Copy link
Contributor

Out of interest, are there going to be submissions for SM2 and SM9 as well? The rest of the SM family are unpublished/untranslated as far as I'm aware.

@levitte
Copy link
Member

levitte commented Oct 30, 2017

I've also a scheme to move to an EVP only interface -- this means replacing the historic direct calls with wrappers that use the EVP interface. It also renaming internal uses of these functions and (perhaps) the structures that are used. I.e. somewhat invasive :(

I argued the exact same thing years ago, and personally encourage it. You need to know, though, that some are concerned with speed and how that will be affected by this move (that's the reason I never went ahead with this)

@ronaldtse
Copy link
Contributor Author

@paulidale @levitte restructured to internal EVP now, using lowercase function names.

Yes @paulidale we will be contributing SM2 as well, as @richsalz , @levitte and @InfoHunter already know. We haven't done SM9 but will see what happens!

D = TT1; \
F = ROTATE(F, 19); \
H = P0(TT2); \
} while(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whole point behind do { } while(0) in macro is to not end it with ;.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, introduced while converting this from an inline to a macro

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed.

const SM3_WORD A12_SM = A12 + E + TJ; \
const SM3_WORD SS1 = ROTATE(A12_SM, 7); \
const SM3_WORD TT1 = FF(A, B, C) + D + (SS1 ^ A12) + (Wj); \
const SM3_WORD TT2 = GG(E, F, G) + H + SS1 + Wi; \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's with const qualifiers? With variables one would customarily use it with assignment that can be evaluated at compile time, but here it makes no sense and asks for [unnecessary] question "why" ;-)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the idiom used in the originating C++ code in Botan (any variable that is not mutated is marked const so it is clear to the reader no later modifications occur, without having to scan all later uses to verify this.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the spirit of moving ahead, @dot-asm would you want us to remove the const?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use const in contexts like these as an aid to prevent programmer error -- the compiler catches attempts to assign to const variables by mistake. Not so much for when I first write the code but for the developer who gets to maintain it ten years later.

@dot-asm
Copy link
Contributor

dot-asm commented Oct 30, 2017

Regarding lowercasing the function names, I'm not sure that's an absolute necessity. We haven't been picky about that before (see SM4... or Blake2, for that matter)...

It has lesser to do export-ability, but rather with how algorithm is referred to in general. With all capitals customarily denoting abbreviations.

@dot-asm
Copy link
Contributor

dot-asm commented Oct 30, 2017

I've also a scheme to move to an EVP only interface -- this means replacing the historic direct calls with wrappers that use the EVP interface.

I'd say that this effectively encourages application developers to not rewrite their code, and I'd view it as negative thing. Or in other words direct calls should be simply privatized in next major release without any "convenience" patches. I for one would even take it step further and discourage application developers from using even things like EVP_sm3() (and by extension EVP_sha1()). Instead application should look it up by nid or symbolic name at run time and act accordingly. [Well, one might find it appropriate to compromise and say that those EVP_algo() that can't be disabled could be exported, but those which can be disabled shouldn't really.] This has everything to do with idea of "universal" headers. And just in case it wasn't clear, this is something that could be done only in next major release.

@kaduk
Copy link
Contributor

kaduk commented Oct 30, 2017

this is something that could be done only in next major release.

Do we have a wiki page or something where we are keeping track of major changes we want to make in the next major release?

@mattcaswell
Copy link
Member

Do we have a wiki page or something where we are keeping track of major changes we want to make in the next major release?

The closest thing we have AFAIK is the "1.2.0" label. There aren't any issues using that label, but there are a number of PRs that have that against them. We should probably create issues when we identifying this like this and assign the "1.2.0" label.


#include <openssl/evp.h>

const EVP_MD *EVP_digestname(void)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doc-nits complains about this missing in the NAME section. In this case, it's just one name, so I think it would be better to simply have EVP_sm3 instead of EVP_digestname here...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed.

doc/man3/SM3.pod Outdated
@@ -0,0 +1,76 @@
=pod
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weren't you going to remove this file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, it's removed now. Just a reminder that with this change, now each EVP hash algorithm is going to need its own man file 😉

@paulidale
Copy link
Contributor

Were folks aware of http://gmssl.org/ ?
Some assembly versions are provided for i64.

@richsalz
Copy link
Contributor

Yeah, I took a look at it. Based on some advice I got during our recent China trip, it didn't get more than a cursory look.

@ronaldtse
Copy link
Contributor Author

@levitte I've removed all references of SM3 from EVP_DigestInit.pod for doc-nits to stop complaining.

INSTALL Outdated
synonymous with rmd160.
rc2, rc4, rmd160, scrypt, seed, siphash, sm3, sm4 or
whirlpool. The "ripemd" algorithm is deprecated and if used
is synonymous with rmd160.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was the indentation changed? We prefer all spaces indentation as tabs can be set different. You obviously use 4-column tabs ;-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Argh it was from a merge conflict between sm3 and sm4 😉 Thank you for this.

@levitte
Copy link
Member

levitte commented Oct 31, 2017

Also, I'm curious, how is GM/T 0004-2012 related to this?

@ronaldtse
Copy link
Contributor Author

The SM3 hash algorithm has been specified in multiple documents of differing importance.

The first public document of SM3 was published by OSCCA (State Cryptography Office) as a "Cryptographic Standard" GM/T 0004-2012. It was "upgraded" to a "China National Standard" by the SAC and TC 260 (China Standards Body) in GB/T 32905-2016 with some minor changes (to documentation, not to the algorithm).

@levitte
Copy link
Member

levitte commented Oct 31, 2017

Ok. It's curious in my view, but then I'm used to only refering to the last RFC we conform to, and that RFC having back references to older RFCs on the same matter. Either way, I'm not asking you to change your references, only to satisfy my curiousity... which you have now done, thank you :-)

@ronaldtse
Copy link
Contributor Author

It would be sad if no one was curious on something I found interesting, so thank you 😉

@levitte
Copy link
Member

levitte commented Oct 31, 2017

If there is an english version of GB/T 32905-2016, I'd love to read it. All I've been able to find is old IETF drafts, and I'd like to understand a bit better what corresponds to what

@ronaldtse
Copy link
Contributor Author

@levitte Is there anything further we can do to move this ahead? Thanks!

Copy link
Contributor

@paulidale paulidale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor nits only.

# error SM3 is disabled.
# endif

# ifdef __cplusplus
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do wonder if this is needed for an internal header file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed.


#include <openssl/evp.h>
#include "internal/evp_int.h"
#include "internal/sm3.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A space between the # and include for these.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed.

crypto/sm3/sm3.c Outdated

#ifndef OPENSSL_NO_SM3

#include <openssl/e_os2.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add spaces to these two too.

# endif

#define SM3_DIGEST_LENGTH 32
#define SM3_WORD unsigned int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While we're picking nits, there two need a space between # and defined

@ronaldtse
Copy link
Contributor Author

Will get the nits ironed out soon!

@ronaldtse
Copy link
Contributor Author

@levitte @paulidale the nits have been addressed. Please feel free to let me know if there’s anything further to be done. Thanks!

Copy link
Contributor

@paulidale paulidale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy.

Copy link
Member

@levitte levitte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out that you need to do another make update. You will see fuzz/oids.txt get an update.

.travis.yml Outdated
@@ -168,7 +168,7 @@ script:
else
echo -e '+\057 MAKE UPDATE FAILED'; false;
fi;
git diff --quiet
git diff --name-only
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove this now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the tip! Done.

Copy link
Member

@levitte levitte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

Are you fine with us squashing (and editing your commit message) or do you prefer to do it yourself?

@ronaldtse
Copy link
Contributor Author

@levitte do you want to squash into one commit? I'm happy do so.

@ronaldtse
Copy link
Contributor Author

@levitte squashed to 2 commits now and rebased to latest master. Good to merge?

@levitte
Copy link
Member

levitte commented Nov 4, 2017

Looks good enough for me.

@paulidale
Copy link
Contributor

Need a rebase to master before merging.

Jack Lloyd and others added 2 commits November 6, 2017 07:21
SM3 is a secure hash function which is part of the Chinese
"Commercial Cryptography" suite of algorithms which use is
required for certain commercial applications in China.
@ronaldtse
Copy link
Contributor Author

@paulidale all done!

levitte pushed a commit that referenced this pull request Nov 6, 2017
SM3 is a secure hash function which is part of the Chinese
"Commercial Cryptography" suite of algorithms which use is
required for certain commercial applications in China.

Reviewed-by: Paul Dale <paul.dale@oracle.com>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from #4616)
levitte pushed a commit that referenced this pull request Nov 6, 2017
Reviewed-by: Paul Dale <paul.dale@oracle.com>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from #4616)
@paulidale
Copy link
Contributor

Merged, thanks.

@paulidale paulidale closed this Nov 6, 2017
@ronaldtse
Copy link
Contributor Author

Thank you @paulidale @levitte @mattcaswell @dot-asm and @richsalz !

@paulidale
Copy link
Contributor

Something the reviewers missed which I've added for SM3 and SM4 in #4678 but should go in for SM2 when you make that PR.

Things that can be disabled should be added to the openssl list -disabled command.

@ronaldtse
Copy link
Contributor Author

Indeed we will keep this in mind. Thanks!

busterb pushed a commit to libressl/fuzz that referenced this pull request Jan 23, 2019
Reviewed-by: Paul Dale <paul.dale@oracle.com>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from openssl/openssl#4616)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants