Skip to content

Remove MDx_HashFunction#3550

Closed
randombit wants to merge 1 commit intomasterfrom
jack/remove-mdx-hash
Closed

Remove MDx_HashFunction#3550
randombit wants to merge 1 commit intomasterfrom
jack/remove-mdx-hash

Conversation

@randombit
Copy link
Copy Markdown
Owner

No description provided.

@randombit randombit force-pushed the jack/remove-mdx-hash branch from 601b85b to 0404f62 Compare May 17, 2023 21:46
@coveralls
Copy link
Copy Markdown

coveralls commented May 17, 2023

Coverage Status

Coverage: 91.723% (+0.04%) from 91.688% when pulling efc3a2c on jack/remove-mdx-hash into 8bedd1d on master.

@randombit randombit force-pushed the jack/remove-mdx-hash branch from 0404f62 to efc3a2c Compare May 17, 2023 23:36
Copy link
Copy Markdown
Collaborator

@reneme reneme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this example to help moving the Keccak discussion forward. I believe that makes your point clearer. To my mind the concept of Keccak_c and MDx_Hash is very similar.

Nevertheless, I'm still trying to wrap my head around why this inheritance was a problem. Is there a technical reason that I'm still missing?

For the sake of discussion, let me use this PR as a base for an alternative suggestion. Don't get me wrong: I'm fine with it either way. I just got hooked by the design space and would like to get to the bottom of it. 😇

Comment on lines +20 to +28
template<MD_Endian ENDIAN,
typename DIGEST_T,
size_t DIGEST_ELEM,
void init_fn(DIGEST_T[DIGEST_ELEM]),
void compress_fn(DIGEST_T[DIGEST_ELEM], const uint8_t[], size_t),
size_t BLOCK_BYTES = 64,
size_t DIGEST_LENGTH = DIGEST_ELEM * sizeof(DIGEST_T),
size_t CTR_BYTES = 8>
class MD_Hash final
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me suggest an alternative design using C++20 concepts and a CRTP-ish structure: Instead of this long list of parameters, we could write a concept to tie down the parameters each MDx-hash needs to define (as static constexpr members). Then, the instantiation of MDx_Hash in the concrete implementations shrinks to: MDx_Hash<MD5> m_md;:

template <typename T>
concept mdx_hash_implementation =
    requires(T::digest_type digest, uint8_t input[], size_t blocks, MD_Endian endian) {
        typename T::digest_type;
        T::endianness;
        T::block_bytes;
        T::ctr_bytes;
        T::init(digest);
        T::compress_n(digest, input, blocks);
    } &&
    T::block_bytes >= 64 && is_power_of_2(T::block_bytes) &&
    T::ctr_bytes >= 8 && is_power_of_2(T::ctr_bytes) &&
    T::ctr_bytes < T::block_bytes &&
    sizeof(typename T::digest_type) >= 16;
    

template <mdx_hash_implementation T>
class MDx
{
    std::array<uint8_t, T::block_bytes> m_buffer;
    T::digest_type m_digest;
    uint64_t m_count;
    size_t m_position;
};

class MD5
{
public:
    using digest_type = std::array<uint32_t, 4>;
    static constexpr MD_Endian endianness = MD_Endian::Little;
    static constexpr size_t block_bytes = 64;
    static constexpr size_t ctr_bytes = 8;
    
public:
    static void init(digest_type& digest);
    static void compress_n(digest_type& digest, const uint8_t input[], size_t blocks);

private:
    MDx<MD5> m_md;
};

That surely is a bit more boilerplate in the concrete implementations. But I see a few advantages:

  • It lends itself from the usual STL members: e.g. std::vector<>::value_type
  • Readability: Instead of a long list of magic numbers in the MDx_Hash instantiation, each parameter has a speaking name
  • DRY: Parameter definitions can be referred to in the implementation classes (if needed) -> less implicit knowledge: e.g. digest_type encapsulates the digest_length and digest_element_type and the init()/compress_n() methods don't need to repeat it
  • Parameter constraints are declared as a concept, hopefully resulting in better compiler errors

Certainly this can be further refined.

Godbolt: https://godbolt.org/z/1h1nzcYoh

Comment on lines +110 to 133
void MD5::add_data(const uint8_t input[], size_t length)
{
m_md.add_data(input, length);
}

void MD5::final_result(uint8_t output[])
{
m_md.final_result(output);
}

/*
* Clear memory of sensitive data
*/
void MD5::clear()
{
MDx_HashFunction::clear();
zeroise(m_M);
m_digest[0] = 0x67452301;
m_digest[1] = 0xEFCDAB89;
m_digest[2] = 0x98BADCFE;
m_digest[3] = 0x10325476;
m_md.clear();
}

std::unique_ptr<HashFunction> MD5::new_object() const
{
return std::make_unique<MD5>();
}

std::unique_ptr<HashFunction> MD5::copy_state() const
{
return std::make_unique<MD5>(*this);
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that each MDx hash inherits straight from HashFunction, these "plumbing methods" are repeated for each class. From what I can see, they all simply forward to their respective m_md or create a copy of themselves. Its just copy-and-paste boilerplate, but it does bug me somewhat.

With the full CRTP-approach I suggested earlier this would go away. At the expense of inheriting from an MDx template.

Comment on lines +81 to +82
MD_Hash<MD_Endian::Big, uint64_t, 16,
SHA_384::init, SHA_512::compress_digest, 128, 48, 16> m_md;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that correct? The compress and init methods take [8] elements.

Suggested change
MD_Hash<MD_Endian::Big, uint64_t, 16,
SHA_384::init, SHA_512::compress_digest, 128, 48, 16> m_md;
MD_Hash<MD_Endian::Big, uint64_t, 8,
SHA_384::init, SHA_512::compress_digest, 128, 48, 16> m_md;

Do I misunderstand something or is that a copy-and-paste error? (also for the SHA-flavours in this file)

@reneme
Copy link
Copy Markdown
Collaborator

reneme commented May 19, 2023

For reference as promised: #3553. I've taken this patch as a starting point and moved common stuff into a CRTP base class. What do you think?

@reneme
Copy link
Copy Markdown
Collaborator

reneme commented Sep 5, 2023

This needs some consideration:

  1. clang-format 😨
  2. Interdependence with std::span for Buffered_Computation #3681
  3. Consideration of Remove MDx_HashFunction #3550 (comment)

I'm willing to take this over. But would suggest to put it on hold until #3681 is merged. Edit: Let's also wait for #3693.

@reneme
Copy link
Copy Markdown
Collaborator

reneme commented Sep 18, 2023

I'm willing to take this over. But would suggest to put it on hold until #3681 is merged

I opened a follow-up PR (#3705).

@randombit
Copy link
Copy Markdown
Owner Author

Closing in favor of #3705

@randombit randombit closed this Sep 29, 2023
@randombit randombit deleted the jack/remove-mdx-hash branch October 6, 2023 21:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants