Support exposure of hash algorithm digest to handle OIDC at_hash, potentially other spec extensions #314

@sirosen

#296 was closed with the suggestion that rather than special-case support for OIDC, PyJWT could support pluggable verifiers which handle additional, specialized claim validation behavior.

If there's a general market for verifiers -- or if they're going to be prepackaged and provided in this library as optional extensions -- that's a good avenue, but at present #295 is the only such issue.
Additionally, #258 runs directly counter to any extensible family of verifiers being hooked into PyJWT.

"Pluggable verifiers" therefore seem, at first blush, overwrought as a solution. What I want is for PyJWT to support external verification of a claim based on the following information:

  • content of an arbitrary claim in the payload (trivially easy)
  • a digest from the hash algorithm specified in the JWT header (hard right now; it should be easy)

Client code should be able to extract this info without having to worry about how PyJWT behaves internally when cryptography is present.

The heart of the #296 implementation is really quite short:

        alg_obj = self.get_algo_by_name(algorithm)  # self is a PyJWT object
        hash_alg = alg_obj.hash_alg

        def get_digest(bytestr):
            if has_crypto and (
                    isinstance(hash_alg, type) and
                    issubclass(hash_alg, hashes.HashAlgorithm)):
                digest = hashes.Hash(hash_alg(), backend=default_backend())
                digest.update(bytestr)
                return digest.finalize()
            else:
                return hash_alg(bytestr).digest()

where get_algo_by_name is PyJWT's existing algorithm lookup abstracted into a function.
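For readers who want to try the idea without PyJWT internals or cryptography installed, here is a minimal standard-library sketch. It is not PyJWT code: `HASH_BY_ALG` and this standalone `compute_hash_digest(alg, bytestr)` are hypothetical names, and the table assumes only the HS/RS/ES/PS families, where the numeric suffix of `alg` selects the SHA-2 variant.

```python
import hashlib

# Hypothetical table, not part of PyJWT: JWA "alg" names whose numeric suffix
# selects the SHA-2 variant (HS*, RS*, ES*, PS* families).
_HASHES = {"256": hashlib.sha256, "384": hashlib.sha384, "512": hashlib.sha512}
HASH_BY_ALG = {
    prefix + bits: hash_fn
    for prefix in ("HS", "RS", "ES", "PS")
    for bits, hash_fn in _HASHES.items()
}


def compute_hash_digest(alg, bytestr):
    """Return the raw digest of bytestr under the hash implied by alg."""
    if alg not in HASH_BY_ALG:
        raise ValueError("no hash algorithm known for alg=%r" % (alg,))
    return HASH_BY_ALG[alg](bytestr).digest()
```

The caller only supplies the `alg` value from the JWT header and the bytes to hash; which backend produces the digest stays hidden, which is the property asked for above.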

Having this hooked into the PyJWT.decode() call is not in any way necessary -- an external method like PyJWT.compute_hash_digest(<string>) would satisfy this just fine.

OIDC Client code would then look like this:

import base64

def verify_at_hash(decoded_id_token, access_token):
    digest = decoded_id_token.compute_hash_digest(access_token.encode('utf-8'))
    # at_hash: base64url of the left half of the digest, padding stripped (bytes, hence b'=')
    computed_at_hash = base64.urlsafe_b64encode(digest[:(len(digest) // 2)]).rstrip(b'=').decode('utf-8')
    assert decoded_id_token['at_hash'] == computed_at_hash

decoded = jwt.decode(...)
verify_at_hash(decoded, access_token)
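The at_hash rule itself can be exercised in isolation. The sketch below assumes the caller already knows which hash the header's `alg` implies and passes it in as `hash_fn`; `compute_at_hash` and `at_hash_matches` are hypothetical helper names, not PyJWT functions. It returns a boolean rather than using `assert`, since assertions are stripped under `python -O`.

```python
import base64
import hashlib
import hmac


def compute_at_hash(access_token, hash_fn=hashlib.sha256):
    """base64url (no padding) of the left half of the access token's digest."""
    digest = hash_fn(access_token.encode("utf-8")).digest()
    left_half = digest[: len(digest) // 2]
    return base64.urlsafe_b64encode(left_half).rstrip(b"=").decode("utf-8")


def at_hash_matches(claimed_at_hash, access_token, hash_fn=hashlib.sha256):
    """Compare the claimed at_hash with the computed one in constant time."""
    expected = compute_at_hash(access_token, hash_fn)
    return hmac.compare_digest(
        claimed_at_hash.encode("utf-8"), expected.encode("utf-8")
    )
```

The hard-coded `hash_fn` default is exactly the guess that an exposed `compute_hash_digest` would make unnecessary.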

However, there's a significant, nasty wrinkle: PyJWT.decode() only returns the payload and discards the header.

One of my initial inclinations is to make the payload a subclass of dict with additional attributes for signing_input, header, signature -- the other things returned from PyJWT._load() -- and any useful helper methods (in this case, compute_hash_digest).
This solution reverses the decision by PyJWT.decode() to discard potentially useful information. In fact, the discarded header is the critical information whose absence makes it presently impossible to verify the OIDC at_hash with PyJWT.
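A rough sketch of what such a dict subclass might look like follows. Everything here is hypothetical: the class name `DecodedToken`, its constructor signature, and the suffix-based hash lookup are illustrative assumptions, not an existing PyJWT type.

```python
import hashlib

# Assumption for this sketch: only SHA-2 based "alg" values (e.g. HS256, RS384).
_HASH_BY_SUFFIX = {"256": hashlib.sha256, "384": hashlib.sha384, "512": hashlib.sha512}


class DecodedToken(dict):
    """Payload dict that also carries the other parts of the JWT (hypothetical)."""

    def __init__(self, payload, header, signing_input, signature):
        super().__init__(payload)
        self.header = header
        self.signing_input = signing_input
        self.signature = signature

    def compute_hash_digest(self, bytestr):
        """Digest bytestr with the hash implied by the header's alg."""
        alg = self.header.get("alg", "")
        try:
            hash_fn = _HASH_BY_SUFFIX[alg[-3:]]
        except KeyError:
            raise ValueError("no hash algorithm known for alg=%r" % (alg,))
        return hash_fn(bytestr).digest()
```

Because the object is still a dict, existing code that does `decoded['sub']` or iterates over claims keeps working, while new code can reach `decoded.header` and `decoded.compute_hash_digest(...)`.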

If that's unattractive, the big alternative is to add jwt.compute_hash_digest(header, <input_string>) and to give PyJWT.decode an argument like verification_callbacks=..., an iterable of callables that consume payload, header, signing_input, and signature as inputs.
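To make the callback alternative concrete, here is a small sketch of how the hook could be driven. `run_verification_callbacks` stands in for whatever PyJWT.decode would do internally and `require_at_hash` is an example client callback; both names are hypothetical, and raising an exception to signal failure is an assumption of this sketch.

```python
def run_verification_callbacks(payload, header, signing_input, signature,
                               verification_callbacks=()):
    """Call each callback with the token parts as keyword arguments."""
    for callback in verification_callbacks:
        # A callback signals failure by raising; its return value is ignored.
        callback(payload=payload, header=header,
                 signing_input=signing_input, signature=signature)
    return payload


def require_at_hash(payload, header, **kwargs):
    """Example client callback: insist that the at_hash claim is present."""
    if "at_hash" not in payload:
        raise ValueError("at_hash claim is missing")
```

Passing everything by keyword lets a callback name only the parts it needs and absorb the rest with `**kwargs`.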

@mark-adams, which of these directions for PyJWT is more appealing to you? Do you see issues with exposing compute_hash_digest in one form or another?

Subclassing dict may sound messier at first, but it results in less code for PyJWT to maintain and doesn't expand the public API surface significantly vs verification_callbacks.
It also has a nice benefit over verification_callbacks: adding attributes to an object in the future, if necessary, won't break most sane usage. By contrast, if people write their verification_callbacks without accepting **kwargs (which would be documented as incorrect, but people would still do it), then passing more information into those callbacks would break those clients' usage.
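The breakage described here is easy to reproduce. In this sketch, `key_id` is a made-up extra argument standing in for whatever a later release might pass, and all three function names are hypothetical.

```python
def strict_callback(payload, header, signing_input, signature):
    """Written without **kwargs: accepts exactly today's four arguments."""
    return True


def tolerant_callback(payload, header, **kwargs):
    """Written with **kwargs: ignores arguments it does not know about."""
    return True


def survives_new_argument(callback):
    """Simulate a future release that passes one extra keyword argument."""
    try:
        callback(payload={}, header={}, signing_input=b"", signature=b"",
                 key_id=None)  # key_id is hypothetical
    except TypeError:
        # Python rejects the unexpected keyword before the body even runs.
        return False
    return True
```

`survives_new_argument(strict_callback)` is False while `survives_new_argument(tolerant_callback)` is True; an attribute added to a returned object has no comparable failure mode.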

Labels: stale (issues without activity for more than 60 days)
