Skip to content

Review some of the existing character control fields for serialize/deserlization consistency and validation #2714

@derekpierre

Description

@derekpierre

Follow-up issue related to work done for #2713 .

Related to #2702 .

Some of the field definitions for the character control mechanism are inconsistent when comparing serialization/deserialization.

A demonstrative example would be the UmbralSignature field.

class UmbralSignature(BaseField, fields.Field):

    def _serialize(self, value: Signature, attr, obj, **kwargs):
        return b64encode(bytes(value)).decode()

    def _deserialize(self, value, attr, data, **kwargs):
        if isinstance(value, bytes):
            return value
        try:
            return Signature.from_bytes(b64decode(value))
        except InvalidNativeDataTypes as e:
            raise InvalidInputData(f"Could not parse {self.name}: {e}")

    def _validate(self, value):
        try:
            Signature.from_bytes(value)
        except InvalidNativeDataTypes as e:
            raise InvalidInputData(f"Could not parse {self.name}: {e}")

Within the marshmallow.fields library that we use for our fields, the result of the _deserialize function gets passed to the _validate function. In this case either bytes or an actual Signature object can be returned by _deserialize, but _validate seems to only care about bytes and would fail validation if a Signature object was passed to it.

Also, why doesn't _deserialize not just return a Signature object, why does it need to be either bytes or Signature? I suspect it was working within a previously written python API to work, so it was required to work with legacy code.

For example it could be made to instead be:

class UmbralSignature(BaseField, fields.Field):

   def _serialize(self, value: Signature, attr, obj, **kwargs):
       return b64encode(bytes(value)).decode()

   def _deserialize(self, value, attr, data, **kwargs):
       if not isinstance(value, bytes):
           value = b64decode(value)

       try:
           return Signature.from_bytes(value)
       except InvalidNativeDataTypes as e:
           raise InvalidInputData(f"Could not parse {self.name}: {e}")

Another inconsistency is the Cleartext field:

class Cleartext(BaseField, fields.String):

    def _serialize(self, value, attr, data, **kwargs):
        return value.decode()

    def _deserialize(self, value, attr, data, **kwargs):
        return b64encode(bytes(value, encoding='utf-8')).decode()

The _serialize and _deserialize functions are not opposites i.e. the output of serialize can't be passed to deserialize which is a bit strange.

There may be other inconsistencies across the character control that should be investigated and addressed if need be. This may break tests, but may be worth it.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Bug 🐛Broken functionalityCLIThis effects the nucypher CLIWebWebpages

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions