Skip to content

Conversation

@s-block
Copy link
Contributor

@s-block s-block commented May 5, 2015

Wanted to be able to extend the session serializer class without duplicating all the code for object hook - as with the _tag function above.

@ThiefMaster
Copy link
Member

There's a typo in your commit message ("opbject").

@ThiefMaster
Copy link
Member

Wouldn't it be better to make it a static method of TaggedJSONSerializer? That way you can simply subclass it instead of having to rewrite the dumps/loads methods of that class or monkeypatch.

In the same go, it'd be nice to make those two methods static, too, and use the class itself instead of an instance of it. Those methods never access self anyway...

@s-block
Copy link
Contributor Author

s-block commented Jun 19, 2015

Yeah I agree. I will try and do this today.

@ThiefMaster
Copy link
Member

While you're at it, maybe #1511 could be interesting to fix at the same time.

@ThiefMaster
Copy link
Member

I'd move the untagging code into a method, too. It's a bit pointless to be able to override it for only one direction

@ThiefMaster
Copy link
Member

This is what I'd do btw (started working on it earlier today before I remembered that there's already your PR).

class TaggedJSONSerializer(object):
    """A customized JSON serializer that supports a few extra types that
    we take for granted when serializing (tuples, markup objects, datetime).
    """

    def dumps(self, value):
        return json.dumps(self._tag(value), separators=(',', ':'))

    def loads(self, value):
        def object_hook(obj):
            if len(obj) != 1:
                return obj
            the_key, the_value = next(iteritems(obj))
            rv = self._untag(the_key, the_value)
            return obj if rv is None else rv
        return json.loads(value, object_hook=object_hook)

    def _untag(self, key, value):
        if key == ' t':
            return tuple(value)
        elif key == ' u':
            return uuid.UUID(value)
        elif key == ' b':
            return b64decode(value)
        elif key == ' m':
            return Markup(value)
        elif key == ' d':
            return parse_date(value)
        return None

    def _tag(self, value):
        if isinstance(value, tuple):
            return {' t': [self._tag(x) for x in value]}
        elif isinstance(value, uuid.UUID):
            return {' u': value.hex}
        elif isinstance(value, bytes):
            return {' b': b64encode(value).decode('ascii')}
        elif callable(getattr(value, '__html__', None)):
            return {' m': text_type(value.__html__())}
        elif isinstance(value, list):
            return [self._tag(x) for x in value]
        elif isinstance(value, datetime):
            return {' d': http_date(value)}
        elif isinstance(value, dict):
            return dict((k, self._tag(v)) for k, v in iteritems(value))
        elif isinstance(value, str):
            try:
                return text_type(value)
            except UnicodeError:
                from flask.debughelpers import UnexpectedUnicodeError
                raise UnexpectedUnicodeError(u'A byte string with '
                    u'non-ASCII data was passed to the session system '
                    u'which can only store unicode strings.  Consider '
                    u'base64 encoding your string (String was %r)' % value)
        return value

@s-block s-block force-pushed the master branch 4 times, most recently from d1b7c08 to 616d66f Compare July 9, 2015 16:34
@s-block
Copy link
Contributor Author

s-block commented Jul 10, 2015

I have added a fix for issue #1511 as well as making the class extendible, could someone have a look and see what you think?

@ThiefMaster
Copy link
Member

Looks quite complex (don't have time to thoroughly review it right now). Also, it adds a list of valid tags at yet another place, making it harder to extend... maybe it'd be worth refactoring the whole thing to change the way the conversions are defined?

conversions = {
    't': {'check': lambda value: isinstance(value, tuple),
          'tag': lambda value: [self._tag(x) for x in value]},
          'untag': lambda value: tuple(value)},
    ...
}

or something similar

@s-block s-block changed the title Move opbject_hook outside loads method so class can be extend and reused Move object_hook outside loads method so class can be extend and reused Jul 13, 2015
@s-block
Copy link
Contributor Author

s-block commented Jul 17, 2015

So I have done something along the lines of your suggestion, conversions couldn't be a dict as it needs to retain order of checks, OrderedDict wasn't in python until 2.7 so tests would fail so I have made a list.

Also some tag/untag's couldn't be lambdas so had to declare self.conversions in init to access self._custom_tag/untag methods

Any thoughts? Looks like more code but will be much easier to extend:

def __init__(self):
    super(TaggedJSONSerializer, self).__init__()
    self.conversions.append(...)

)

def _is_dict_with_used_key(self, v):
return isinstance(v, dict) and len(v) is 1 and list(v)[0] in self.keys

This comment was marked as off-topic.

@RonnyPfannschmidt
Copy link
Contributor

did anyone check the performance on this? compared to current methods and #2093

@s-block
Copy link
Contributor Author

s-block commented Feb 23, 2017

I did a quick performance test on the session and the proposed code is slightly slower but not by much.

The test code was:

import cProfile
from datetime import datetime
import uuid

import flask
from flask.sessions import TaggedJSONSerializer


def test_session_performance():
    now = datetime.utcnow().replace(microsecond=0)
    the_uuid = uuid.uuid4()

    data = [
        (1, 2, 3),
        the_uuid,
        b'\xff',
        flask.Markup('Hello!'),
        now,
    ]

    cycles = 10000

    session = TaggedJSONSerializer()

    for d in data * cycles:
        dumped = session.dumps(d)
        session.loads(dumped)


if __name__ == '__main__':
    cProfile.run('test_session_performance()')

Latest flask master branch:
4.319 seconds
4.409 seconds
4.455 seconds

This branch:
4.891 seconds
4.913 seconds
4.957 seconds

@s-block
Copy link
Contributor Author

s-block commented Apr 26, 2017

Is this OK to be merged? It would be good to get this functionality in.

@davidism
Copy link
Member

davidism commented Apr 26, 2017

I'm working through the PR backlog right now, this is on the list. Would you mind enabling the "allow edits from maintainers" option in the sidebar?

@s-block
Copy link
Contributor Author

s-block commented Apr 26, 2017

No problem. Done.

def _untag_dict_used_with_key(self, the_value):
k, v = next(iteritems(the_value))
if self._was_dict_with_used_key(k):
return {k[:-2]: self._untag(v)}

This comment was marked as off-topic.

This comment was marked as off-topic.

"""
def _tag_dict_used_with_key(self, value):
k, v = next(iteritems(value))
return {'%s__' % k: v}

This comment was marked as off-topic.

@davidism
Copy link
Member

davidism commented Jun 1, 2017

I was reviewing this and decided it's not extensible enough. Added some review comments, but they're already fixed on my local branch.

New idea is to have a JSONTag interface, and give TaggedJSONSerializer a register method to add new tags. Then someone can do:

class TagOrderedDict(JSONTag):
    key = ' od'

    def check(self, value):
        return isinstance(value, OrderedDict)

    def to_json(self, value):
        return [(self.serializer.tag(k), self.serializer.tag(v)) for k, v in value]

    def to_python(self, value):
        return OrderedDict((self.serializer.untag(k), self.serializer.untag(v)) for k, v in value)

app.session_interface.serializer.register(TagOrderedDict)

@davidism
Copy link
Member

davidism commented Jun 1, 2017

Re: performance: new method is ~ 30 µs slower.

old: 70.7 µs ± 164 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
new: 96.3 µs ± 678 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

You'd need to handle 2,000,000 requests before you wasted 1 extra minute. There's always Flask-Session, etc. where server-side sessions are serialized with pickle, which is much faster.

@davidism
Copy link
Member

davidism commented Jun 1, 2017

Hmm, maybe this should all be moved to itsdangerous. It's not actually Flask specific.

@untitaker
Copy link
Contributor

I don't really see how it's got anything to do with itsdangerous either though

@davidism
Copy link
Member

davidism commented Jun 2, 2017

The only place it's being used is passed to itsdangerous. Eh, I could go either way, probably not worth the trouble.

@davidism davidism mentioned this pull request Jun 2, 2017
4 tasks
@davidism
Copy link
Member

davidism commented Jun 2, 2017

Continued in #2352. Thank you for working on this @s-block!

@davidism davidism closed this Jun 2, 2017
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 14, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants