yaml implicit resolver leak #67

@xqliang

Every time we call yaml.dump, it calls the resolve method:

    def resolve(self, kind, value, implicit):
        if kind is ScalarNode and implicit[0]:
            if value == u'':
                resolvers = self.yaml_implicit_resolvers.get(u'', [])
            else:
                resolvers = self.yaml_implicit_resolvers.get(value[0], [])
            resolvers += self.yaml_implicit_resolvers.get(None, [])

As shown above, the last line, resolvers += self.yaml_implicit_resolvers.get(None, []), modifies yaml_implicit_resolvers in place by accident: get returns the list stored in the dictionary, and += extends that list instead of creating a new one. The stored lists keep growing as more scalars are dumped, so yaml.dump gets slower and slower and eventually uses 100% CPU.
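The mechanism can be shown without PyYAML. The following is a minimal sketch, assuming a plain dictionary named registry as a hypothetical stand-in for yaml_implicit_resolvers; the function leaky_lookup and the tuple contents are made up for illustration and are not part of the library.

```python
# Hypothetical stand-in for yaml_implicit_resolvers:
# first character of the scalar -> list of (tag, regexp) pairs.
registry = {
    'f': [('tag:bool', 'bool-regexp')],
    None: [('!any_resolver', 'any-regexp')],
}


def leaky_lookup(first_char):
    # dict.get returns the stored list object itself, not a copy ...
    resolvers = registry.get(first_char, [])
    # ... so += extends the stored list in place.
    resolvers += registry.get(None, [])
    return resolvers


for _ in range(3):
    leaky_lookup('f')

# One original entry plus one appended copy per call.
print(len(registry['f']))  # 4
```

The list stored under 'f' gains one entry per call, which matches the growth described above.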

To reproduce:

    import re

    import yaml


    def test_implicit_resolver_leak():
        # Add any custom resolver
        tag, regexp = '!any_resolver', re.compile('AnyResolver')
        yaml.add_implicit_resolver(tag, regexp)

        none_resolvers = yaml.Dumper.yaml_implicit_resolvers.get(None, [])
        assert (tag, regexp) in none_resolvers

        old_f_resolvers = yaml.Dumper.yaml_implicit_resolvers.get('f', [])
        assert (tag, regexp) not in old_f_resolvers

        # Dump at least one ScalarNode
        yaml.dump(False)

        new_f_resolvers = yaml.Dumper.yaml_implicit_resolvers.get('f', [])
        assert (tag, regexp) in new_f_resolvers  # leak here
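
One possible way to avoid the leak is to build a new list with + instead of extending the stored one with +=. This is only a sketch of the idea, not necessarily the fix the maintainers would choose; registry and safe_lookup are hypothetical names standing in for yaml_implicit_resolvers and the lookup inside resolve.

```python
# Hypothetical stand-in for yaml_implicit_resolvers.
registry = {
    'f': [('tag:bool', 'bool-regexp')],
    None: [('!any_resolver', 'any-regexp')],
}


def safe_lookup(first_char):
    # list + list allocates a new list, so neither stored list is modified.
    return registry.get(first_char, []) + registry.get(None, [])


sizes_before = {key: len(value) for key, value in registry.items()}

for _ in range(1000):
    combined = safe_lookup('f')

sizes_after = {key: len(value) for key, value in registry.items()}

# The stored lists keep their size no matter how often the lookup runs.
assert sizes_before == sizes_after
# The caller still sees the specific resolvers followed by the wildcard ones.
assert combined == [('tag:bool', 'bool-regexp'), ('!any_resolver', 'any-regexp')]
```

The trade-off is one small list allocation per lookup in exchange for leaving the shared registry untouched.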
