Parser (or new function in parser module) should be able to output the parsed format #125

@pganssle

There's a good amount of overhead in parsing a date with the parser compared to calling strptime or something of that nature. It seems that at least pandas calls dateutil.parser.parse in a tight loop. A common case is a column of, say, 1000 dates that you want to turn into datetime objects: you don't know what the format is, but you know that it's the same for every entry. Right now, the parser has to infer the format every time it parses a date. It would be nice if it could emit something that can be parsed with strptime() (if that's possible), or if we could provide a fast function that takes a timestr and a format (as output by the parser).
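To make the idea concrete, here is a rough sketch of the "infer once, reuse for the whole column" pattern using only the standard library. It is not dateutil's API: `infer_format`, `parse_column` and the `CANDIDATE_FORMATS` list are hypothetical names made up for illustration, and trying a fixed list of formats only stands in for the parser actually reporting the format it detected.

```python
from datetime import datetime

# Hypothetical candidate list, for illustration only. The proposal is for the
# parser itself to report the format it detected instead of guessing from a list.
CANDIDATE_FORMATS = (
    "%Y-%m-%d",
    "%Y-%m-%d %H:%M:%S",
    "%d/%m/%Y",
)


def infer_format(timestr, candidates=CANDIDATE_FORMATS):
    """Return the first candidate format that parses timestr, or None."""
    for fmt in candidates:
        try:
            datetime.strptime(timestr, fmt)
        except ValueError:
            continue
        return fmt
    return None


def parse_column(values):
    """Infer the format from the first value, then reuse it for every value."""
    if not values:
        return []
    fmt = infer_format(values[0])
    if fmt is None:
        raise ValueError("no candidate format matches %r" % (values[0],))
    # The expensive inference happens once; the loop is plain strptime calls.
    return [datetime.strptime(value, fmt) for value in values]
```

With something like this, a caller pays the inference cost once per column instead of once per value, e.g. `parse_column(["2015-08-21", "2015-08-22"])`.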

I think this could also provide additional flexibility to the parser module, since a lot of issues seem to arise from the fact that the parser itself is the only thing that knows what the format is, and if you want to do something unusual with that information, you have to modify the parsing function itself.

I think this can go hand-in-hand with #123 as part of an effort to make the parser a bit more modular and provide better public interfaces for dealing with date parsing.
