Returned datetime skips a day with time+timezone input and PREFER_DATES_FROM = 'future' #403

@Laogeodritt

Versions:

  • Linux (Debian testing)
  • Python 3.6.4
  • dateparser (0.7.0)
    • python-dateutil (2.7.2)
    • pytz (2018.3)
    • regex (2018.2.21)
    • tzlocal (1.5.1)

What I was trying to do: Use dateparser to:

  • Return a timezone-unaware datetime, with the value converted to UTC
  • Default to UTC, if the input string does not have a timezone
  • Convert the time in the input string to UTC, if the input string does specify a timezone
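To make the intended result concrete, here is a minimal standard-library sketch of the conversion described in the list above. It does not use dateparser; `to_naive_utc` is a hypothetical helper name chosen for illustration, and it assumes the input has already been parsed into a `datetime`.

```python
from datetime import datetime, timedelta, timezone


def to_naive_utc(dt):
    """Return dt as a timezone-unaware datetime whose value is in UTC.

    A naive input is assumed to already be in UTC (the desired default).
    """
    if dt.tzinfo is None:
        return dt
    # Convert to UTC, then drop the tzinfo so the result is naive.
    return dt.astimezone(timezone.utc).replace(tzinfo=None)


utc_minus_4 = timezone(timedelta(hours=-4))

# No timezone in the input: treated as UTC and returned unchanged.
print(to_naive_utc(datetime(2018, 3, 27, 18, 0)))  # 2018-03-27 18:00:00

# 2PM UTC-4 is 6PM UTC.
print(to_naive_utc(datetime(2018, 3, 27, 14, 0, tzinfo=utc_minus_4)))  # 2018-03-27 18:00:00
```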

I think my settings here should achieve this according to my understanding of the documentation. If not, and this is a config error on my part rather than a bug, please do let me know!

Minimum working example:

With the following settings, behaviour is as expected:

>>> DATEPARSER_SETTINGS
{'TIMEZONE': 'UTC', 'TO_TIMEZONE': 'UTC', 'RETURN_AS_TIMEZONE_AWARE': False}
>>> parse('6PM UTC', settings=DATEPARSER_SETTINGS)  # case #1
datetime.datetime(2018, 3, 27, 18, 0)
>>> parse('6PM UTC-4', settings=DATEPARSER_SETTINGS)  # case #2
datetime.datetime(2018, 3, 27, 22, 0)
>>> parse('2PM UTC', settings=DATEPARSER_SETTINGS)  # case #3
datetime.datetime(2018, 3, 27, 14, 0)
>>> parse('2PM UTC-4', settings=DATEPARSER_SETTINGS)  # case #4
datetime.datetime(2018, 3, 27, 18, 0)
>>> datetime.utcnow()
datetime.datetime(2018, 3, 27, 17, 15, 52, 695093)

With the following settings, some unexpected behaviour occurs:

>>> DATEPARSER_SETTINGS
{'TIMEZONE': 'UTC', 'TO_TIMEZONE': 'UTC', 'RETURN_AS_TIMEZONE_AWARE': False, 'PREFER_DATES_FROM': 'future'}
>>> parse('6PM UTC', settings=DATEPARSER_SETTINGS)  # case #1 - OK, later today
datetime.datetime(2018, 3, 27, 18, 0)
>>> parse('6PM UTC-4', settings=DATEPARSER_SETTINGS)  # case #2 - OK, later today
datetime.datetime(2018, 3, 27, 22, 0)
>>> parse('2PM UTC', settings=DATEPARSER_SETTINGS)  # case #3 - this is fine, 2PM UTC today is in the past so skip to tomorrow
datetime.datetime(2018, 3, 28, 14, 0)
>>> parse('2PM UTC-4', settings=DATEPARSER_SETTINGS)  # case #4               !!! 2PM UTC-4 is in ~40 minutes, this skips that time and instead yields tomorrow
datetime.datetime(2018, 3, 28, 18, 0)
>>> datetime.utcnow()
datetime.datetime(2018, 3, 27, 17, 13, 50, 149498)

Expected behaviour: With an undated time input and PREFER_DATES_FROM=future, dateparser should return the nearest future time that matches, i.e., datetime(2018, 3, 27, 18, 0) for case #4.

What actually happens: Dateparser skips a day to the second nearest date in the future, datetime(2018, 3, 28, 18, 0). This only happens in the last case above, which suggests to me that the problem is triggered when all of the following hold:

  • The timezone in the input string and the timezone in the settings are different
  • The time, if in the settings timezone, would be in the past. That is to say, 2PM UTC is in the past here (so dateparser adds +1 day to satisfy the PREFER_DATES_FROM=future condition).
  • The time with the timezone given in the input string is actually a few hours in the future, i.e., 2PM UTC-4 is in the future, so adding +1 day is erroneous
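
The three conditions above can be illustrated with a standard-library sketch. This is not dateparser's actual code: both function names are hypothetical, and `suspected_buggy_utc` is only a guess at the kind of logic that would reproduce the observed results (comparing the bare wall-clock time against "now" in the settings timezone, then applying the input offset afterwards). `next_occurrence_utc` shows the behaviour I expected, where the comparison is done in the input string's own timezone.

```python
from datetime import datetime, time, timedelta, timezone


def next_occurrence_utc(wall_time, offset_hours, now_utc):
    """Expected: nearest future instant (naive UTC) at which a clock
    running at the given UTC offset reads wall_time."""
    tz = timezone(timedelta(hours=offset_hours))
    now_local = now_utc.replace(tzinfo=timezone.utc).astimezone(tz)
    candidate = datetime.combine(now_local.date(), wall_time, tzinfo=tz)
    if candidate <= now_local:
        candidate += timedelta(days=1)  # only roll over if truly in the past
    return candidate.astimezone(timezone.utc).replace(tzinfo=None)


def suspected_buggy_utc(wall_time, offset_hours, now_utc):
    """Guess at the faulty logic: the past/future check ignores the input offset."""
    candidate = datetime.combine(now_utc.date(), wall_time)
    if candidate <= now_utc:  # compares 2PM against now in UTC, not in UTC-4
        candidate += timedelta(days=1)
    return candidate - timedelta(hours=offset_hours)  # offset applied too late


now = datetime(2018, 3, 27, 17, 13, 50)  # the utcnow() value from the transcript

# Case #4, '2PM UTC-4':
print(next_occurrence_utc(time(14, 0), -4, now))  # 2018-03-27 18:00:00
print(suspected_buggy_utc(time(14, 0), -4, now))  # 2018-03-28 18:00:00
```

For cases #1 to #3 the two functions agree, which matches the observation that only case #4 is affected.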
