Add time zone factories (tzutc, tzoffset) #504

pganssle · 2017-11-07T21:56:29Z

This is more or less the strategy I've come up with for handling singleton zones. It's fairly efficient from my profiling with tzoffset.

The next obvious candidate is tzlocal(), but I think that one is slightly trickier to handle, I'll make an issue for discussion.

jbrockmendel · 2017-11-08T15:54:10Z

I'm not totally clear what's happening here. Is the idea that you want multiple calls to tzutc() to return the same object?

pganssle · 2017-11-08T17:04:58Z

@jbrockmendel Yes, but not only for tzutc. For UTC, this actually does the same thing that #497 did, but does so slightly more efficiently (the old version would call __init__ on every tzutc() call, even though a new object did not need to be initialized).

As mentioned by @abalkin in #497, there are several fast paths in the code that are activated only when dt1.tzinfo is dt2.tzinfo, even if dt1.tzinfo == dt2.tzinfo. From what he's told me in the past, the intended implementation of tzinfo instances is that there is a single instance per zone. I think there are some performance benefits to heavily memoizing these things anyway, though the trade off is that it's harder to make it thread safe.

Generally it won't come up, but there's actually a slight difference between inter-zone and intra-zone comparison semantics that comes up when you start working with ambiguous times and fold:

>>> from dateutil import tz
>>> from datetime import datetime
>>> NYC = tz.gettz('America/New_York')
>>> dt1 = datetime(2011, 11, 6, 1, 30, fold=1, tzinfo=NYC)
>>> dt2 = datetime(2011, 11, 6, 1, 30, tzinfo=NYC)
>>> dt1 == dt2
True
>>> dt1 == dt2.replace(tzinfo=tz.gettz('America/New_York'))
False

This is because intra-zone comparisons are done by comparing the naive portion of the datetime (fold is not considered), but inter-zone comparisons are done by, effectively, converting both datetimes to UTC and comparing that. Since dt1 and dt2 represent the same wall time but different absolute times, dt1 == dt2 will be true for intra-zone but false for inter-zone comparisons (and the difference between intra- and inter- is whether dt1.tzinfo is dt2.tzinfo).

In the end, having factory metaclasses that provide constructors that return the same instance to represent the same zone will bring us more in line with the preferred tzinfo implementation.

abalkin · 2017-11-08T17:57:49Z

.. though the trade off is that it's harder to make it thread safe.

Why is the thread safety an issue with timezone objects? Aren't they immutable? At least the simple ones such as UTC and fixed offset timezone should certainly be immutable.

pganssle · 2017-11-08T18:07:37Z

@abalkin Normally yes, and certainly UTC and tzoffset will remain immutable, but one potential race condition even for the immutable classes (though I'm guessing with minor effects) at the creation of the initial object between checking if you need to create a new object and actually creating one.

For tzutc I think this won't be a problem because the first constructor call happens at import time when creating the UTC constant. For tzoffset I'm using setdefault, so I think the GIL will take care of the races there. I think a similar strategy can be employed for other tzinfo subclasses.

The bigger race condition problem I think will be if the more complex classes like tzfile are memoized. There's already a cache for tzicalvtz and I intend to add lru-style caches for the other time zone subclasses. The implementation there is more likely to have thread-safety problems (though, I imagine, with little consequence), and that would just be compounded if there were a single cache (as opposed to separate instances that may have been created on separate threads).

I don't think it's a real problem, just something to consider in the implementation.

pganssle · 2017-11-08T21:19:16Z

There doesn't seem to be any controversy about the particular strategy here, so I'm merging.

jbrockmendel · 2017-11-10T16:19:07Z

A little bit late to the party... the pattern here makes sense to me, but the threadsafety issue is bothersome. What is the failure mode here?

will bring us more in line with the preferred tzinfo implementation.

+1

pganssle · 2017-11-10T17:35:16Z

@jbrockmendel I suspect it would be a very unlikely confluence of things that would lead to anything resembling a bug from the way that this is currently implemented. Assuming that setdefault is thread-safe, there is no failure mode for tz.tzoffset. For tzutc there's basically no possibility of failure because the singleton is created at import time and import is thread-safe.

If we were to drop the tz.UTC construction in tz or someone were to build their own special hook that imports tz in a thread-unsafe way, then if there have been no previous calls to tz.tzutc() from any thread and one or more threads attempts to call tz.tzutc(), then there is a possible race condition here. Thread 1 could be about to call cls.__instance = super(_TzSingleton, cls).__call__() while thread 2 checks cls.__isinstance is None and gets True, and it would enter the inner scope of the if statement. At that point both threads have the following tasks to complete:

Construct a _TzSingleton and assign it to cls.__instance
Return cls.__instance

If they interleave [notation: (thread, step)] like so: (1, 1), (2, 1), (1, 2), (2, 2) or (2, 1) (1, 1) (2, 2), (1, 2), then there's no problem (whichever one finished the assignment first would construct an extra tzutc instance that would be deleted when its reference count gets to 0 on the completion of the second thread's assignment of cls.__instance), but if thread 1 or thread 2 completes both steps like (1, 1) (1, 2), (2, 1), (2, 2), then thread 1 and thread 2 will return two objects such that t1utc is not t2utc.

If for some reason the downstream consumer is relying on the property that tzutc() is tzutc() in all cases for some reason, that assumption would be invalidated if one of the tzutc instances is the one from the thread that finished the constructor first. All subsequent calls would still return the one from, say, thread 2. Otherwise the only consequence would be that datetime comparisons are considered inter- rather than intra- zone, which has no effect on the outcome for fixed-offset zones and might be slightly slower. It's similar to if one thread returned dateutil.tz.tzutc() and another one returned pytz.UTC except the two objects would also compare equal.

This can be fixed by just acquiring a lock around lines 10 and 11 in _TzSingleton, but given that every link in the chain that leads to a problem is incredibly unlikely, I think we're OK.

pganssle added 2 commits November 7, 2017 16:32

Switch tzutc over to using a singleton metaclass

a077d37

Add factory construction for tzoffset

54e4366

pganssle added enhancement time zones labels Nov 7, 2017

pganssle added this to the 2.7.0 milestone Nov 7, 2017

pganssle mentioned this pull request Nov 7, 2017

Implement dedicated isoparser #489

Merged

3 tasks

pganssle merged commit 64c6c85 into dateutil:master Nov 8, 2017

pganssle mentioned this pull request Mar 11, 2018

Release 2.7.0 #627

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add time zone factories (tzutc, tzoffset) #504

Add time zone factories (tzutc, tzoffset) #504

Uh oh!

pganssle commented Nov 7, 2017

Uh oh!

jbrockmendel commented Nov 8, 2017

Uh oh!

pganssle commented Nov 8, 2017

Uh oh!

abalkin commented Nov 8, 2017

Uh oh!

pganssle commented Nov 8, 2017

Uh oh!

pganssle commented Nov 8, 2017

Uh oh!

jbrockmendel commented Nov 10, 2017

Uh oh!

pganssle commented Nov 10, 2017 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Add time zone factories (tzutc, tzoffset) #504

Add time zone factories (tzutc, tzoffset) #504

Uh oh!

Conversation

pganssle commented Nov 7, 2017

Uh oh!

jbrockmendel commented Nov 8, 2017

Uh oh!

pganssle commented Nov 8, 2017

Uh oh!

abalkin commented Nov 8, 2017

Uh oh!

pganssle commented Nov 8, 2017

Uh oh!

pganssle commented Nov 8, 2017

Uh oh!

jbrockmendel commented Nov 10, 2017

Uh oh!

pganssle commented Nov 10, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

pganssle commented Nov 10, 2017 •

edited

Loading