Skip to content

Remove arrow, transition to attrs and add Timespan#222

Merged
C4ptainCrunch merged 24 commits intoics-py:masterfrom
N-Coder:new-timespan-impl
Mar 21, 2020
Merged

Remove arrow, transition to attrs and add Timespan#222
C4ptainCrunch merged 24 commits intoics-py:masterfrom
N-Coder:new-timespan-impl

Conversation

@N-Coder
Copy link
Copy Markdown
Member

@N-Coder N-Coder commented Feb 2, 2020

This implementation collects all time-related attributes of an event - begin, end/duration and precision - in a single, isolated and easy to test class. It also targets the three possible cases for time values - timezone-aware datetimes, floating datetimes (which are currently unsupported, for details see further down) and all-day dates (which are currently pretty buggy). Additionally to not doing anything implicitly (neither any fancy value conversion nor fancy indirect property access), it viciously guards the validity of its internal state. The validity of each of the 4 values representing the event time is highly depending upon the other values. Allowing mutation opens up easy ways of breaking validity - so the implementation is actually immutable, with each modification creating a new object and requiring a full revalidation of all 4 values together. As the times shouldn't be modified that often, having the optimization of mutability shouldn't be strictly necessary. Additionally, moving the functionality to its own class makes testing easier and keeps this quite complicated functionality clean from other concerns. All this makes the timespan handling easier to implement correctly, but unfortunately not easier to use. So I also started modifying the Event class to handle the interfacing in an user-friendly way and hide the complexity of timespans.

This is by far not completely integrated in the rest of the library, as I first wanted to see where this approach might lead us and now also wanted to collect feedback. See below on how we might continue from hereon.

Where did Arrow go?

I first started out with the realisation that floor('day') doesn't suffice for date values, as we need to get rid of timezone:

morning_1 = arrow.Arrow(2020, 1, 19, 8, 0, tzinfo="+01:00")
morning_3 = arrow.Arrow(2020, 1, 19, 8, 0, tzinfo="+03:00")
print(morning_1, morning_1.floor("day"))
# > 2020-01-19T08:00:00+01:00 2020-01-19T00:00:00+01:00
print(morning_3, morning_3.floor("day"))
# > 2020-01-19T08:00:00+03:00 2020-01-19T00:00:00+03:00
print(morning_1.floor("day") == morning_3.floor("day"))
# > False

Unsetting the timezone does strange things, as the constructor always interpolates tzinfo=None to UTC:

midnight_None = arrow.Arrow(2020, 1, 19)
midnight_None.tzinfo = None
print(midnight_None, midnight_None.clone(), midnight_None.floor("day"))
# > 2020-01-19T00:00:00 2020-01-19T00:00:00+00:00 2020-01-19T00:00:00+00:00
print(midnight_None == midnight_None.clone())
# > False
print(midnight_None == midnight_None.floor("day"))
# > False

Trying to use arrow for naive (i.e. floating) timestamps by keeping tzinfo set to None is thus very hard. So we would need another variable to keep track of whether the tzinfo is intentionally set or should actually be None. This would require a lot of special casing, e.g. when comparing a timestamp with timezone +3 with one with UTC, as we need to check whether UTC actually means floating.

To summarize, I'm not quite happy with the automatic timezone mangling that arrow does. It might be useful to do things better when you don't worry about timezones, but here we have to actually get timezones right ourselves without something automagically mangling them. Python's built-in datetime might be annoying, because it is a lot harder to use right and tends to throw exceptions (e.g. when comparing timezone-aware datetimes with timezone-naive/floating datetimes). But in those special cases guessing right automatically is hard. Additionally, these exception point us at places where we might need to handle things explicitly - e.g. when mixing timezone-aware datetimes, floating datetimes and all-day dates that don't have a timezone.

As a result, for me, the selling points of arrow no longer outweigh its deficits. Too many modules and types is not an argument here, as this should be a library and not some prototype code that you are typing into the interpreter shell. Doing conversions automatically and not properly handling naive timestamps is the root cause for the above problems. And all the other features could also be implemented on top of plain old datetimes - actually the Arrow class is only a thin wrapper around the datetime class, adding some functions but no further information (!). As a result, my timespan implementation only uses the plain, old, but reliable datetime class.

Further steps

So, I tried to do a timespan implementation based on datetime and tried to cover all edge-cases I could come up with. For those that I didn't think of, it should trow an exception so that we can think of the proper way of handling it once it comes up, instead of it vanishing through some magic conversion resulting in probably wrong results.

This should somewhat follow the zen of python (see import this)

Explicit is better than implicit.
[...]
Errors should never pass silently.
[...]
In the face of ambiguity, refuse the temptation to guess.

And now finally make_all_day looks good - see the quite extensive test for this function. Additionally, this functionality would now allow floating times properly, as they are not that different from dates. Actually, it should fix all the issues referenced in #155 (comment). If we want to go this way, we would probably need to make changes in a few other places and think about breakages in the public API. But, this would also allow us to finally properly conserve floating events and VTIMEZONES when parsing and later exporting an .ics file. What also still needs testing is handling of events with different begin and end time zone - e.g. plane flights. But I'm confident that those will also work reliably, given that we are able to define the right expectations and guarantees we have/want in these cases.

@C4ptainCrunch what's your opinion on this? Should we try to continue on this way, potentially removing arrow support in many places for the sake of correctness? I'd love to finally see this working properly and I could try to get a complete version with this new functionality working, if you also think that this is the way to go.

Loading
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support for arrow 0.15 Automatic support for DTSTAMP

3 participants