Skip to content

Added methods (adjacent, gap, union) to TimePeriod and transform period_union#65

Merged
ErikBjare merged 10 commits intomasterfrom
dev/union-period
Jul 2, 2018
Merged

Added methods (adjacent, gap, union) to TimePeriod and transform period_union#65
ErikBjare merged 10 commits intomasterfrom
dev/union-period

Conversation

@ErikBjare
Copy link
Copy Markdown
Member

No description provided.

@ghost ghost assigned ErikBjare Jun 26, 2018
@ghost ghost added the review label Jun 26, 2018
@codecov-io
Copy link
Copy Markdown

codecov-io commented Jun 26, 2018

Codecov Report

Merging #65 into master will decrease coverage by 1.19%.
The diff coverage is 76.31%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master      #65     +/-   ##
=========================================
- Coverage   95.71%   94.52%   -1.2%     
=========================================
  Files          32       31      -1     
  Lines        1308     1352     +44     
  Branches      208      218     +10     
=========================================
+ Hits         1252     1278     +26     
- Misses         25       37     +12     
- Partials       31       37      +6
Impacted Files Coverage Δ
aw_transform/__init__.py 100% <100%> (ø) ⬆️
aw_analysis/query2_functions.py 87.59% <40%> (-1.92%) ⬇️
aw_core/timeperiod.py 80.39% <58.33%> (-19.61%) ⬇️
aw_core/models.py 92% <64.28%> (-5.11%) ⬇️
aw_transform/filter_period_intersect.py 89.41% <87.14%> (+4.41%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c33ea2e...ffd5653. Read the comment docs.

@ErikBjare ErikBjare requested a review from ahnlabb June 26, 2018 10:05
@lundibundi
Copy link
Copy Markdown
Contributor

lundibundi commented Jun 26, 2018

The test case I was talking about:

e1 = Event(timestamp=now - timedelta(seconds=20), duration=timedelta(seconds=5))
e2 = Event(timestamp=now - timedelta(seconds=10), duration=timedelta(seconds=5))
e3 = Event(timestamp=now - timedelta(seconds=10), duration=timedelta(seconds=10))
e4 = Event(timestamp=now - timedelta(seconds=5), duration=timedelta(seconds=5))
e5 = Event(timestamp=now, duration=timedelta(seconds=10))
# union event lists with intersecting duplicates
events_union = period_union([e3, e2, e5], [e1, e3, e4, e5])

events_union is

{'id': None, 'timestamp': datetime.datetime(2018, 6, 26, 13, 2, 16, 717000, tzinfo=datetime.timezone.utc), 'duration': datetime.timedelta(0, 5), 'data': {}}
{'id': None, 'timestamp': datetime.datetime(2018, 6, 26, 13, 2, 26, 717000, tzinfo=datetime.timezone.utc), 'duration': datetime.timedelta(0, 10), 'data': {}}
{'id': None, 'timestamp': datetime.datetime(2018, 6, 26, 13, 2, 26, 717000, tzinfo=datetime.timezone.utc), 'duration': datetime.timedelta(0, 10), 'data': {}}
{'id': None, 'timestamp': datetime.datetime(2018, 6, 26, 13, 2, 31, 717000, tzinfo=datetime.timezone.utc), 'duration': datetime.timedelta(0, 15), 'data': {}}
{'id': None, 'timestamp': datetime.datetime(2018, 6, 26, 13, 2, 36, 717000, tzinfo=datetime.timezone.utc), 'duration': datetime.timedelta(0, 10), 'data': {}}

And it contains incorrect duration data for the events. PTAL =)

Copy link
Copy Markdown

@ahnlabb ahnlabb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! In general what do I expect of the event data when I do a period union? I suppose I only do it when I don't really care about the data inside each event (since the merged events only contain the actual data of the first event in an uninterrupted sequence of events (?)


ip = e1_p.intersection(e2_p)
if ip:
# If events intersected, add event with intersected duration and try next event
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand this comment. What event is added here?


ip = ep.intersection(fp)
ip = e1_p.intersection(e2_p)
if ip:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't be necessary

return e


def _concurrent_eventpairs(events1, events2) -> Iterable[Tuple[Event, Event]]:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the choice of not making this a _concurrent_eventpairs_as_well_as_their_intersection (too long 😄 ) function but if you were to yield ip as well lines 72-83 would cleanly reduce to one uncomplicated list comprehension.

concurrent_iter = _concurrent_eventpairs(events, filterevents)
return [_replace_event_period(e1, ip) for (e1, _, ip) in concurrent_iter]



def period_union(events1: List[Event], events2: List[Event]) -> List[Event]:
events = sorted(events1 + events2, key=lambda e: e.timestamp)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe beyond the scope of this PR but why don't we just define __lt__ on events so that they are sorted on timestamps?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's weakly defined, but I'll do that!

@ahnlabb
Copy link
Copy Markdown

ahnlabb commented Jun 26, 2018

@lundibundi You have overlapping events within each event list (e2, e3 in events1 and e3, e4 in events2) should that be valid? It does not fit with the use case @ErikBjare described to me but maybe it should be covered as well.

@lundibundi
Copy link
Copy Markdown
Contributor

@ahnlabb oh, I thought it should handle them based on our discussion in ActivityWatch/aw-watcher-web#20, then I guess my implementation can be a separate function with this behavior.

Also, could you please elaborate why should we 'merge events without gaps in between' won't it discard actual events(as it doesn't at least check event data equality before merging)? Maybe then this function should better be called smth like periods_merge_adjacent?

@ErikBjare
Copy link
Copy Markdown
Member Author

ErikBjare commented Jun 27, 2018

My function doesn't only merge adjacent events, it also merges events with any overlap leaving no overlapping events in the resulting list, which is the behavior wanted.

Your function only removes exact duplicates (unlikely to ever occur in the use case given in ActivityWatch/aw-watcher-web#20), and does not give any guarantees that events are non-overlapping (which is what most other transforms expect from a given single eventlist).

However, there might be an argument to be made for creating two separate functions, union (which would simply return the union of the two eventslists, removing duplicates but keeping overlap) and period_union/merge_overlapping (which would remove all overlap by merging overlapping events).

@lundibundi Since you were amazing enough to make a PR, I'll merge it into mine so you get some credit on https://activitywatch.net/contributors/ but I'll heavily edit it to fit into the above proposed solution. 🎉

@lundibundi
Copy link
Copy Markdown
Contributor

@ErikBjare oh, I seem to have slightly misunderstood the use case, your solution is indeed better.

Yeah, that's the intended behavior for it to preserve all non duplicate event's.

Maybe it's worth introducing abstraction for 'list of time-periods' that's required to be sorted, then it'll be possible to avoid some overhead of multiple sorted calls/time-period creation and make clear separation between events and time-periods?

Thanks for the credit 😊. I kind of didn't expect this fast of an implementation from contributors as it's usually somewhat slow in open source (at least in my experience). Thanks for the great work).

Example:
events1 | ======= ========= |
events2 | ------ --- -- ---- |
result | ----------- -- ========= |
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use all = or - in the example, gets confusing otherwise considering the warning

Copy link
Copy Markdown
Contributor

@lundibundi lundibundi Jun 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the idea was to show which event will be stretched to the length of the other (or overwrite the overlapping event), but considering it strips all of the data it may be only confusing in here.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, done!

return None

def adjacent(self, other: "TimePeriod") -> bool:
"""Iff timeperiods are exactly next to each other, return True."""
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo 'Iff'.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


def gap(self, other: "TimePeriod") -> Optional["TimePeriod"]:
"""If periods are separated by a non-zero gap, return the gap as a new timeperiod, else None"""
if not (self.overlaps(other) or self.adjacent(other)):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't a simple if statement be enough here?

if self.end < other.start:
    return TimePeriod(self.end, other.start)
elif other.end < self.start:
    return TimePeriod(other.end, self.start)
else: 
    return None

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that would be cleaner! Thanks, fixed!

@ErikBjare
Copy link
Copy Markdown
Member Author

Merging this, should be good and not conflict with #57

@ErikBjare ErikBjare merged commit 116923b into master Jul 2, 2018
@ghost ghost removed the review label Jul 2, 2018
@ErikBjare ErikBjare deleted the dev/union-period branch July 2, 2018 11:37
@ErikBjare ErikBjare restored the dev/union-period branch September 18, 2019 08:13
@ErikBjare ErikBjare deleted the dev/union-period branch September 18, 2019 08:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants