Added methods (adjacent, gap, union) to TimePeriod and transform period_union#65
Added methods (adjacent, gap, union) to TimePeriod and transform period_union#65
Conversation
Codecov Report
@@ Coverage Diff @@
## master #65 +/- ##
=========================================
- Coverage 95.71% 94.52% -1.2%
=========================================
Files 32 31 -1
Lines 1308 1352 +44
Branches 208 218 +10
=========================================
+ Hits 1252 1278 +26
- Misses 25 37 +12
- Partials 31 37 +6
Continue to review full report at Codecov.
|
|
The test case I was talking about: events_union is And it contains incorrect duration data for the events. PTAL =) |
ahnlabb
left a comment
There was a problem hiding this comment.
Looks good! In general what do I expect of the event data when I do a period union? I suppose I only do it when I don't really care about the data inside each event (since the merged events only contain the actual data of the first event in an uninterrupted sequence of events (?)
|
|
||
| ip = e1_p.intersection(e2_p) | ||
| if ip: | ||
| # If events intersected, add event with intersected duration and try next event |
There was a problem hiding this comment.
I don't quite understand this comment. What event is added here?
|
|
||
| ip = ep.intersection(fp) | ||
| ip = e1_p.intersection(e2_p) | ||
| if ip: |
| return e | ||
|
|
||
|
|
||
| def _concurrent_eventpairs(events1, events2) -> Iterable[Tuple[Event, Event]]: |
There was a problem hiding this comment.
I understand the choice of not making this a _concurrent_eventpairs_as_well_as_their_intersection (too long 😄 ) function but if you were to yield ip as well lines 72-83 would cleanly reduce to one uncomplicated list comprehension.
concurrent_iter = _concurrent_eventpairs(events, filterevents)
return [_replace_event_period(e1, ip) for (e1, _, ip) in concurrent_iter]|
|
||
|
|
||
| def period_union(events1: List[Event], events2: List[Event]) -> List[Event]: | ||
| events = sorted(events1 + events2, key=lambda e: e.timestamp) |
There was a problem hiding this comment.
Maybe beyond the scope of this PR but why don't we just define __lt__ on events so that they are sorted on timestamps?
There was a problem hiding this comment.
It's weakly defined, but I'll do that!
|
@lundibundi You have overlapping events within each event list (e2, e3 in |
|
@ahnlabb oh, I thought it should handle them based on our discussion in ActivityWatch/aw-watcher-web#20, then I guess my implementation can be a separate function with this behavior. Also, could you please elaborate why should we 'merge events without gaps in between' won't it discard actual events(as it doesn't at least check event data equality before merging)? Maybe then this function should better be called smth like |
|
My function doesn't only merge adjacent events, it also merges events with any overlap leaving no overlapping events in the resulting list, which is the behavior wanted. Your function only removes exact duplicates (unlikely to ever occur in the use case given in ActivityWatch/aw-watcher-web#20), and does not give any guarantees that events are non-overlapping (which is what most other transforms expect from a given single eventlist). However, there might be an argument to be made for creating two separate functions, @lundibundi Since you were amazing enough to make a PR, I'll merge it into mine so you get some credit on https://activitywatch.net/contributors/ but I'll heavily edit it to fit into the above proposed solution. 🎉 |
…mentation, improved typing
…a in my period_union
|
@ErikBjare oh, I seem to have slightly misunderstood the use case, your solution is indeed better. Yeah, that's the intended behavior for it to preserve all non duplicate event's. Maybe it's worth introducing abstraction for 'list of time-periods' that's required to be sorted, then it'll be possible to avoid some overhead of multiple sorted calls/time-period creation and make clear separation between events and time-periods? Thanks for the credit 😊. I kind of didn't expect this fast of an implementation from contributors as it's usually somewhat slow in open source (at least in my experience). Thanks for the great work). |
| Example: | ||
| events1 | ======= ========= | | ||
| events2 | ------ --- -- ---- | | ||
| result | ----------- -- ========= | |
There was a problem hiding this comment.
Use all = or - in the example, gets confusing otherwise considering the warning
There was a problem hiding this comment.
I think the idea was to show which event will be stretched to the length of the other (or overwrite the overlapping event), but considering it strips all of the data it may be only confusing in here.
| return None | ||
|
|
||
| def adjacent(self, other: "TimePeriod") -> bool: | ||
| """Iff timeperiods are exactly next to each other, return True.""" |
There was a problem hiding this comment.
Actually intentional 😄
aw_core/timeperiod.py
Outdated
|
|
||
| def gap(self, other: "TimePeriod") -> Optional["TimePeriod"]: | ||
| """If periods are separated by a non-zero gap, return the gap as a new timeperiod, else None""" | ||
| if not (self.overlaps(other) or self.adjacent(other)): |
There was a problem hiding this comment.
Won't a simple if statement be enough here?
if self.end < other.start:
return TimePeriod(self.end, other.start)
elif other.end < self.start:
return TimePeriod(other.end, self.start)
else:
return NoneThere was a problem hiding this comment.
Yeah that would be cleaner! Thanks, fixed!
|
Merging this, should be good and not conflict with #57 |
No description provided.