running_median() with windowed data #1040
For the record, the issue with

>>> from decimal import Decimal
>>> from math import pi
>>> Decimal(pi)
Decimal('3.141592653589793115997963468544185161590576171875')
>>> - - Decimal(pi)
Decimal('3.141592653589793115997963469')
>>> Decimal(pi) == - - Decimal(pi)
False

So a value that goes into the negated
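To make the consequence concrete, here is a minimal sketch. It assumes the truncated sentence above refers to the common trick of storing negated values in `heapq` to simulate a max-heap; the helper name `max_heap_round_trip` is hypothetical and not part of this PR.

```python
import heapq
from decimal import Decimal
from math import pi


def max_heap_round_trip(value):
    """Push a value onto a negation-based max-heap and pop it back.

    Hypothetical helper for illustration only.
    """
    heap = []
    # Store the negated value so heapq's min-heap behaves like a max-heap.
    heapq.heappush(heap, -value)
    # Negate again on the way out to recover the "original" value.
    return -heapq.heappop(heap)


exact = Decimal(pi)  # 49 significant digits, more than the default context holds
recovered = max_heap_round_trip(exact)

# Unary minus on a Decimal rounds to the current context precision,
# so the value that comes out is not the value that went in.
print(recovered == exact)  # False under the default 28-digit context

# Floats negate exactly, so they survive the same round trip.
print(max_heap_round_trip(2.5) == 2.5)  # True
```

A sorted-list window avoids the problem because it never negates the stored values.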
+0 for as-is on this one.
+1 on the current implementation - it's reasonably clear what's going on, and the other options are probably overkill for this general-purpose library.
Okay, I think we're good to go.
This is great, thanks - can't wait to find a place to use it. |
Open questions:
- Rename the iterable parameter to data, consistent with statistics.median? Or keep as-is to match more_itertools conventions and to emphasize the lazy evaluation, which is the principal use case for running_median()?
- Keep the O(n) but mostly fast steps to maintain a sorted window? Or add more complex O(log n) code (IndexableSkiplist, blist, SortedContainers, etc.)? Based on Grant Jenks' notes, the current list insort/bisect/del technique can be expected to win for window sizes up to several thousand.

Solved questions:
- The window parameter is named maxlen because it caps the size of the window and also allows smaller sizes, like the maxlen parameter for deque.
- running_median() starts yielding values before the window is full. This is more convenient to use. It also preserves the invariant that the unwindowed case gives the same result as having a window larger than the input data.
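For readers skimming the thread, the list insort/bisect/del technique and the "yield before the window is full" behavior can be sketched roughly as follows. This is an illustrative approximation, not the code in this PR; the name `running_median_sketch` and its exact signature are assumptions.

```python
from bisect import bisect_left, insort
from collections import deque
from statistics import median


def running_median_sketch(iterable, maxlen):
    """Lazily yield the median of the last `maxlen` values seen so far.

    Illustrative sketch only; not the implementation under review.
    """
    if maxlen <= 0:
        raise ValueError("maxlen must be positive")
    window = deque()  # values in arrival order, to know which one expires
    ordered = []      # the same values kept sorted, to read off the median
    for x in iterable:
        window.append(x)
        insort(ordered, x)  # O(n) list insert, but a fast memmove in practice
        if len(window) > maxlen:
            oldest = window.popleft()
            # Locate the expired value by binary search and delete it.
            del ordered[bisect_left(ordered, oldest)]
        n = len(ordered)
        mid = n // 2
        if n % 2:
            yield ordered[mid]
        else:
            yield (ordered[mid - 1] + ordered[mid]) / 2


data = [5, 1, 4, 2, 3]

# Values are produced from the first item on, before the window fills.
print(list(running_median_sketch(data, maxlen=3)))  # [5, 3.0, 4, 2, 3]

# A window larger than the input behaves like the unwindowed case,
# i.e. the median of every prefix.
prefix_medians = [median(data[: i + 1]) for i in range(len(data))]
print(list(running_median_sketch(data, maxlen=100)) == prefix_medians)  # True
```

Each step costs a binary search plus one list insert and one list delete, which is the O(n) but mostly fast trade-off raised in the open questions.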