I would like to write a function to process a list of integers, best way is to show as an example:
input [0,1,2,3, -1,-2,-3, 0,1,2,3, -1,-2,-3] will return [6,-6,6,-6]
I have a draft here that will actually work:
def group_pos_neg_list(nums):
p_nums = []
# to determine if the first element >=0 or <0
# create pos_combined and neg_combined as a list to check the length in the future
if nums[0] >= 0:
pos_combined, neg_combined = [nums[0]], []
elif nums[0] < 0:
pos_combined, neg_combined = [], [nums[0]]
# loop over each element from position 1 to the end
# accumulate pos num and neg nums and set back to 0 if next element is different
index = 1
while index < len(nums):
if nums[index] >= 0 and nums[index-1] >= 0: # both posivite
pos_combined.append(nums[index])
index += 1
elif nums[index] < 0 and nums[index-1] < 0: # both negative
neg_combined.append(nums[index])
index += 1
else:
if len(pos_combined) > 0:
p_nums.append(sum(pos_combined))
pos_combined, neg_combined = [], [nums[index]]
elif len(neg_combined) > 0:
p_nums.append(sum(neg_combined))
pos_combined, neg_combined = [nums[index]], []
index += 1
# finish the last combined group
if len(pos_combined) > 0:
p_nums.append(sum(pos_combined))
elif len(neg_combined) > 0:
p_nums.append(sum(neg_combined))
return p_nums
But I am not quite happy with it, because it looks a bit complicate.
Especially that there is a repeating part of code:
if len(pos_combined) > 0:
p_nums.append(sum(pos_combined))
pos_combined, neg_combined = [], [nums[index]]
elif len(neg_combined) > 0:
p_nums.append(sum(neg_combined))
pos_combined, neg_combined = [nums[index]], []
I have to write this twice as the final group of integers will not be counted in the loop, so an extra step is needed.
Is there anyway to simplify this?
Solution:
Using groupby
No need to make it that complex: we can first groupby the signum, and then we can calculate the sum, so:
from itertools import groupby
[sum(g) for _, g in groupby(data, lambda x: x >= 0)]
This then produces:
>>> from itertools import groupby
>>> data = [0,1,2,3, -1,-2,-3, 0,1,2,3, -1,-2,-3]
>>> [sum(g) for _, g in groupby(data, lambda x: x >= 0)]
[6, -6, 6, -6]
So groupby produces tuples with the “key” (the part we calculate with the lambda), and an iterable of the “burst” (a continuous subsequence of elements with the same key). We are only interested in the latter g, and then calculate sum(g) and add that to the list.
Custom made algorithm
We can also write our own version, by using:
swap_idx = [0]
swap_idx += [i+1 for i, (v1, v2) in enumerate(zip(data, data[1:]))
if (v1 >= 0) != (v2 >= 0)]
swap_idx.append(None)
our_sums = [sum(data[i:j]) for i, j in zip(swap_idx, swap_idx[1:])]
Here we first construct a list swap_idx that stores the indices where of the element where the signum changes. So for your sample code that is:
>>> swap_idx
[0, 4, 7, 11, None]
The 0 and None are added by the code explicitly. So now that we identified the points where the sign has changed, we can sum these subsequences together, with sum(data[i:j]). We thus use zip(swap_idx, swap_idx[1:]) to obtain two consecutive indices, and thus we can then sum that slice together.
More verbose version
The above is not very readable: yes it works, but it requires some reasoning. We can also produce a more verbose version, and make it even more generic, for example:
def groupby_aggregate(iterable, key=lambda x: x, aggregate=list):
itr = iter(iterable)
nx = next(itr)
kx = kxcur = key(nx)
current = [nx]
try:
while True:
nx = next(itr)
kx = key(nx)
if kx != kxcur:
yield aggregate(current)
current = [nx]
kxcur = kx
else:
current.append(nx)
except StopIteration:
yield aggregate(current)
We can then use it like:
list(groupby_aggregate(data, lambda x: x >= 0, sum))