Skip to content

Add CUMSUM operator to grdmath#5239

Merged
PaulWessel merged 11 commits intomasterfrom
sum-grdmath
Jun 6, 2021
Merged

Add CUMSUM operator to grdmath#5239
PaulWessel merged 11 commits intomasterfrom
sum-grdmath

Conversation

@PaulWessel
Copy link
Member

@PaulWessel PaulWessel commented May 20, 2021

Description of proposed changes

The new operator does as named: A B CUMSUM will read grid A and add up values cumulatively either along columns or rows. The actual results depends on the value of B which indicates what the consecutive nodes are:

  • B = +1 means sum rows from x-min to x-max.
  • B = -1 means sum rows from x-max to x-min.
  • B = +2 means sum columns from y-min to y-max.
  • B = -2 means sum columns from y-max to y-min.

The sums are reset to zero at the start of each new row of column. To carry the sum onto the next row or column requires adding 2 to the absolute value of B. Thus, B = 3 is the same as B = 1 except the sums at the end of a row is not reset, and B = -4 is the same as B = -2 except the sums at the end of a column are used when starting the next.

I have added four tests that explors ±1, ±2, ±3, and ±4:

B = +1 and +2:
sums_12p

B = -1 and -2:
sums_12m

B = +3 and +4:
sums_34p

B = -3 and -4:
sums_34m

We will wait and merge this PR until 6.2.0 has been released, hence WIP.

@PaulWessel PaulWessel added this to the Future release milestone May 20, 2021
@PaulWessel PaulWessel self-assigned this May 20, 2021
Copy link
Member

@maxrjones maxrjones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. The operators need to be added to the table in doc/rst/source/grdmath.rst. I think both the descriptions in grdmath.c and grdmath.rst should specify that SUMROW is left to right and SUMCOL is bottom to top, since the column behavior is different than for similar functions, for example in MATLAB.

@PaulWessel
Copy link
Member Author

OK, I will udate the docs. We should have a discussion about the direction of SUMCOL. Since once can combine it with FLIPUD the direction can easily be changed. I was thinking it must made more sense to go in positive x and y directions. Opionions on this topic, @GenericMappingTools/core and @GenericMappingTools/gmt-contributors ?

@joa-quim
Copy link
Member

I don't understand the discussion. What does it matter if summation along columns ate bottom-up or top-down (same for rows)?

Numerical additions are not commutative and when numbers span across orders of magnitude there are algorithms that intend to be more accurate by adding first the small values to avoid them to fall into to the eps range of the large values. But we are not in that league.

@PaulWessel
Copy link
Member Author

True, but here order is important so we could not sum small values first since that would not yield a cumulative sum. I think as long as we are consistent and going in positive x and y directions then anyone who wants it backwards simply uses FLIPLR or FLIPUD prior to the corresponding SUM operator, no?

@maxrjones
Copy link
Member

Either is fine as long as it is documented. MATLAB and numpy both have cumulative sum methods that add top to bottom, but it is not required to follow them.

@joa-quim
Copy link
Member

Sorry, I read first posts in mail and didn't see the figures. That also answered another doubt I had. I think the operators should be called CUMSUMCOL and CUMSUMROW. Calling the SUM and then doing an accumulated sum is very confusing.

@PaulWessel
Copy link
Member Author

OK, I can see that. Note thought that in gmt math we have

`SUM        1  1    Cumulative sum of A`

That description is pretty clear I think. I settled on SUMCOL etc because then the operator will be listed near SUM and it says right there what they do. Users probably would search for SUM rather than something with cumulative to find these, no?

@joa-quim
Copy link
Member

Well, I'm speaking from myself. When I saw the description at https://github.com/GenericMappingTools/gmt/pull/5239/files/f8a02d4fd04a01dcf5a8ebca61cd7a68d85b431d..374f02452c47c1335ddd71eb9878e6f476b53387
I could not really understand if means a SUM or a CUMSUM.

It's true that it is a grdmath operator so it should produce a grid but that's not necessarily an obvious thought when we read the doc.

@PaulWessel
Copy link
Member Author

I agree that it is not obvious that SUMCOL by its name does a cumulative sum. However, the 1-liner documentation makes it abundantly clear. If you really want to rename it CUM* then we would also need to do so in gmtmath and allow for backwards compatibility of the SUM operator. I do not think users can infer the purpose of an operator just from its name. I bet nobody would know what LPDF does and there are lots like it. So the documentation is what is important for these, and the trade off between very long operator names and clarity of the docs, no?

@joa-quim
Copy link
Member

Ok, I don't think you'll want to add a true SUM operator because i that case it would be a real problem.

@PaulWessel
Copy link
Member Author

Hm, it is true that in gmt math there may be a need for a true sum. In fact, I think I had to fake this for my own use via MEAN N MUL.
The problem is that we cannot make that change backwareds compatible. If we add CUMSUM to gmtmath and change SUM to be a total then it will forever break the behavior for anyone using SUM as is...
Maybe we just bury SUM as undocumented and replace it with CUMSUM and a new called TOTAL?

@joa-quim
Copy link
Member

```ADD``?

@joa-quim
Copy link
Member

And, do you want to leave the CUBEs out?
And instead of 2 (or 3 operators), why not, like in Matlab (and others), just a CUMSUM plus a dimension? This would allow 1/-1 to sum up along positive or negative yy. The annoying thing with this is that one would need to decide how to number the dimensions. In Matlab and Julia, since they are column major, first dimension is columns but C is row-major. However, relatively few users are aware of this so we could follow Matlab.

@PaulWessel
Copy link
Member Author

```ADD``?

Already in use for decades.

@PaulWessel
Copy link
Member Author

And, do you want to leave the CUBEs out?

Yes, there is no way grdmath can be retrofitted to also deal with cubes without a complete rewrite. I am not keen on doing that.

And instead of 2 (or 3 operators), why not, like in Matlab (and others), just a CUMSUM plus a dimension? This would allow 1/-1 to sum up along positive or negative yy. The annoying thing with this is that one would need to decide how to number the dimensions. In Matlab and Julia, since they are column major, first dimension is columns but C is row-major. However, relatively few users are aware of this so we could follow Matlab.

That is a thought. We could do CUMSUM A B where we sum long given dimension (A) in the grid B, where A can be ±1 (pos/neg x-direction, i.e. row sums) or ±2 (pos/neg y-direction, i.e., column sums). In a right-handed coordinate system x would be 1 and y would be 2, regardless of how a matrix is stored internally. One could also imagine ±3 and ±4 to continue the cumulative sum across rows or columns and not reset after each row or column.

@joa-quim
Copy link
Member

```ADD``?

Already in use for decades.

Can't be overloaded? The other one requires 2 arguments, this would work with only one.

@PaulWessel
Copy link
Member Author

There is no way for any operator to guess the number of arguments it needs. They are all hard-wired. Add will simply pull two off the stack, it cannot know to pull just one since there may be 10 items on the stack. So no, there is no chance of overloading in that sense.

@joa-quim
Copy link
Member

Ok, my next suggestion would be to use a Portuguese word (SOMA) but I anticipate some resistance.

@PaulWessel
Copy link
Member Author

We can have a discussion if Norwegian (spoken by fewer people) should take precedence over Portuguese (spoken by more, but unintelligible to Norwegians).

@PaulWessel
Copy link
Member Author

Following @joa-quim's suggestion, I will implement CUMSUM A B as indicated above. However, wanted to get opinions on this: In gmtmath we have a SUM operator that actually does the cumulative sum for a dataset. In grdmath the SUM operator simply sets all nodes to the total. This is bad and inconsistent design, of course. Here is what I am proposing:

  1. Add the new CUMSUM A B operator to grdmath as discussed above
  2. Rename grdmath's SUM operator to TOTAL (with backwards compatibility)
  3. Rename gmtmath's SUM operator to CUMSUM (with backwards compatibility)
  4. Add new operator TOTAL to gmtmath

That way, old scripts that may use the inconsistently defined SUM in these two modules will still work, while going forward we have consistency in and capability to do both cumulative and total sums, plus grdmath can specify how it wants to cumulatively sum things.

OK with this plan guys?

@joa-quim
Copy link
Member

Just think that in CUMSUM A B, A should be the grid and B the dimension. Closer to the Matlab cumsum(Z, dim) construct.

@PaulWessel
Copy link
Member Author

Just think that in CUMSUM A B, A should be the grid and B the dimension. Closer to the Matlab cumsum(Z, dim) construct.

OK, I am fine with that.

@PaulWessel PaulWessel added the new feature PR that implements a new feature or capability in GMT label May 26, 2021
@PaulWessel PaulWessel changed the title WIP Add SUMCOL and SUMROW operators to grdmath WIP Add CUMSUM operator to grdmath May 26, 2021
@PaulWessel
Copy link
Member Author

I have updated the introductory comment to reflect reality and posted the 4 test results to show it is fully implemented now. We will wait for 6.2.0 to be released, but you may still approve the PR.

@PaulWessel PaulWessel changed the title WIP Add CUMSUM operator to grdmath Add CUMSUM operator to grdmath Jun 6, 2021
@PaulWessel PaulWessel merged commit da43d81 into master Jun 6, 2021
@PaulWessel PaulWessel deleted the sum-grdmath branch June 6, 2021 00:39
@maxrjones maxrjones added new core module feature PR that implements a new core module feature add-changelog Add PR to the changelog labels Jun 8, 2021
@seisman seisman removed this from the Future release milestone Jan 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

add-changelog Add PR to the changelog new core module feature PR that implements a new core module feature new feature PR that implements a new feature or capability in GMT

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants