Fix #4889: for...range loop condition #4891

zdenko · 2018-02-08T22:05:58Z

Fixes #4889.
It seems that #4853, besides preventing an infinite loop, also caused a breaking change in the existing code.
The condition which checks the upper and lower range bounds, in order to prevent the infinite loop, always allows at least one iteration of the for loop.
Example:

a = 2
b = 1
s = 1
r = (x for x in [a..b] by s)

The value of r, prior to #4853 was [], but now it's [2].

To me [] is a bug as I would expect the result to be [2].
Given that r = (x for x in [1..2] by 1000) is [1], r = (x for x in [1..2] by -1000) should give the same result.

Anyway, to correct the breaking change caused by #4853, I prepared this PR which reverts the output.
And, I guess whether the result of for x in [1..2] by -1 should be []or [1] belongs to a separate discussion and PR.

vendethiel · 2018-02-08T22:07:24Z

Ok, that code looks much better. Nicely done.

GeoffreyBooth · 2018-02-08T22:13:19Z

Let’s add a test that matches #4889. Instead of logging the values, just push them into an array and then compare the array with what you’d expect it to be.

zdenko · 2018-02-09T14:02:30Z

Although the intention of this PR is (was) to revert the output of for..range by loop prior to #4853, I have second thoughts about it.

I think this is the minimal case which shows the difference before and after #4853.

r = (x for x in [2..1] by 1)

As said earlier, the result of r before #4853 was [], and after [2].
Compared to other examples:

r = (x for x in [2..1] by -1000)  # r = [2]
r = (x for x in [2..2] by 1000)   # r = [2]

The reason for x in [2..1] by 1 returns [] lies in the compiled condition which checks the upper bound value based on the value of by, and doesn't take in account value of the lower bound.

# CS 2.1.1 and below

for x in [2..1] by 100
# for (x = i = 2; i <= 1; x = i += 100)
# []

for x in [2..1] by -100
# for (x = i = 2; i >= 1; x = i += -100)
# [2]

for x in [2..2] by 100
# for (x = i = 2; i <= 2; x = i += 100)
# [2]

for x in [2..2] by -100
# for (x = i = 2; i >= 2; x = i += -100)
# [2]

The condition part i <= or i >= depends on the value of the by.
I think this is wrong, and the result from the first example should also be [2].
The range used in for loop (it's not compiled into the array as in a = [1..2]) determines the lower and upper bounds and the condition should always check both limits.
So, for x in [a..b] by s in pseudocode should be

for (x = a; x bewteen a and b; x += s)

GeoffreyBooth · 2018-02-11T19:19:18Z

Going back to the example in #4889, modified slightly so that it could become a test:

results = []
n = 1

for i in [0..n]
  results.push 'i = ' + i
  for j in [(i+1)..n]
    results.push 'j = ' + j

This produces [ 'i = 0', 'j = 1', 'i = 1', 'j = 2', 'j = 1' ] in this branch, 2.1.1, 2.0.3, and 1.12.7. All well and good.

Now take the same as above, but add by 1 to the second for:

results = []
n = 1

for i in [0..n]
  results.push 'i = ' + i
  for j in [(i+1)..n] by 1
    results.push 'j = ' + j

Then we get:

[ 'i = 0', 'j = 1', 'i = 1', 'j = 2', 'j = 1' ] in this branch
[ 'i = 0', 'j = 1', 'i = 1', 'j = 2' ] in 2.1.1
[ 'i = 0', 'j = 1', 'i = 1' ] in 2.0.3 and 1.12.7

The issue raised in #4889 was that by 1 shouldn’t make a difference. And that certainly seems reasonable, it shouldn’t. So assuming the output of the first example is correct, this PR fixes the bug that by 1 should be a noop.

It’s a separate question, which I think @zdenko is alluding at in his last comment, as to whether the output from the first example is correct. But I think we should treat that as a separate bug. Could we get one PR that makes the existence or nonexistence of by 1 irrelevant, and a separate issue/PR to discuss what the “correct” output should be in cases like the first example?

zdenko · 2018-02-13T12:55:04Z

@GeoffreyBooth my comment about the correctness of the output was regarding the second example, e.g. for - range - by.

Besides fixing infinite loop, #4853 also caused a breaking change of the for - range - by loop compilation, but I don't think this change is a bug.
And, I think that the result ([ 'i = 0', 'j = 1', 'i = 1' ]) user from #4889 expects is wrong.

Options:
a) we merge this PR which will revert the output of for - range - by loop, and then create another PR to discuss changes in the output.
b) discard this PR and keep changes from the #4853 (with minor improvements in a separate PR)

I vote for b).

Why the different result?

The results from for x in [a..b] and for x in [a..b] by 1] are different because for - range and for - range - by loops compiles differently.

for x in [a..b]

No (significant) changes before and after #4853:

for (x = a; a <= b ? x <= b : x >= b; a <= b ? ++x; --x)

The value of increment/decrement step is defined by the relation of a and b.

for x in [a..b] by s

Before #4853:

for (x = a; s > 0 ? x <= b : x >= b; x += s)

The loop condition basically says if the step is larger then 0 it also means that the end index ('b') of the range is larger then the start index (a).

After #4853:

for (x = a; a <= b ? a <= x <= b : a >= x >= b; x += s)

The loop condition is changed: x must be between a and b

The original intention of this change was to prevent possible infinite loops. Back then I wasn't paying additional attention to the output (although I should), as all existing tests have passed.
But, using the value of 'by' to define loop condition seems wrong to me. The value of 'by` is an explicit increment/decrement step set by the user.

Here is the table that shows the difference between the for - range - by from the second example:

i	j	range	condition before	condition after	result before	result current
0	1	[1..1] by 1	j <= 1	1 <= j <= 1	[i = 0, j = 1]	[i = 0, j = 1]
1	2	[2..1] by 1	j <= 1	2 >= j >= 1	[i = 0, j = 1, i = 1]	[i = 0, j = 1, i = 1, j = 2]
1	3	[2..1] by 1		2 >= j >= 1		[i = 0, j = 1, i = 1, j = 2]

GeoffreyBooth · 2018-02-13T18:31:38Z

I don’t have time right this minute to dig into this, but do you mind opening a PR that’s limited to making by 1 and the lack of by equivalent? I think we can all agree that that is a bug that should be fixed, and we should get that merged in.

As for changing/correcting the interpretation of by, I think I’ll probably agree with you, but I need to work through some examples first. This is complicated enough that I think it would benefit from its own PR for focus/discussion. That could maybe be this PR, if you want to split out the by stuff to another one.

zdenko · 2018-02-13T20:13:35Z

Agree. I'll prepare a couple of tests for this one and create another PR for changing the interpretation of by.

GeoffreyBooth · 2018-02-18T01:26:02Z

@zdenko If you don’t mind, let’s wrap up this work (both fixing the by 1 inconsistency and any change to how ranges are handled, as a separate PR) so that I can get another bugfix patch release out. Once that’s out, we can merge in some of the new features/changed output PRs, which should trigger a new minor release (2.3.0).

zdenko · 2018-02-18T06:21:37Z

@GeoffreyBooth I added a simple test, a code from the example.
Will you merge this into master, and I'll start another PR to deal with range by inconsistency.

GeoffreyBooth · 2018-02-18T07:00:00Z

test/control_flow.coffee

+  result = []
+  for i in [0..n]
+    result.push i
+    for j in [(i+1)..n] by 1


Let’s either have both for loops say by 1 or have neither do so.

If it makes a difference, in the other PR maybe we can copy this test with the other case (like one test with no by, and the other test with two by 1s).

Without by 1 the result will be [ 0, 1, 1, 2, 1 ]. I used the same code as the user in #4889 to show the final (expected) result. Adding by 1 to the first loop will make no difference, but it's not actually the point of this PR (or #4889).
The second loop range is [1..1] in the first iteration and [2..1] in the second, and this is the cause of different results if by 1 is used or not.
The purpose of this PR was to revert the compilation of for - range - by so the result would be the same as before #4853.

I don’t think we need a PR that produces the result the user expects, if we think that that result is wrong. I thought the plan was to have two PRs:

one so that by 1 and a lack of by are always equivalent

one that fixes the range output to be what you think it should be

Oh, sorry, my bad.
In that case, if you agree, I suggest just one PR which will cover both issues, as changed range output is prerequisite for correct by 1 and w/o by output.

GeoffreyBooth · 2018-02-18T11:08:30Z

Thinking out loud, starting with my example above:

results = []
n = 1

for i in [0..n]
  results.push 'i = ' + i
  for j in [(i+1)..n]
    results.push 'j = ' + j

This could be rewritten as:

results = []
for i in [0..1]
  results.push 'i = ' + i
  for j in [(i+1)..1]
    results.push 'j = ' + j

Now is where things get tricky. So the first range is 0..1, and so the first for loops twice, once where i is 0 and once where i is 1 (remember that .. is inclusive).

So there are two second for loops created, once for each i. Those two ranges are 1..1 and 2..1. There’s no by here, so what are we to do for 2..1? Nothing at all, since a lack of by implies by 1 but we can’t count up here? Or do we see that the start is greater than the end, and imply by -1 instead?

The docs actually have an answer to this question. The example for ranges is this:

countdown = (num for num in [10..1])

Which returns [10,9,8,7,6,5,4,3,2,1], both now and in 1.12.7. So clearly the expectation is that we check whether we should count up or down, and choose by 1 or by -1 as appropriate.

So to unpack the loops:

results = []
i = 0
results.push 'i = ' + i
for j in [1..1]
  results.push 'j = ' + j
i = 1
results.push 'i = ' + i
for j in [2..1]
  results.push 'j = ' + j

And finally:

results = []
i = 0
results.push 'i = ' + i
j = 1
results.push 'j = ' + j
i = 1
results.push 'i = ' + i
j = 2
results.push 'j = ' + j
j = 1
results.push 'j = ' + j

And so we get [ 'i = 0', 'j = 1', 'i = 1', 'j = 2', 'j = 1' ], which my “by hand” calculations determines to be what I would assume is the “correct” answer.

So I guess what I was assuming in earlier posts was wrong: a lack of by doesn’t always mean that there’s an implied by 1, but rather that there’s an implied by 1 or by -1, depending on if the start of the range is greater or less than the end of the range. (If the start and end are equal, it doesn’t matter what the by value is, because the range returns just that one value.)

But what are we supposed to do with a range of [2..1] and by 1? You can’t count up from 2 to 1. In 1.12.7, [10..1] by 1 is an empty array. That feels correct to me. In 2.2.1, it’s [10]—as in, it does one loop first, then increments up and then checks that it’s still within range. I can see the logic in this interpretation too.

Since both interpretations of [10..1] by 1 are reasonable, I think we should stay with the 1.x handling unless there’s a persuasive case to be made for why it’s wrong. Otherwise the new output is just an arbitrary breaking change. Surely it must be possible to avoid infinite loops while preserving the empty-array output of [10..1] by 1 that we got in 1.x.

zdenko · 2018-02-18T14:51:08Z

Surely it must be possible to avoid infinite loops while preserving the empty-array output of [10..1] by 1 that we got in 1.x.

@GeoffreyBooth this PR does exactly that. All tests from #4853 are included, I just corrected the empty array result in cases like [2..1] by 1.

Since both interpretations of [10..1] by 1 are reasonable, I think we should stay with the 1.x handling

I think the opposite. Here is simplified version of how for x in [a..b] by s is compiled in nodes:

for (x = a; s > 0 ? x <= b : x >= b; x += s)

The part s > 0 feels wrong to me. Step s in explicit increment/decrement set by the user and shouldn't be used in the condition part.
Let's see what happens if we use numbers:

for x in [2..1] by 1
# for (x = 2; 1 > 0 ? x <= 1 : x >= 1; x += 1)

The result is an empty array because x <= 1 (e.g. 2 <= 1) fails. It seems logic at first as you can't go up from 2to 1 by 1.

for x in [2..2] by 1
# for (x = 2; 1 > 0 ? x <= 2 : x >= 2; x += 1)

The result in this case will be [2]. So, here we say go up from 2 to 2 by 1 and we'll get [2]

for x in [2..1] by -100
# for (x = 2; -100 > 0 ? x <= 1 : x >= 1; x += -100)

This reads as go down from 2 to 1 by -100. The result is [2].

It's probably just me, but I read for x in [a..b] as "go from a to b" and for x in [a..b] by s as "go from a to b with step of s".
So, the only difference is the step. The former will be implicitly set by CS, the latter by the user.
If by is used as the condition, then it actually means "if s is positive go up to b, otherwise go down to b". The a is disregarded in this case, and IMHO it shouldn't be.
The loop starts with a and proceeds to b by given step:

for x in [2..1] by 1    # = [2]
for x in [2..1] by -100 # = [2]
for x in [2..2] by 1    # = [2]
for x in [2..1] by 0    # = [2]

I know this is a breaking change, but I'm totally OK if the current output stays.

GeoffreyBooth · 2018-02-18T16:53:45Z

I don't mean that we should keep the output of 1.x, of comparing the step against zero. I just mean that [10..1] by 1 should still return an empty array, by however method makes sense.

zdenko · 2018-02-18T22:08:52Z

Yes, I get that. I wanted to explain why the result shouldn't be an empty array.
But, let's wrap this up. This PR provides the correct result (e.g. empty array) and prevents an infinite loop.
I say we merge this.

GeoffreyBooth · 2018-02-20T00:01:38Z

@zdenko I’ve added a few more tests. Do they look correct to you?

Assuming they’re good, I’m okay with merging this in. Are there still additional changes you want to make to how ranges/by operate, that you’ll be opening a new PR for?

zdenko · 2018-02-20T08:37:32Z

Tests are correct. I'll wait with the PR for range-by for now. It would be a breaking change to the existing code. So, maybe in CS3.

GeoffreyBooth · 2018-02-20T08:46:13Z

Thanks. Maybe open a new issue explaining it, with a little less focus on what the JavaScript output would be and a little more emphasis on what the result (empty array, array with certain values, etc.) would be. If you can explain why the current output is wrong, and others agree, it’s something we can change in 2.x; if the current output is reasonable but not what a more thoughtful approach would be, it would be good to explain what the better result would be and why, and why we should change to it.

fix jashkenas#4889

753c560

GeoffreyBooth added the change label Feb 18, 2018

zdenko added 2 commits February 18, 2018 07:07

test

1ad383a

merge master into the branch

77b9002

GeoffreyBooth requested changes Feb 18, 2018

View reviewed changes

move test from 'control_flow' to 'ranges'

758ffce

More range tests

2118c99

GeoffreyBooth approved these changes Feb 20, 2018

View reviewed changes

GeoffreyBooth merged commit 72ab6fe into jashkenas:master Feb 20, 2018

Fix #4889: for...range loop condition #4891

Fix #4889: for...range loop condition #4891

Uh oh!

Conversation

zdenko commented Feb 8, 2018

Uh oh!

vendethiel commented Feb 8, 2018

Uh oh!

GeoffreyBooth commented Feb 8, 2018

Uh oh!

zdenko commented Feb 9, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

GeoffreyBooth commented Feb 11, 2018

Uh oh!

zdenko commented Feb 13, 2018

Uh oh!

GeoffreyBooth commented Feb 13, 2018

Uh oh!

zdenko commented Feb 13, 2018

Uh oh!

GeoffreyBooth commented Feb 18, 2018

Uh oh!

zdenko commented Feb 18, 2018

Uh oh!

GeoffreyBooth Feb 18, 2018

Choose a reason for hiding this comment

Uh oh!

zdenko Feb 18, 2018

Choose a reason for hiding this comment

Uh oh!

GeoffreyBooth Feb 18, 2018

Choose a reason for hiding this comment

Uh oh!

zdenko Feb 18, 2018

Choose a reason for hiding this comment

Uh oh!

GeoffreyBooth commented Feb 18, 2018

Uh oh!

zdenko commented Feb 18, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

GeoffreyBooth commented Feb 18, 2018

Uh oh!

zdenko commented Feb 18, 2018

Uh oh!

GeoffreyBooth commented Feb 20, 2018

Uh oh!

zdenko commented Feb 20, 2018

Uh oh!

GeoffreyBooth commented Feb 20, 2018

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

zdenko commented Feb 9, 2018 •

edited

Loading

zdenko commented Feb 18, 2018 •

edited

Loading