Conversation
The input is generated by a custom Reader.
…er that repeats the given content, to reduce iterations when we read a large amount of data.
We actually need a lot of RAM to make this test pass.
> By default a container is given 2 CPUs and 4 GB of memory but it can be configured in .cirrus.yml https://cirrus-ci.org/guide/linux/
Don't prepare a large buffer if we read only a few characters.
And remove old `Yyxlex.java` which was submitted (probably by mistake) in commit 4db21cb
lsf37
left a comment
There was a problem hiding this comment.
I'm happy with this, esp generating the large input is nice.
We should announce this change fairly prominently, because it is API-breaking: users will have to make the same change we had to make in the example.
I wonder if we should make the type configurable (in a separate PR). The ability to handle input files >2GB is not necessarily what everyone needs, some might prefer not having to change their code.
|
Thanks for the positive feedback, Gerwin. My opinion is that:
That being said, I think 1.8 has accumulated enough changes to be released. |
Author: Régis Décamps <regisd@google.com>
Date: Wed Nov 27 00:02:51 2019 +0100
Make `yychar` long (#605)
* Replace `yychar` by long in skeletons (default & nested)
* Update example for long yyChar
* Also, add some positivity assertion in `example/simple/Yytoken` to give some defensive programming as an example.
* Update manual and READMe
* Update test added in #603 .
* Improve doc of RepeatContentReader
* Improve RepeatContentReader
Don't prepare a large buffer if we read only a few characters.
* Also update testcase/example/simple for long `yychar`.
* Delete testcase/simple which is just redundant with the example.
Updated from target/jflex-parent-1.8.0-SNAPSHOT-sources.jar
|
Yes, I meant making it prominent in the release announcement (just because API breaking changes are rare for JFlex), I agree that we don't want to announce it before. |
|
Forgot to say: I agree that 1.8 is getting there. Still labouring under the illusion that I'll manage to pull in the char class macros (#216) into 1.8. Recently made some progress on that, but there is still some way to go. |
Introduce a fakeRead() that changes `yychar` so that we can fakely jump to yychar just before Integer.MAX_VALUE and make the test run in a snap. Follow-up of jflex-de#603 and jflex-de#605
|
Follow-up comment for those who hit this breaking change. There are two options to fix it:
|
Starting from JFlex 1.8.0, `yychar` is now a long. However antlr `CommonToken` only takes an int, hence explicit casting is needed. I propose to use `java.lang.Math.toIntExact()` which throws ArithmeticException in case of overflow. This is acceptable, because the application will have an illegal state today (negative character position caused by the int overflow). See jflex-de/jflex#605 for more details.
Since JFlex 1.8.0, the `yychar` is a long. However, it is used to build an antlr `CommonToken` which takes an int. I propose to cast the `yychar` with `java.lang.Math.toIntExact()` which throw an ArithmeticException in case of overflow. This is acceptable because the previous code would have returned an overflowed (negative) position. See jflex-de/jflex#605 for details.
Starting from JFlex 1.8.0, `yychar` is a long. However, it is used to build an antlr `CommonToken` which takes an int. Hence, explicit casting is needed. I propose to cast the `yychar` with `java.lang.Math.toIntExact()` which throw an ArithmeticException in case of overflow. This is acceptable because the previous code would have returned an overflowed (negative) position. See jflex-de/jflex#605 for details.
Starting from JFlex 1.8.0, `yychar` is a long. However, it is used to build an antlr `CommonToken` which takes an int. Hence, explicit casting is needed. I propose to cast the `yychar` with `java.lang.Math.toIntExact()` which throw an ArithmeticException in case of overflow. This is acceptable because the previous code would have returned an overflowed (negative) position. See jflex-de/jflex#605 for details.
Fix #536.
When the input is larger than 2GB, the scanner internal state for the number of read characters (aka
zzchar) is negative. This obviously doesn't make sense.The test added in #603 reproduces the problem and throws an exception, when the
zzcharis negative.The problem is caused by
yycharbeing anint, hence subject to overflow.Fix #558 by making
yycharlong. This is similar to #558 by @sarowe with a lighter approach for testing.Note that some problems remain:
yylineandyycolumnare still integers.zzRefill()will throw aNegativeArraySizeExceptionlike this