Implemented inpos and outpos in lou_checkyaml.c.#430
Conversation
… to brl_checks.c to distinguish between inpos, outpos and cursorpos.
| } else if (!strcmp(option_name, "cursorPos")) { | ||
| *typeform = read_typeforms(parser, wordLen); | ||
|
|
||
| // FIXME: Comparisons should be truly case insensitive, but apparently no standard function for that. |
There was a problem hiding this comment.
I would personally prefer that the option names are not case insensitive at all.
There was a problem hiding this comment.
There were already some inconsistancy between cursorPos (cap P) and inpos and outpos in other contexts (small p). You could say that cursor position is clearly two words, while inpos and outpos are not necesarily two words. Nevertheless, it is a potential source of confusion. So, I thought we should be lenient on cases. Either that or stick to lower case completely.
There was a problem hiding this comment.
Let's just call it "inputPos", "outputPos" and "cursorPos" for now. Hopefully we can abstract out inputPos and outputPos in the future anyway. And eventually when cursorPos is only a single number I'd like to rename it to simply "cursor" ("pos" is kinda redundant).
|
I had a quick look and it looked alright. @egli maybe you can have a look too? Ideally I'd like to have a better syntax, but right now I don't want to get in the way of Bue who wants to actually write some tests. That is the most important thing right now. |
|
Having only looked at it very briefly I can say that it looks good. There are maybe a few things that I would have done differently and there is a buffer overrun lurking. It could probably go in and I can clean it up afterwards. But what I'm really curious is: How does a inpos YAML test look like? |
|
You'll see when Bue has made some :) |
egli
left a comment
There was a problem hiding this comment.
Looks good generally. You should be the maintainer :-)
| if ((const char *)event.data.scalar.value == tail) | ||
| error_at_line(EXIT_FAILURE, 0, file_name, event.start_mark.line + 1, | ||
| "No digits found in inpos position '%s'. Must be a number\n", | ||
| event.data.scalar.value); |
There was a problem hiding this comment.
This is a buffer overrun waiting to happen. Never read more than you allocated. Add a condition to the while loop (i has to be smaller than len).
There was a problem hiding this comment.
It was code that I took from the original read_cursorPos() function. I will have a look at it.
There was a problem hiding this comment.
Ah, hehe. I guess at the time I hadn't been burned by the CVEs. Yes looking at it it looks like it has the same problems. Basically it should be smth along the line of
while ((parse_error = yaml_parser_parse(parser, &event)) &&
(event.type == YAML_SCALAR_EVENT) &&
(i < len)) {in all places
| if ((const char *)event.data.scalar.value == tail) | ||
| error_at_line(EXIT_FAILURE, 0, file_name, event.start_mark.line + 1, | ||
| "No digits found in outpos position '%s'. Must be a number\n", | ||
| event.data.scalar.value); |
| "Not a valid outpos position '%s'. Must be a number\n", | ||
| event.data.scalar.value); | ||
| if ((const char *)event.data.scalar.value == tail) | ||
| error_at_line(EXIT_FAILURE, 0, file_name, event.start_mark.line + 1, |
There was a problem hiding this comment.
I think this code path indicates that chars other than digits where found in the value, so there could be digits but there definitely are also other non-digits characters. Maybe adapt the error message.
There was a problem hiding this comment.
Forget what I said here. I think the error message is good enough
| *typeform = read_typeforms(parser, wordLen); | ||
|
|
||
| // FIXME: Comparisons should be truly case insensitive, but apparently no standard function for that. | ||
| } else if (!strcmp(option_name, "inPos") || !strcmp(option_name, "inpos")) { |
There was a problem hiding this comment.
Not sure if I like to API so relaxed, i.e. allowing multiple ways to specify the test ("inpos" and inPos"). Do we do that in other places too?
There was a problem hiding this comment.
No I don't think so. I agree that one way should be enough.
| if (xfail != check_cursor_pos(*table, word, cursorPos)) { | ||
| if (description) fprintf(stderr, "%s\n", description); | ||
| error_at_line(0, 0, file_name, event.start_mark.line + 1, | ||
| if (inPos || outPos || cursorPos) { |
There was a problem hiding this comment.
Why is there this if a or b or c, followed by if a? Couldn't this just be a series if if a else if b else if c?
There was a problem hiding this comment.
Bue just adapted to the existing else if (hyphenation) below. I think there should be an error message though when cursorPos is specified in hyphenation test mode.
|
Here is an example that I made to fail to check what it looks like:
This produces the following result: Inpos failure: Note that "altid" (always) is a word contraction, so it is atomic. All positions should be 0. |
|
I'd just merge, we can fix things afterwards if needed. |
|
@bertfrees are you talking to me? |
|
@egli you're the only one here. |
|
Please, wait a sec. I am editing just now. I would also like to add to the documentation.
|
|
OK. |
|
How does the code look now? The following tests are the same as those given as examples in the doc. They can be copied into a file, e.g. inpos_outpos_simple.yaml, to illustrate how it works. However, they depend on en-us-ctb, so, perhaps they should not be merged into the project table: [tables/unicode.dis, tables/en-us-g2.ctb]
tests:
-
- went
- ⠺⠢⠞
- inputPos: [0,1,3]
-
- went
- ⠺⠢⠞
- outputPos: [0,1,1,2]
-
- went
- ⠺⠑⠝⠞
- cursorPos: [0,1,2,3] |
|
Great. Looks very good! The only thing I wouldn't have done are these checks: if (val >= wrdlen)
error_at_line(EXIT_FAILURE, 0, file_name, event.start_mark.line + 1,
"Input position too great: %d.\n" +
"In this test, input position cannot be greater than %d.", val, wrdlen-1);They are not needed to make the code safe. It is true that these conditions need to be met for the test to pass, however there are more conditions (e.g. the array must be monotonically non-decreasing) and these conditions are automatically checked by the test itself. But don't worry about this now. We can tweak things later. |
|
@bertfrees well I think the test is usefull to give decent error messages. Are these tests faulty or is there a problem with the |
| passed in using the ``flow style'' notation. | ||
| behavior as in the following example: | ||
| note that compbrlAtCursor is implicitly specified for all cursor positions. This makes this test suitable only for testing a single word, since the translation would otherwise vary according to the cursor position. | ||
|
|
There was a problem hiding this comment.
I see that brl_checks.c does indeed hard code the mode when invoking the cursor position tests. You seem to say that this is wrong. But should we not fix this behaviour instead of documenting it?
Ok but then I'd say implement the other conditions (such as monotonically non-decreasing) too. Anyway, let's not put any more effort in it now. This PR was meant to implement a quick solution, not a perfect one. |
|
Me too. But I didn’t know if it was intended or unintended. I also think that cursorPos should be one int, not a list, but, as Bert says, let’s do that later.
I would now like to make the tests that Bert originally asked for to show expected and actual behavior for multi-pass. Perhaps that should be in another PR.
|
|
Well, I'm happy to merge. But I need the test suite to pass. We can either
|
|
Just comment out those tests. They suck. |
|
Alternatively Bue could just base his new PR with the tests on this one. Then we have some time to fix the failing tests. But honestly I would just wait with that until the cursorPos test feature is fixed (see #133 for what is wrong with it) and look at the original JSON tests for the expected output. |
|
Excuse me. I can see why the harness test fails, but why the inpos test. I don’t think I changed anything about that one.
|
|
Sorry, I actually did modify the inpos test, but it was on another branch, which I didn’t commit, so I forgot about it.
|
|
Just comment out those tests. They suck.
Do you mean that? I can easily do it, but I wouldn’t dare do it by myself.
|
|
Well, looking at the comments around those tests makes me believe that those tests do kinda suck. But nevertheless I think there is some problem with the |
|
Yes, now the context rules would act like the rest of the multi-pass rules, i.e. move the cursor to the first position after the inserted content. This is in accordance with the manual, and it may be fine when you are inserting something, e.g. a Braille indicator, but it fails when you do a replacement like in the tests where “-“ (@36) is replaced with “,-“ (@6-36) depending on the pass. In these cases, when the input cursor is on a “-“, the output cursor will be on the next letter, e.g. “o”, because “-“ is being replaced with “,-“.
Admittedly, I could have avoided this in the test by not replacing “-“ but simply inserting “,”, but in many real life situations, you have to do a replacement, not just an insertion. Da-dk-g28.ctb contains quite a few examples of this.
Currently, the context rules act more like the other pass 1 rules, i.e. placing the cursor on the first cell of the replacement string. It would be a possibility to do that also with the other multi-pass rules. That may be fine for replacement rules, but then, in the case of just inserting a Braille indicator, the cursor would be placed on the indicator and not on the following cell like with the normal Braille indicators supported directly by Liblouis.
I hope you can see the dilemma here. This is why I landed on a solution where the table author can specify the cursor position relative to the start of the replacement string. This may of course be some work to implement. So, until we can do that, perhaps the best solution would be to have all multi-pass rules act like the normal pass 1 rules, i.e. for each rule, let the first character/cell of the “string to be replaced” correspond with the first character/cell of the “replacement string”. If the string to be replaced is empty ([]), the following position in the input should be used, which should take care of some of the rules to insert extra indicators.
Like I wrote, this may not create a perfect result in all cases, but the logic will be consistent through all passes, and later, we can compensate by letting the table authors specify another cursor position relationship for each rule.
I hope this all makes sense.
Anyhow, I think we should merge, like we agreed last week. Then I will add a new test file with extensive pass 1 inpos and outpos tests to keep it separate from the multi-pass issues.
What do you think?
|
…sts are commented out, because they will pass or fail at random.
…s or outputPos. Uncovers a fail in fr-bfu-g2_harness.yaml (marked with xfail).
Also renamed some variables.
Previously it was only being updated when inputPositions and outputPositions are not both NULL, while in other functions it was updated unconditionally. Note that posMapping is not only used for computing inputPositions and outputPositions, but also for updating typebuf after pass 0. However for_updatePositions is not used in pass 0 which is why emphasis was working fine.
222dd44 to
c772eac
Compare
|
@egli Travis doesn't build this branch anymore. It says "GitHub payload is missing a merge commit (mergeable_state: "unknown", merged: false)". What should I do? |
|
I pushed it to a new branch, now it's building. |
|
If it passes I'll merge, ok? |
|
Sure feel free to merge, just be careful with merging |
|
Merged in ac59139 |
When approved, I will make multi-pass tests for inpos and outpos