Update output positions during multi pass (forward translation only) #133
ArendArends wants to merge 2 commits into
trace_translate, the general translation function in lou_translateString.c, calls translatePass for passes 2, 3 and 4. translatePass currently does not update the output positions. This modification tries to handle this. This may affect the cursor position too.
|
See also discussion on mailing list |
also add define for old version of MS compiler
|
I like these changes but I think they need to be fixed as @bertfrees says in the discussion on the mailing list. Moved to next release |
|
Yep. |
|
Hello Christian, I hope you can add my changes to logging.c. This was accidentally added to the same pull request as Bert Frees knows. |
|
Basically commit ArendArends@1a63a3d needs to be cherry-picked |
|
OK, I cherry picked the logging changes. They will be in the release. |
|
@ArendArends @bertfrees just told me that if you provide a test for the problem that the output positions aren't correct then he would fix the problem in the right place. |
|
@ArendArends would you mind providing a test for the problem that the output positions aren't correct? Thanks |
|
@jcsteh Is this why NVDA translates with |
|
@dkager commented on Jun 5, 2017, 10:12 PM GMT+10:
Yes. Also, back when we started using pass1Only, the input positions array was also broken for multi-pass. That has since been fixed, though I never got around to testing/verifying that myself. |
|
@ArendArends, are you willing to take action on this? Not being able to use multi pass reliably in NVDA has some serious drawbacks. |
|
Since this PR looks a bit abandoned, is anyone willing to fork it and review the changes? I'm not very well acquainted with the liblouis code, but I'm happy to give it a try, for example if @dkager doesn't have resources to spend on this. |
|
I'm willing to have a look if I have some good test data. |
|
I'd like to have some outputPositions and cursorPosition tests for various multipass rules. Because NVDA seems to be the only one who is using these features I think they know best what the expected behavior is and what currently goes wrong, but any help is appreciated. If I can just get a description of how the cursorPosition and outputPositions should behave for the basic operations that would already help:
Also some examples of what currently goes wrong would be nice. To ensure that the cursor never goes out of bounds sounds like an absolute minimum. The goal is to make the cursor as accurate as possible. Note that the expected behavior of inputPositions is clear (I'm using that feature myself) however some additional tests for it wouldn't hurt either. |
|
Well, I think JAWS uses them too, but apparently, they don’t use the one-pass mode. However, at least in the Danish JAWS community, there is some talk about wrong placement of the cursor after translation or back-translation, but it hasn’t been very structured yet.
I think it might be difficult for the NVDA users to test this, because one would need to enable multi-pass mode, which probably requires re-building NVDA, and that is not quite trivial.
Ok, here are some thoughts on the expected behavior:
As far as I have tested it, the cursor positions seem to make sense for pass 1. For word contractions, the cursor can only be in the first position of the word, since the word is “atomic” as far as liblouis is concerned. On the display, you can’t move the cursor to the b in “abv” (above).
In all partword contractions (begword, midword, always etc.), the cursor can only be in the first position of the contraction, e.g. position 1, 3 and 4 in “youngster” (5-13456-34-12456).
In the case of indicators, the cursor is apparently never on the indicator, but always on the following character. One could argue against this practice that in the cases where the indicator is created in liblouis as part of the character itself, e.g. accent indicator followed by a letter, then the cursor will actually be on the indicator and not the letter. The same goes for letters with letsign as part of their definition.
With multi-pass rules, I think it could be very simple, as all operations are essentially search and replace operations.
If the cursor is after or at the position where replacement happens, the cursor should be moved forward or back according to the number of characters or cells added or subtracted. We cannot know the nature of what is being replaced with what, e.g. if it is indicators or contractions, but I think it is the best we can do.
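The search-and-replace adjustment described above could be sketched roughly as follows. `shift_cursor` and its parameters are hypothetical names for illustration, not liblouis functions. One assumption goes beyond the description: a cursor that falls strictly inside the replaced span is snapped to the start of that span, in line with the "atomic" treatment of pass 1 rules.

```c
#include <assert.h>

/* Hypothetical helper (not part of liblouis): return the new cursor position
 * after old_len characters starting at `start` were replaced by new_len
 * characters. */
int shift_cursor(int cursor, int start, int old_len, int new_len) {
    assert(start >= 0 && old_len >= 0 && new_len >= 0);
    if (cursor < start)
        return cursor;                   /* before the replacement: unchanged */
    if (cursor < start + old_len)
        return start;                    /* inside the replaced span: snap to its start (assumption) */
    return cursor + (new_len - old_len); /* at or after the end: shift by the length difference */
}
```

With a replacement of three characters at position 3 by one cell, a cursor at 6 ends up at 4, and a cursor at 4 ends up at 3. A pure insertion (`old_len == 0`) at the cursor position pushes the cursor forward by the inserted length.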
|
Cool, thanks!
In a way it is reasonable that Liblouis treats all these replacement rules as atomic, if you assume that rules can be written in such a way that they represent single contractions only. I know that there are cases where this assumption can not be made, namely when more context is needed. To answer this problem we could add special opcodes to provide such context (I believe LiblouisAPH has a nice way to do this). Or, we can rely on hyphenation patterns to provide the context, via the "nocross" feature. Another thing we should ask ourselves is whether contractions are by definition atomic or not? In your "youngster" example, the expected behavior is clear I think, namely that this word consists of 3 contractions, and this also matches the current behavior of liblouis:
Your "above" example is less clear: should the "b" in "abv" be mapped to the "b" in "above" or not? And what about the "v"? You could argue that there are really two contractions:
OK, that makes sense. At least for letsign it does. I'm not sure what you mean by "accent indicator". If an accented letter needs a special indicator this is usually implemented with a multi-cell character definition, which results in the expected behavior.
For simple replacements it is indeed straightforward. But actually a multi-pass rule can also result in an insertion or a removal, in which case there are two possibilities for the output and input positions arrays respectively. In these cases we just have to choose one of the two possibilities, because we can not know the nature of the insertions or removals. |
|
I agree with you that rules covering more context than a single contraction will lead to unexpected cursor behavior. For instance, in Danish, we normally have to choose the latter of two possible partword contractions, which makes for lines like
nocross ger 1245-156
(there is also a ge – 12456, which must not be used in that example).
I also know that many tables define quite large word chunks as a single “always” rule.
In the “above” example, however, “above” would probably be considered atomic according to the Braille rules. Removing the v gives quite a different word “about”. In “itself” (1346-124), removing “x” gives “from”, also quite a different word. So, what does the “f” in “itself” stand for? “self”?
Letters with accent marks, e.g. 4-15, could be viewed in two ways according to the Braille rules of the language in question. Either, they can be viewed as multi cell characters, or they can be letters with an accent indicator. This will probably vary from language to language. The English examples are just for illustration.
A very interesting discussion indeed, but perhaps we should stick to multi-pass for now, and then take this up in another issue (smile)
Could you please give me an example of insertion or deletion where the resulting cursor position is unclear?
I would see a deletion as a replacement of a string with a 0 length string, and likewise with the insertion.
For example, what is the difference between
noback correct "foo"[]"bar" "---"
and
noback correct "foo"["-"]"bar" "---"
|
Why don't you just change this into
I agree that it probably makes more sense to treat this as atomic. However when I suggested that this could be split up into the smaller contractions "bo" and "ve" I was thinking about them as optional contractions, just like the "ge" contraction in Danish is optional. In "about", the "bo" contraction loses to the "about" contraction.
In your example, "foo-bar" would be translated to "foo---bar". It is clear that the position between "o" and "-" in the input maps to the position between "o" and "-" in the output, and that the position between "-" and "b" maps to the position between "-" and "b" in the output. However when "foobar" is translated to "foo-bar" there are two possible output positions that the input position between "o" and "b" can map to: either between "o" and "-" or between "-" and "b". It's exactly the same issue as with the insertion of braille indicators. Whether the cursor should be before or after the indicator depends on the type of indicator. With deletions there is a similar ambiguity, but this time in the input positions instead of the output positions. |
|
> nocross ger 1245-156
Why don't you just change this into nocross er 156 and make sure there is a break point at g-er, which inhibits the ge 12456 rule?
Well, I might try that sometime, except, I fear that the more rules I try to account for with hyphenation, the greater the risk of making mutually incompatible rules. Good point though.
Ok, I see now what you mean by the problems in insertions and deletions. We can’t know if the insertion is a suffix to the previous character, a separator or a prefix to the next character. However, to the best of my knowledge of Braille in different languages, it is much more common to have prefixes and separators than actual suffixes. Therefore, I would let the output position stay at “-“ in the “foo-bar” example, i.e. the first possible position.
Likewise with deletions, I would choose the first of the two possible input positions. I could come up with cases where this gives an incorrect or at least debatable result, but I don’t think we can make an absolutely waterproof solution. And even if we made an opcode to choose whether to use the first or last position, many cases could probably still be argued one way or the other.
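To make the two candidate positions concrete, here is a small sketch of the output positions for a rule that inserts cells in front of one input character. The function name and the `prefer_first` switch are hypothetical and only illustrate the choice discussed here; they are not how liblouis represents it.

```c
#include <assert.h>

/* Hypothetical sketch: fill outpos[0..in_len-1] for a rule that inserts
 * ins_len cells in front of input character `at`. With prefer_first != 0 the
 * character at `at` maps to the first inserted cell, otherwise to its own
 * cell after the insertion. */
void outpos_for_insertion(int *outpos, int in_len, int at, int ins_len,
                          int prefer_first) {
    assert(at >= 0 && at <= in_len && ins_len >= 0);
    for (int i = 0; i < in_len; i++) {
        if (i < at)
            outpos[i] = i;            /* untouched prefix */
        else if (i == at && prefer_first)
            outpos[i] = i;            /* first candidate: on the inserted cell */
        else
            outpos[i] = i + ins_len;  /* shifted by the insertion */
    }
}
```

For "foobar" translated to "foo-bar" (`at = 3`, `ins_len = 1`), the "first position" default maps the input "b" to output index 3 (the dash), while the alternative maps it to index 4 (the "b"). Deletions would need the mirrored choice in the input positions.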
Hope it makes sense somehow
|
|
OK thanks! |
|
Ok, thinking further, I could think of quite a few cases where one might want to use the last output position instead of the first one, especially when inserting during pass2-4 forward translation. So, it might be desirable to have a mechanism to control output and input position from the tables. However, I think it is important to establish a sensible default for now, so that NVDA can start to use Liblouis in multi-pass mode.
In the YAML tests, the cursorpos parameter seems to correspond to outpos, but apparently there is no way to test inpos with YAML. Is that correct? Should we just expand on the current inpos and outpos C tests?
|
|
I'm not entirely sure what the YAML framework currently can and can't do. If needed we should definitely improve it. @egli do you want to answer this? |
|
As far as I remember the YAML test tool basically just exposes the functionality of |
|
Ok. Am I right in assuming that cursorpos and outpos should give the same result if compbrlAtCursor is not used, but may give different results if compbrlAtCursor is used?
|
|
If I understand correctly when you do the following:

    cursor_pos = cursor_pos_in;
    lou_translate(table, inbuf, &inlen, outbuf, &outlen, NULL, NULL,
                  outpos, inpos, &cursor_pos, mode);

the expression

Note that there is more redundancy in the API, namely between inpos and outpos. With the current definitions of inpos and outpos, one can (almost) always be computed from the other.

In the YAML tests the

The example in the documentation doesn't make much sense by the way. The actual test it was taken from is also marked with xfail.

    - - you went to
      - ⠽ ⠺⠑⠝⠞ ⠞⠕
      - mode: [compbrlAtCursor]
        cursorPos: [0,1,2,3,4,5,6,7,8,9,10]

One thing about the

Maybe it is easier to just forget about testing all cursor positions at once, and instead just repeat the same test with as many different cursor positions as desired.

Another thing I discovered while looking at the code of lou_checkyaml.c is that the specified

Regarding a good syntax for

How about something like this:

    - - "you went to"   # input
      - "⠽⠽⠽ ⠺⠑⠝⠞ ⠞⠕"   # corresponding character in braille for each character in input
      - "⠽ ⠺⠑⠝⠞ ⠞⠕"     # braille
      - "y went to"     # corresponding character in input for each character in braille

Note that if you don't have a true monospace font (Github doesn't) this doesn't really have the desired effect. Of course this is not fully unambiguous when you have test cases with repeating characters. We could also do it like the next example, it's slightly more readable but it adds even more ambiguity, namely in the spaces:

    - - "you went to"   # input
      - "⠽   ⠺⠑⠝⠞ ⠞⠕"   # corresponding character in braille for each character in input
                        # but without repeating braille characters
      - "⠽ ⠺⠑⠝⠞ ⠞⠕"     # braille
      - "y went to"     # corresponding character in input for each character in braille
                        # but without repeating input characters

The spaces that are used instead of repeated characters could possibly be replaced with underscores or another type of spaces.

A completely different approach would be to indicate all the possible inter-character positions in the input and in the braille that correspond with each other. This resembles how I always like to visualize the mapping, namely by placing the input and output strings above each other and connecting the corresponding character boundaries in text and braille with lines. In a simple text format you can not quite do that, but this comes close. In this example I'm using vertical bars to indicate the positions:

    - - "|you| |w|e|n|t| |t|o|"
      - "|⠽| |⠺|⠑|⠝|⠞| |⠞|⠕|"

Note that with the current definitions of inpos and outpos you can not do things like this:

    - - "|foo|removal|bar|"
      - "|⠋||⠃|"

For typical applications this kind of information would be useless anyway. I imagine a cursor on a braille display always takes up one cell, and can't suddenly take on another form when you move it into the segment of the word that was removed. (To understand what I mean with "take on another form", try inserting in Emacs: "a", "b", <ZWSP>, "c" (

This also brings us back to the original goal of this issue: to fix the output positions. Given the (almost complete) redundancy between inpos and outpos I think the easiest way to fix this is to compute outpos from inpos (or more correctly: compute both of them from a common variable) all the way at the end of the translation. |
|
Yes, I see your point about a more visual test output. However, it would be somewhat difficult to interpret with speech, so it would be nice to also have the option of a numerical output like the current one for cursorpos.
I have made an implementation in lou_checkyaml.c for inpos and outpos using cursorPos as a model. When I have tested it, I will make a pull request, so that we have something to talk about.
For now, I think it is important to find some workable solution for the update of inpos and outpos during multi-pass, at the very least a solution that ensures that the positions do not go out of range.
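One way to get positions that cannot go out of range is the approach suggested earlier in the thread: derive outpos from inpos once, at the end of the translation. The sketch below is only an illustration under stated assumptions, not the liblouis implementation: `outpos_from_inpos` is a hypothetical name, inpos is assumed to be non-decreasing, and each input character is mapped to the last output cell whose input position is at or before it.

```c
#include <assert.h>

/* Hypothetical sketch: derive outpos (one entry per input character) from
 * inpos (one entry per output cell). Assumes out_len > 0 and that inpos is
 * non-decreasing. Every result lies in [0, out_len - 1] by construction. */
void outpos_from_inpos(const int *inpos, int out_len, int *outpos, int in_len) {
    assert(out_len > 0);
    int j = 0;
    for (int i = 0; i < in_len; i++) {
        /* advance while the next output cell starts at or before input i */
        while (j + 1 < out_len && inpos[j + 1] <= i)
            j++;
        outpos[i] = j;
    }
}
```

For "you went to" translated to nine cells with inpos `{0,3,4,5,6,7,8,9,10}`, this yields outpos `{0,0,0,1,2,3,4,5,6,7,8}`: all three letters of "you" map to the single contraction cell. Where an insertion or deletion leaves two candidates, this rule silently picks one of them, which is exactly the default the thread is trying to agree on.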
All tables that use multi-pass opcodes currently work very badly with NVDA, because they apparently insist on using one-pass mode until the problem has been solved. When that happens, I will be happy to help working on an improved test output and opcodes to control the cursor position from the tables.
Concerning the example of cursorpos in the docs, you are right. The compbrlAtCursor is even totally redundant, since it is apparently always true when using cursorpos. I agree that the cursorpos should be a single int, so that you can actually test what happens with the cursor at a given position. Any tests that currently use cursorPos would then probably work with outpos.
|
Well, having the choice doesn't really help because then you're still in the same situation if you want to read a test from someone else who chose the "wrong" syntax. There should be only one syntax I think. But, I'm trying to understand... Why exactly would my syntax be more difficult to read with speech than the one with numbers? Can you explain? Note that I'm only using vertical bars as an example to explain the principle. There are various other ways to get the desired effect. And what about reading it with a braille display? |
|
The short answer is that a screen or a paper is two-dimensional, while speech and Braille on a display are only one-dimensional.
On a screen or paper, you can easily see what is directly above or below something. With speech or on a Braille display, everything is presented in a serial way. You can kind of simulate two dimensions with speech by counting spaces on each line, but it is difficult to get an overview, especially with long lines or if something relates to something else several lines up or down.
On a Braille display, it is a little easier. You can keep one hand in the same place of the display while scrolling, but still, you can only see one vertical cell at a time. Solving crossword puzzles on a Braille display is not easy at all.
Of course, in an ideal world, you would have a generic test output, which the user could then present in the way that suits him or her best. In other words, separate data and presentation. That is how you make accessibility.
|
|
Ok, I thought the problem was maybe something else. Yes, of course, I understand the dimension issue. But my proposed syntax doesn't need two dimensions to be useful. You can simply check whether the segmentation of the input and the output makes sense, one at a time. A very natural thing. Of course two dimensions makes it even easier, but it is not essential. Parsing arrays of numbers in your head is more difficult in my opinion, especially in one dimension. When you are checking input positions for example, you need to keep track of where you currently are in the output string, but at the same time you need to count characters in the input string in your head. Too much thinking. |
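As an illustration of how the vertical-bar syntax proposed earlier in the thread could be turned back into position arrays, here is a rough sketch. Both function names are hypothetical, nothing like this exists in lou_checkyaml.c as far as this thread shows, and the sketch assumes UTF-8 input and the "first position of the segment" convention.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: count the characters (UTF-8 code points) in each
 * segment of a string such as "|you| |w|". Returns the number of segments,
 * or -1 if the string is malformed or has more than max_segments segments. */
int segment_lengths(const char *s, int *lens, int max_segments) {
    assert(s != NULL);
    int n = 0;
    if (*s != '|') return -1;             /* must open with a bar */
    s++;
    while (*s) {
        int len = 0;
        while (*s && *s != '|') {
            if (((unsigned char)*s & 0xC0) != 0x80)
                len++;                    /* count only UTF-8 lead bytes */
            s++;
        }
        if (*s != '|' || n == max_segments)
            return -1;                    /* unterminated segment or overflow */
        lens[n++] = len;
        s++;
    }
    return n;
}

/* Map every input character to the first output cell of its segment.
 * Returns the number of input characters written to outpos. */
int outpos_from_segments(const int *in_lens, const int *out_lens, int n,
                         int *outpos) {
    int in = 0, out = 0;
    for (int k = 0; k < n; k++) {
        for (int c = 0; c < in_lens[k]; c++)
            outpos[in++] = out;
        out += out_lens[k];
    }
    return in;
}
```

A test runner could parse the input line and the braille line separately, require the same number of segments, and compare the derived array with what the translator returns. Empty segments such as the one in "|foo|removal|bar|" parse fine here, but as noted above the current inpos and outpos definitions cannot express them.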
|
I agree that in an ideal world we would be separating data and presentation, but if we can find a presentation that is easy to read for everyone I think that's preferable. |
|
Just wanted to point out that I filed nvaccess/nvda#7702, which allows NVDA users to disable the pass1only flag. This might give more information about the what, when and how of things going wrong. |
|
I have thought a lot about this lately, and I think I have an idea that makes it quite easy and leaves the ultimate responsibility with the table authors.
With the regular pass 1 rules, Liblouis normally places the cursor at the first position that the rule applies to. The same goes for the input position. All rules are atomic. An exception is when indicators are added before a letter or a number. Then the cursor is placed on the letter or number, not the indicator, which makes good sense from a Braille perspective.
The general first position placement could and probably should also be the default behavior with multi-pass rules. We cannot know the purpose of the rules, e.g. whether indicators are being added or deleted.
Then a third optional parameter could be added to multi-pass rules after the action part to specify a cursor position relative to the first character/cell that the rule actually modifies. The up arrow “^” could be used for this. It is currently not used for anything, as far as I can see from the docs. Then, a multi-pass rule could look like this:
noback pass2 []@124-135-135 @6 ^1 # put dot 6 before "foo" and place the cursor on "f"
noback correct "foo"["-"]"bar" ? ^-1 # delete the dash and put the cursor back on "o"
nofor correct "foo"[]"bar" "-" # add "-" and let the cursor stay there
In principle, the cursor syntax could also be used with normal pass 1 rules, although I don’t see right now what it would be good for.
How does that sound?
|
|
Before adding more complexity, I would first like to try if people can achieve the desired cursor behavior by rewriting their multipass rules if needed. For example, the following rules do the same thing but result in a different cursor mapping: |
|
I'm going to close this now. The tests are being worked on in #430. |
|
I'm not exactly sure whether I understand the current status of this. Are there still noticeable bugs which will be covered by #430? |
|
Yes the bug hasn't been fixed. See the tests at https://github.com/BueVest/liblouis/blob/222dd44eb695c118dc9940c4ec0b8759b43680de/tests/yaml/inpos_outpos.yaml marked with xfail. I'm currently working on it and it will be included in release 3.4. |