Buevest german backtrans by BueVest · Pull Request #842 · liblouis/liblouis

BueVest · 2019-09-05T12:26:52Z

Still preliminary work. G0 is about ready for testing in a wider context. G1 should be relatively easy to add.

BueVest · 2019-09-09T20:30:59Z

Apparently, the Travis build fails with the following message.
No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#build-times-out-because-no-output-was-received

What could be going wrong?
Nothing seems to fail locally.

egli · 2019-09-10T06:35:39Z

@BueVest I just restarted that particular test and now all is green. It might have been a fluke.

BueVest · 2019-12-01T20:28:19Z

I think we should merge this PR now. It contains fully working back-translatable grade 0 and 1. Grade 2 will require some substantial development and a lot of work. So, I think it would be better to do that in another PR. Then we can start using g0 and g1 and gain some experience.

bertfrees · 2019-12-01T21:04:07Z

@BueVest OK, we'll see what we can do. It is a bit late to tell us this, and there are a lot of changes to review. Tomorrow is release day.

bertfrees · 2019-12-01T21:16:23Z

Had a quick look. Here are some comments:

* What's with the temporary test file?
* Why do we need separate "bd" tables? Couldn't both forward and backward translations be defined in the same tables?
* You have created quite a lot of duplication with the "bd" tables. Even if there is a need to have two sets of tables, could something be done about the duplication?
* Why do we need the "bd" tests?
* Changes were made to de-chardefs6.cti, but this was not reflected in the tests.
* I saw a TODO ("modify copyright message...")

BueVest · 2019-12-01T22:43:35Z

You wrote:

* What's with the temporary test file?

Sorry, I thought I had already removed it. Didn’t I move it to the proper test directory?

* Why do we need separate "bd" tables? Couldn't both forward and backward translations be defined in the same tables?

BD is for “bidirectional” (feel free to rename). So, they are in fact for both forward and backward translation. Some aspects of the Braille created by the original (non-BD) tables are less suited for back-translation, e.g. the lack of capital letters, the unconditional removal of some spaces, the use of the not so detailed accented letters etc. This corresponds to the difference between the Danish 6 dots grade 1 and 2 tables and their literary counterparts. The literary tables generate beautiful Braille for reading, but it is less suited for back-translation than the Braille created by the other tables. Hope it makes sense.
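
To make the information-loss argument concrete, here is a toy sketch. It does not use liblouis or real German braille: `CAP_SIGN`, `forward_literary`, `forward_bidi` and `backward` are hypothetical names, and plain ASCII stands in for braille cells. It only illustrates why output that drops capitals cannot be turned back into the original text, while output that marks them can.

```python
# Toy illustration only: ASCII stand-ins, not real braille or liblouis behaviour.
CAP_SIGN = "$"  # hypothetical marker standing in for a capital sign


def forward_literary(text: str) -> str:
    """Drop capitalisation, as a reading-oriented table might."""
    return text.lower()


def forward_bidi(text: str) -> str:
    """Keep capitalisation by prefixing each capital letter with CAP_SIGN."""
    return "".join(CAP_SIGN + ch.lower() if ch.isupper() else ch for ch in text)


def backward(braille: str) -> str:
    """Restore a capital wherever CAP_SIGN precedes a character."""
    out = []
    capitalise_next = False
    for ch in braille:
        if ch == CAP_SIGN:
            capitalise_next = True
        elif capitalise_next:
            out.append(ch.upper())
            capitalise_next = False
        else:
            out.append(ch)
    return "".join(out)


def round_trips(forward, text: str) -> bool:
    """True if translating forward and then backward returns the input."""
    return backward(forward(text)) == text


if __name__ == "__main__":
    sample = "Der Hund"
    print("literary:", round_trips(forward_literary, sample))
    print("bidi:    ", round_trips(forward_bidi, sample))
```

The literary-style output is perfectly readable, but the capital letters are gone for good; the bidirectional-style output carries the extra sign that makes the round trip possible.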

* You have created quite a lot of duplication with the "bd" tables. Even if there is a need to have two sets of tables, could something be done about the duplication?

Yes, that would be nice indeed, but that would require a change of the file structure for the German tables. We could move the lines that can be shared into new files, which are then shared between the forward/literary tables and the bidirectional ones. However, if this change was to be made, I would rather make it in collaboration with the original table author, and not by myself. Currently, the chardefs file is shared, because I could make the necessary changes without affecting forward translation in the original tables.

* Why do we need the "bd" tests?

The BD tests test the bidirectional tables in both directions. The other tests test the original tables, and only in the forward direction. The files could in theory be merged, but since they test two separate sets of tables, I think they should be separate yaml files.

* Changes were made to de-chardefs6.cti, but this was not reflected in the tests.

The changes are primarily to the order of character definitions to ensure correct back-translation. They should not affect forward translation. If there are any big problems with merging now, we can do it later. I just thought it would be a good idea to gain some broader user experience with the tables now. Especially, since people are currently using the “forward only” tables for back-translation with screen readers, e.g. NVDA, and flagging back-translation problems as bugs rather than requests for new features. Perhaps we should flag it as an error to use a forward table for back-translation? Just thinking…
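
The claim that reordering definitions can change back-translation without touching forward translation can be sketched as follows. This is a toy model, not liblouis code: `build_maps` is a hypothetical helper, the dot patterns are placeholders, and the rule that the first definition of a dot pattern wins on the way back is an assumption made purely for illustration.

```python
def build_maps(definitions):
    """Build forward and backward lookup maps from ordered (char, dots) pairs.

    Assumption for this sketch: when several characters share a dot
    pattern, the first one defined is used for back-translation.
    """
    forward = {}
    backward = {}
    for char, dots in definitions:
        forward[char] = dots
        backward.setdefault(dots, char)  # keep the first definition only
    return forward, backward


if __name__ == "__main__":
    # Two characters sharing one (placeholder) dot pattern, in two orders.
    accented_first = [("à", "1"), ("a", "1")]
    plain_first = [("a", "1"), ("à", "1")]

    fwd_1, back_1 = build_maps(accented_first)
    fwd_2, back_2 = build_maps(plain_first)

    print("forward maps equal:", fwd_1 == fwd_2)
    print("back-translation of '1':", back_1["1"], "vs", back_2["1"])
```

Under that assumption the forward maps are identical in both orders, while the character chosen for back-translation depends on which definition comes first.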

bertfrees · 2019-12-02T11:48:37Z

> Some aspects of the Braille created by the original (non-BD) tables are less suited for back-translation, e.g. the lack of capital letters, the unconditional removal of some spaces, the use of the not so detailed accented letters etc.

Yes, I assumed it was like the Danish tables, but still: couldn't they be combined in the same table? Or would that be too confusing?

> We could move the lines that can be shared into new files, which are then shared between the forward/literary tables and the bidirectional ones. However, if this change was to be made, I would rather make it in collaboration with the original table author, and not by myself. Currently, the chardefs file is shared, because I could make the necessary changes without affecting forward translation in the original tables.

OK. I think it would be good to do this refactoring right from the start. This also gives us the opportunity to get some feedback from the original table author.

> The BD tests test the bidirectional tables in both directions. The other tests test the original tables, and only in the forward direction. The files could in theory be merged, but since they test two separate sets of tables, I think they should be separate yaml files.

I think it might be easier to follow if the tests were grouped into the things that work the same in both versions of the tables and the things where the tables differ.

> The changes [to de-chardefs6.cti] are primarily to the order of character definitions to ensure correct back-translation. They should not affect forward translation.

Hmm, are you sure? I thought I saw some actual differences?

> If there are any big problems with merging now, we can do it later. I just thought it would be a good idea to gain some broader user experience with the tables now.

Yes, it sure is nice to get user feedback as soon as possible; however, we shouldn't use time pressure as an excuse to do things in a sloppy way. If you had opened the PR a week ago, I'm sure we would have been able to fix all the issues. I do trust that you would fix the issues afterwards if we merged the PR now, but still...

> Perhaps we should flag it as an error to use a forward table for back-translation? Just thinking…

Ideally, this is something that needs to be handled on the NVDA side. Adding this kind of metadata-based limitation to the library itself only works if tables can be selected only through metadata, which is not the case at the moment.

BueVest · 2020-02-17T22:09:42Z

What needs to be done here?

Concerning Bert's suggestions above:

* We could refactor, so that the "core" files contain all the common stuff, and the main tables contain the stuff specific to the individual tables. However, that would be some change to the original file structure, and I would rather not do that without the approval of the owner of the original tables. Is that you, @egli?
* Refactoring the yaml tests: @bertfrees, you apparently wanted to have the tests in fewer files and grouped to show the same things for the different tables. What is that supposed to look like? As far as I remember, we can run the same tests for multiple tables, but not with different flags for each table, e.g. forward for one table and both directions for another table, unless that has been changed in the code.

Any suggestions are welcome.

bertfrees · 2020-02-18T23:20:18Z

@BueVest What happened in Git? There are two seemingly identical branches that got merged. Now the diff in GitHub has become unreadable. I cleaned it up locally and also rebased onto master. If you want, I can push it.

bertfrees · 2020-02-18T23:56:45Z

> You apparently wanted to have the tests in fewer files and grouped to show the same things for the different tables.

Yes. Well, it was just a suggestion. I think it might help in understanding how the tables differ. But I don't know how feasible it actually is. It depends on how much and what kind of differences there are. And you will indeed have to run the common tests in both directions for both tables.

Which brings us to one of my other questions: Do you really need the two tables? Couldn't the backward behavior of your "bidi" table be combined with the forward behavior of the original table? Is the forward part of the "bidi" table really important to have? It's just a naive question. Maybe the tables are so different that they are not compatible at all? I don't know. I need to understand better how exactly the tables differ.

Let's assume for a moment the answer to the above question is yes (we do really need both directions of the new table). Couldn't the backward behaviors of the main and "bidi" tables be aligned? Some back-translation is always better than a back-translation that is not working at all, right?

bertfrees · 2020-02-19T10:07:37Z

I think first and foremost we need a summary of things where the two tables differ.
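
One way such a summary could be produced mechanically is to run the same word list through both tables and keep only the entries where the output differs. The sketch below assumes nothing about liblouis itself: `summarize_differences` is a hypothetical helper that accepts any two callables, and `toy_literary` and `toy_bidi` are ASCII stand-ins, not the German tables. In practice the callables might wrap calls into the liblouis bindings.

```python
from typing import Callable, Iterable, List, Tuple

Translator = Callable[[str], str]


def summarize_differences(
    words: Iterable[str], table_a: Translator, table_b: Translator
) -> List[Tuple[str, str, str]]:
    """Return (word, output_a, output_b) for every word translated differently."""
    differences = []
    for word in words:
        out_a, out_b = table_a(word), table_b(word)
        if out_a != out_b:
            differences.append((word, out_a, out_b))
    return differences


# Toy stand-ins for the two tables (hypothetical, ASCII only).
def toy_literary(text: str) -> str:
    """Drops capitalisation."""
    return text.lower()


def toy_bidi(text: str) -> str:
    """Marks each capital letter with a '$' sign."""
    return "".join("$" + ch.lower() if ch.isupper() else ch for ch in text)


if __name__ == "__main__":
    for word, lit, bidi in summarize_differences(
        ["Haus", "und"], toy_literary, toy_bidi
    ):
        print(f"{word!r}: literary={lit!r} bidi={bidi!r}")
```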

BueVest · 2020-02-19T16:43:33Z

Yes, the forward part is indeed important. It creates Braille that can be back-translated, whereas the Braille from the original tables can’t be back-translated. There are some distinct differences, of which I have already mentioned the marking of capital letters and the processing of accented letters. If the caller could set a flag that could then be checked by various lines in the tables, we could probably merge the two sets (#ifdef someFlag / #ifndef someFlag / #endif), but I don’t see how this could currently be done.

The whole thing about the different flavours of Braille within the same Braille code, and why some flavours are more back-translatable than others, comes down to the different ways we use Braille, i.e. reading books (possibly on paper) vs. reading and writing documents with a screen reader or on a Braille note-taker. This discussion probably applies to most languages, except where the Braille code has deliberately been changed to make back-translation easier, e.g. UEB. If you are interested, I will be happy to try to explain it in more detail, but it will probably be easier over Skype than through a PR.
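
The flag idea mentioned above (#ifdef / #ifndef / #endif) can be sketched as a small filter over table lines. This is purely hypothetical: as the comment says, liblouis has no such mechanism, and in real table files a leading `#` starts a comment. `select_lines`, the flag name `bidi` and the placeholder lines are all invented for illustration.

```python
def select_lines(lines, flags):
    """Filter lines through hypothetical #ifdef / #ifndef / #endif directives.

    `flags` is the set of flag names the caller has switched on.
    Directives may be nested; a line is kept only if every enclosing
    condition holds.
    """
    kept = []
    conditions = []  # one boolean per currently open conditional block
    for line in lines:
        parts = line.split()
        directive = parts[0] if parts else ""
        if directive in ("#ifdef", "#ifndef"):
            if len(parts) != 2:
                raise ValueError(f"malformed directive: {line!r}")
            is_set = parts[1] in flags
            conditions.append(is_set if directive == "#ifdef" else not is_set)
        elif directive == "#endif":
            if not conditions:
                raise ValueError("#endif without matching #ifdef/#ifndef")
            conditions.pop()
        elif all(conditions):
            kept.append(line)
    if conditions:
        raise ValueError("missing #endif")
    return kept


if __name__ == "__main__":
    # Placeholder lines, not real table rules.
    merged_table = [
        "shared rule",
        "#ifdef bidi",
        "rule that marks capital letters",
        "#endif",
        "#ifndef bidi",
        "rule that drops capital letters",
        "#endif",
    ]
    print(select_lines(merged_table, {"bidi"}))
    print(select_lines(merged_table, set()))
```

With one merged file, the caller's flag alone would decide whether the back-translatable flavour or the literary flavour is produced, which is what would make a single table set possible.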

bertfrees · 2020-02-19T17:05:09Z

So what about the second part of my question: Could the backward behavior of the main table be the same as that from the bidi table?

Maybe I just don't know enough details, but from my vague understanding of it it sounds like both variants of the braille are not incompatible. One just contains more information than the other (like information of capitals) that make the back-translation better.

If you want to do a Skype call to discuss this PR, that is fine for me. However I think we need some written explanation of the table anyway. It's interesting for me, but I'm not the only person who needs to know. It's also and primarily the original author of the table (Christian's colleague), and other German braille people, that need to understand.

bertfrees · 2020-02-20T12:15:49Z

Regarding the tests, I gave it another thought: for dictionary tests it doesn't matter that there are a lot of files with possible duplication. They are not meant as documentation. But it would be nice to have a YAML file that explains the differences between the main German braille code and the variant optimized for back-translation.

BueVest · 2020-02-26T17:40:11Z

> So what about the second part of my question: Could the backward behavior of the main table be the same as that from the bidi table?
>
> Maybe I just don't know enough details, but from my vague understanding of it it sounds like both variants of the braille are not incompatible. One just contains more information than the other (like information of capitals) that make the back-translation better.

Yes, you are right. A great part of the work is simply making the tables produce Braille which contains as much information as possible while still following the rules. I am not quite sure, but the original tables could probably be made to perform the same back-translation as the BD tables. In fact, they would probably perform better now than before anyway, because I have changed the order of character definitions, which are common to the two sets of tables. On the other hand, as far as I remember, the original tables have some weird work-around to get rid of unwanted capsigns. That might have to be re-worked quite a bit for proper back-translation within the same table.

I could describe the differences in forward translation between the two table sets within the bd tables themselves, provided that back-translation is simply not defined for the original tables. However, describing the whole philosophy of why you would want two sets of tables? That is somewhat harder to describe in a few words. I guess it is something which is perhaps obvious to Braille users, but perhaps not so much to many others. That is why I suggested a Skype call. If I am to describe it, I need to understand what it is that is so difficult to understand about it, so to speak.

Also, such an explanation should probably not be hidden away in a specific table, but rather be in the manual in the section about back-translation. It could be relevant for work on back-translation in any language that has advanced grade 2 or grade 3, and where the Braille code is not specifically made for automatic back-translation. Hope it makes sense.

bertfrees · 2020-02-26T18:55:48Z

> I could describe the differences in forward translation between the two table sets within the bd tables themselves, provided that back-translation is simply not defined for the original tables.

Yes! That's exactly what I'm after.

> [...] describing the whole philosophy of why you would want two sets of tables? [...] If I am to describe it, I need to understand what it is that is so difficult to understand about it, so to speak.

It is not so difficult to understand at all. I'm just asking a lot of questions (sometimes deliberately naive) to make sure that we are doing it the best possible way. For example I think whether we should try to handle both variants of the braille code within the main table (or both tables) when back-translating is a valid question. If it is indeed doable it would be a major improvement. You wouldn't need to figure out which table to select, you can just pick one and it would work.

Anyway, I'm not asking you to describe the whole philosophy of the two sets of tables. I think it would indeed be a valuable addition, and useful for developing other tables, but for now all I'm asking for is a proper description of the behavior.

bertfrees · 2020-02-26T18:58:38Z

By the way, can I push the cleaned-up branch that I have locally or not? I want to be able to look at the combined diff on GitHub, which is currently not possible.

BueVest · 2020-02-26T22:01:41Z

Yes, of course you can.

bertfrees added the back-translation (Anything related to backward translation), tables (Something that needs to be fixed in table files), and wip (Not ready yet) labels Sep 5, 2019

bertfrees added the "needs news" label (Update to NEWS file needed) Dec 1, 2019

egli added this to the 3.12 milestone Dec 2, 2019

egli modified the milestones: 3.12, 3.13 Dec 2, 2019

bertfrees mentioned this pull request Feb 4, 2020

Add tests for the latest Danish table update #880

Closed

egli removed this from the 3.13 milestone Feb 17, 2020

bertfrees added the "waiting" label (The ball is not in my court; does not mean it is stuck) Feb 17, 2020

bertfrees added the "needs fixup" label (Branch needs cleaning up before it's merged. Don't press any buttons!) and removed the "waiting" label Feb 20, 2020

BueVest and others added 26 commits March 2, 2020 15:07

Corrected metadata and added copyright message

d3f8918

Added more copyright messages

ad65656

Reorganized specs test to make it easier to see what can be properly back-translated

bdf52ab

Collected more tests in de-g0-bd-specs.yaml

035f0ee

Added tests for numbers before letters

6075665

Added numericmodechars and corrected percent, permille etc. with space

00cb2ed

Corrected som midnums which produce extra numsign when numericmodechars is used

fcfd288

Fixed a couple of accented letters plus §

efb64a0

Removed xfails from dictionary tests

213869e

Changed to using unicode-without-blank.dis and tidied up the tests a bit

74adf59

Added tables to makefile

ea6bf43

Moved tests into tests directory and edited makefiles

b40339a

Created bidirectional grade 1 table

589dd87

Created specs test

8303a2e

Added de-g1-bd-core.cti and fixed section sign

ad8d0d0

Modified title of core file, added dictionary tests, modified makefiles

7cdf4d1

fixed apostrophes and added xfails to tests

4dbfc7c

Removed temp tests from repo

8a5a285

Corrected and updated copyright messages

8720355

Added message about forward translation in the bd tables

b33a49a

Removed copyright message from de-accents-detailed.cti

dbbee6f

Added message about the original tables being for forward translation only

f56b752

Update metadata

e2043a3

and make it clear in the comments that the table is unofficial and experimental.

Include de-eurobrl6.dis also in regular German table

de16fd3

Rename "bd" table to "bidi"

51c6094

which makes more sense as an abbreviation of "bidirectional".

Add NEWS entry

aa108a5

bertfrees force-pushed the buevest_german_backtrans branch from bf4e0ed to aa108a5 March 2, 2020 14:08

egli merged commit 8265d79 into liblouis:master Mar 2, 2020

egli removed the wip (Not ready yet) label Mar 2, 2020

BueVest deleted the buevest_german_backtrans branch September 1, 2025 15:41
