Add Bangla locale to windowsPrimaryLCIDsToLocaleNames by seanbudd · Pull Request #13339 · nvaccess/nvda

seanbudd · 2022-02-14T06:33:24Z

Link to issue number:

None

Summary of the issue:

The latest PR to merge beta to master (#13338) has failed because Bangla translations were introduced and NVDA does not fully support them yet (see the AppVeyor build failure).

Description of how this pull request fixes the issue:

Add Bangla locale to windowsPrimaryLCIDsToLocaleNames

Testing strategy:

Create a try build of beta merged into this PR. Note that both unit and system tests pass.

Known issues with pull request:

None

Change log entries:

None

Code Review Checklist:

Pull Request description:
- description is up to date
- change log entries
Testing:
- Unit tests
- System (end to end) tests
- Manual testing
API is compatible with existing add-ons.
Documentation:
- User Documentation
- Developer / Technical Documentation
- Context sensitive help for GUI changes
UX of all users considered:
- Speech
- Braille
- Low Vision
- Different web browsers
- Localization in other languages / culture than English

CyrilleB79 · 2022-02-14T09:21:59Z

Hello

Here are 4 threads on the translators mailing list regarding Bangla translation: thread 1, thread 2, thread 3, thread 4

To summarize, two translators are willing to translate NVDA into Bangla/Bengali, and we are continuing the discussion privately to help them do the job. For now:

- I have added files in SRT repo to have a starting point for Bangla translation.
- The Bangla character description file already exists in NVDA, so I have copied it to the SRT repo.
- One translator has done part of the symbol translation but is blocked trying to commit his work with SVN.
- The other translator has expressed willingness to translate the interface and the user guide. However, I do not know how far along his work is or whether he has succeeded in starting it: I have had no news from him for some weeks.

For now only a translated symbol file can be added to NVDA 2022.1. But with the news I have today, I doubt that the interface file will be ready.

What do you suggest we do? I think that Bangla already appears in the language list of the General settings window in the beta branch (to be confirmed), but the interface will be exclusively in English for now.

Thanks.

CyrilleB79 · 2022-02-14T09:26:57Z

In addition, here is some review feedback on this PR:

- Could you describe in this PR how windowsPrimaryLCIDsToLocaleNames is obtained? The comment says:
  # Generated from: {x&0x3ff:y.split('_')[0] for x,y in locale.windows_locale.iteritems()}
  But:
  - This comment should at least be updated to Python 3 (iteritems -> items)
  - This comment does not seem true anymore since there are values above 1023 (0x3FF) in the dictionary.
- It would be interesting to explain in the comments, or at least in the PR, why adding this line is needed when there is already a bn item in the dictionary:
  69: 'bn'
  The same goes for sr, which appears twice (26 and 9242).
- Why do values above 1023 appear in this dictionary (1170: 'ckb', 1109: 'my', 1143: 'so')?
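For reference, a Python 3 form of the expression quoted in that comment might look like the sketch below. `locale.windows_locale` is the standard library table the comment refers to; the names `primaryLCIDs`, `largestKey` and `bnPrimary` are only illustrative. Because every key is masked with `0x3ff`, this expression on its own cannot produce a key above 1023, which suggests that entries such as 1170 were added by hand.

```python
import locale

# Python 3 form of the generating expression: keep the low 10 bits of each
# LCID (the primary language ID) and the language part of the locale name.
primaryLCIDs = {
    lcid & 0x3FF: name.split("_")[0]
    for lcid, name in locale.windows_locale.items()
}

# Masking guarantees that no generated key exceeds 0x3FF (1023).
largestKey = max(primaryLCIDs)

# 2117 (0x845) has the same primary language ID as the existing key 69 (0x45).
bnPrimary = 2117 & 0x3FF
print(largestKey <= 0x3FF, bnPrimary, primaryLCIDs.get(bnPrimary))
```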

josephsl · 2022-02-14T10:07:11Z

Hi, I vote to delay Bengali to 2022.2. If we had caught this in, say, late January, I think it would have made it into 2022.1. However, given that the beta to master and master to beta exchange was declared today, I think we don’t have time to get a new language into NVDA, especially given that it will take several weeks to test the user interface fully. Folks can say that we can improve things during the beta cycle, but it is a bit risky this time since we are dealing with a year.1 beta, which puts pressure not only on translators but also on the add-ons community. Another option is to delay the beta to master merge for about a week or so in the hope that the Bengali interface is in, but that puts pressure on translators to complete their work and test the interface. Thanks.

lukaszgo1 · 2022-02-14T11:57:03Z

[...]

* I have added files in SRT repo to have a starting point for Bangla translation

I'm not familiar with the translations system (perhaps this was always done as you've described above), so please don't take this as criticism, but IMO introducing a template .po file into SRT before it is at least partially translated causes more trouble than it is worth. Aside from the issue that a new language exists in preferences without any translated messages, it just pollutes the SVN commit history unnecessarily. I tend to agree with @josephsl that the best course of action for now would be to remove the .po template from SVN and document that only partially translated files should be committed into SVN, to avoid similar issues in the future.

CyrilleB79 · 2022-02-14T13:41:23Z

No problem with criticism. I am trying to do my best to guide the Bangla translators since no one else had answered their messages. But this is the first time I have done it, so I am open to any remarks that help me improve.

I agree that the Bangla translation should not be included in 2022.1 since it is clear that it will not be mature enough by that time. At the beginning, I had thought that it would be interesting to have a working translation framework, but I realize now that I should have waited a bit more. Sorry for the confusion.

The best way to exclude the Bangla translation should now be determined. I think that:

- In the beta branch, the bn nvda.po should be removed (so that Bangla does not appear in the language list for now), and
- An action should be taken in the SVN repo to prevent nvda.po from being copied again. It can be:
  - removing nvda.po from the SVN repo for now
  - another action in the settings file to avoid triggering the copy of nvda.po from SVN to the beta branch again

If nvda.po can remain in the SRT repo, it would be better since it can help the translator to learn to use SVN. If not, no problem, let's remove it for now.

Let's wait for NVAccess comments to decide how to resolve this issue.
Cc @seanbudd, @feerrenrut

josephsl · 2022-02-14T16:39:55Z

Hi,

After thinking about this for a while, it appears the script used to check translation progress didn't work as intended. Normally when merging in po files from Subversion (SVN) to Git, a script will ensure that at least 70% of messages were translated (in this case, at least 1805 translated messages). Therefore, it is okay to leave bn/nvda.po in the translations workflow (SVN).
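To make the 70% rule concrete, here is a minimal sketch of such a check. It is not the actual script used in the translation workflow: `countPoMessages` and `meetsThreshold` are hypothetical names, the parser is deliberately simplified (it ignores msgctxt and plural forms and counts fuzzy entries as translated), and the sample catalogue is invented for illustration.

```python
import re

_QUOTED = re.compile(r'"(.*)"\s*$')


def countPoMessages(poText):
    """Return (translated, total) for a .po file's text, ignoring the header entry."""
    translated = total = 0
    for block in re.split(r"\n\s*\n", poText.strip()):
        fields = {"msgid": "", "msgstr": ""}
        current = None
        for line in block.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            keyword, _, rest = line.partition(" ")
            if keyword in fields:
                current, line = keyword, rest.strip()
            elif not line.startswith('"'):
                current = None  # msgctxt, plural forms: ignored in this sketch
                continue
            match = _QUOTED.match(line)
            if current and match:
                fields[current] += match.group(1)
        if fields["msgid"]:  # the header entry has an empty msgid
            total += 1
            translated += bool(fields["msgstr"])
    return translated, total


def meetsThreshold(translated, total, percent=70):
    """True when at least `percent` percent of messages are translated."""
    return total > 0 and translated * 100 >= percent * total


# Invented sample: one header entry, one translated and one untranslated message.
sample = '''
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\\n"

msgid "Yes"
msgstr "TRANSLATED"

msgid "No"
msgstr ""
'''
counts = countPoMessages(sample)
print(counts, meetsThreshold(*counts))
```

A template file with no translated messages, like the bn/nvda.po discussed here, would fail this kind of check for any positive threshold.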

As for the overall purpose of this PR: we add language codes if Windows does not support them (see #8538 for the Kurdish case). I expect Bengali (bn) is not a supported LCID by default, so it makes sense to proceed with this PR by adding it to the LCID map. However, as I noted earlier, introducing a new language just as we are exchanging data between the master and beta branches is risky. This is riskier now because:

1. There is no word from Bengali translators about adjusting to the translations workflow (I did review the threads noted here). If at least one message was translated, then I think it makes sense to proceed with this PR.
2. Building from item 1, no messages were translated into Bengali at this time.
3. If it turns out to be an issue with the translations workflow script, then it might be possible that other cases, i.e. untranslated languages, could emerge.

One thing I have learned as a translator and from managing the translations forum and workflow in general: timing and communication are very important. More importantly, the willingness of translators to communicate with the community, learn about the workflow, and keep translations updated and as accurate as possible (not just messages, but also being sensitive to cultural expectations of the local community) is important, especially considering that NVDA is promoted as a multilingual screen reader.

Therefore, in light of the situation here, and since Bengali translators are new to translations workflow and contribution in general, I propose to look at the check translation script to make sure this does not happen again. Practically speaking, I would like to suggest the following actions:

1. Remove bn/nvda.po from the NVDA beta branch.
2. Check the translations workflow script to see what's going on.
3. Do not proceed with this PR until folks get communication going with Bengali translators. In other words, I propose not setting a milestone for this PR.
4. In the near term (hopefully when NVDA 2022.1 beta is in circulation), contact translators listed in the addresses database about their commitment to NVDA translations, specifically for languages that haven't seen any activity whatsoever for at least a year. If translators say they are willing to maintain their languages, have a discussion with them so they can work on l10n data between the 2022.1 release and the 2022.2 beta. If not, consider deleting l10n data from SVN.

As for item 4, I think it is something that translations list should have a serious discussion about (note that I'm not part of the translations list anymore).

Thanks.

lukaszgo1 · 2022-02-14T19:16:25Z

After thinking about this for a while, it appears the script used to check translation progress didn't work as intended. Normally when merging in po files from Subversion (SVN) to Git, a script will ensure that at least 70% of messages were translated (in this case, at least 1805 translated messages).

Has this ever worked for translations of NVDA itself? Looking at the scripts in mrConfig, this check applies only to the translations for add-ons. I agree that extending it to the .po files for NVDA is the right thing to do.

josephsl · 2022-02-14T19:20:01Z

Hi, yes: the check po script was originally intended for NVDA Core and was later extended to cover add-ons. Thanks.

seanbudd · 2022-02-15T07:40:10Z

@CyrilleB79 -

Could you describe in this PR how windowsPrimaryLCIDsToLocaleNames is obtained? The comment says:
# Generated from: {x&0x3ff:y.split('_')[0] for x,y in locale.windows_locale.iteritems()}
But:
- This comment should at least be updated to Python 3 (iteritems -> items)
- This comment does not seem true anymore since there are values above 1023 (0x3FF) in the dictionary.

That's correct. I think this PR needs to be updated to be more explicit about how this is obtained.

It would be interesting to explain in the comments or at least in the PR why adding this line is needed whereas there is already a bn item in the dictionary:
69: 'bn'
The same for sr that appears two times (26 and 9242).

Hopefully this should be covered in the description of the variable. Languages can be mapped to many locales (eg en_AU, en_US). This is a map of locale identifiers in Windows, to just their language code (with other parts of the locale stripped).
In this case, this locale is used in the translation system, but is not listed by locale.windows_locale.items().

Why do values above 1023 appear in this dictionary (1170: 'ckb', 1109: 'my', 1143: 'so')?
These are additions as we have these locales in the translation system, but they are not listed by locale.windows_locale.items().

These items should probably be split out into windowsLCIDsToLocaleNames and used when they are needed.
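One way to picture that split is sketched below. Only the name windowsLCIDsToLocaleNames and the five LCID values come from this discussion; `languageFromLCID` is a hypothetical helper, and the real lookup in NVDA may differ. The hand-maintained full LCIDs are checked first, and only then is the LCID masked down to its primary language ID.

```python
import locale

# Hand-maintained full LCIDs, using the values listed in this discussion.
windowsLCIDsToLocaleNames = {
    1170: "ckb",
    1109: "my",
    1143: "so",
    2117: "bn",
    9242: "sr",
}

# Primary-language entries generated from the standard library table.
windowsPrimaryLCIDsToLocaleNames = {
    lcid & 0x3FF: name.split("_")[0]
    for lcid, name in locale.windows_locale.items()
}


def languageFromLCID(lcid):
    """Hypothetical helper: explicit full-LCID entries win over the masked lookup."""
    if lcid in windowsLCIDsToLocaleNames:
        return windowsLCIDsToLocaleNames[lcid]
    return windowsPrimaryLCIDsToLocaleNames.get(lcid & 0x3FF)


print(languageFromLCID(2117), languageFromLCID(1170), languageFromLCID(0x0C09))
```

Keeping the two tables separate would make the generating comment accurate again for the primary table, since every hand-added exception lives elsewhere.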

We should probably avoid relying on LCIDs in future: LCID deprecation

seanbudd · 2022-02-15T07:44:48Z

The existing work done in characterDescriptions.dic and cldr.dic is valuable, and an improvement to NVDA.
Instead, we can just hide bn from the UX.

josephsl · 2022-02-15T07:46:57Z

Hi, I think that’s a doable short-term solution. Of course the long-term solution is communicating with Bengali translators. Thanks.

josephsl · 2022-02-15T07:52:51Z

Hi, as a follow-up thought: I think we might as well use this opportunity to spell out the order of things to be translated and committed. The order of precedence (as noted in the “translating NVDA” document) is the user interface (nvda.po) as the highest priority, followed by character descriptions and symbols, then, if time permits, the user guide and what’s new document, and finally, based on feedback from the language community, add-ons. Of course we should also think about a possibility like the one we are facing at the moment (no user interface translation, but we do have character descriptions and CLDR data). Thanks.

seanbudd · 2022-02-15T08:13:20Z

@josephsl
For now we propose keeping 'bn' in the UI. We already have several other languages with low completion, there isn't any harm in a user selecting 'bn' (missing translations fall back to English), and we don't want to have to manually re-enable 'bn' later.
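The fallback behaviour mentioned here can be illustrated with the standard library's gettext module. This is only a sketch of how gettext behaves in general, not NVDA's own language-loading code; the "nvda" domain name and the empty temporary directory are assumptions made for the example.

```python
import gettext
import tempfile

# An empty directory stands in for a locale tree with no compiled bn catalogue.
with tempfile.TemporaryDirectory() as emptyLocaleDir:
    translation = gettext.translation(
        "nvda", localedir=emptyLocaleDir, languages=["bn"], fallback=True
    )

# With fallback=True and no catalogue found, gettext returns a NullTranslations
# object, so every message comes back as the original English source string.
message = translation.gettext("General settings")
print(message)
```

The same applies to a catalogue that exists but lacks a given message: gettext returns the untranslated source string.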

Also, discussions surrounding translations are better suited to a separate issue/discussion or the mailing list.

josephsl · 2022-02-15T08:17:08Z

Hi, in that case, let’s do as proposed – keep bn interface and transfer communication with translators to translations list. Thanks.

CyrilleB79 · 2022-02-15T08:27:38Z

Initially, I asked the bn translators (Fahim FARHAN ISHRAK and Abu Faraj) to continue the e-mail exchanges privately to set up SVN and so on, since they needed very basic instructions and I did not want to clutter the list with a lot of messages, the translator mailing list being described as a low-traffic mailing list.

But given our discussion here, I understand the need to keep exchanges with the bn translators public on the translator mailing list. I will send a message asking them to use the list instead of our private e-mail exchanges from now on.

lukaszgo1 · 2022-02-15T12:45:06Z

The existing work done in characterDescriptions.dic and cldr.dic is valuable, and an improvement to NVDA. Instead, we can just hide bn from the UX.

These weren't added by the new translations team for Bengali and therefore are out of scope here:

- cldr.dic was generated automatically and was present in 2021.3.
- Character descriptions also aren't new in 2022.1.

For now we propose keeping 'bn' in the UI. we already have several other languages with low completion, there isn't any harm for a user selecting 'bn' (missing translations fall back to English), and we don't want to have to manually reenable 'bn' later.

Introducing a new language without a single translated message is pretty illogical. For an end user who decides to switch their NVDA to Bengali, it can even be perceived as a bug.

seanbudd · 2022-02-16T01:32:43Z

Closing in favour of #13342.
Note that the addition of Bangla to beta is due to the translation system, and introducing it to alpha will happen with a future merge of beta to master.
These PRs are for improving locale support, in preparation of this.

@CyrilleB79

Summary of the issue:

The latest PR (#13338) to merge beta to master has failed, due to Bangla translations being introduced and windowsPrimaryLCIDsToLocaleNames not listing the locale. AppVeyor build failure.

On #13339, @CyrilleB79 raised that this variable is poorly documented and outdated. The initial variable provided a mapping of LCIDs to language codes, without the rest of the locale. We now normalize the locale as necessary in normalizeLanguage instead.

On further inspection, it appears that the only additions from locale.windows_locale are as follows:

    {
        1170: 'ckb',
        1109: 'my',
        1143: 'so',
    +   2117: 'bn',
        9242: 'sr',
    }

As locale.windows_locale is incomplete, these were introduced to ensure languages translated in NVDA could be mapped. Instead, the Windows function LCIDToLocaleName can be used to get each of these locales. This was suggested in #4203. However, Windows maps 1170 to "ku-Arab-IQ", not "ckb", and a translation is added for Central Kurdish in localesData.LANG_NAMES_TO_LOCALIZED_DESCS["ckb"]. NVDA may drop "Arab-IQ" from this locale to get the language, losing the locality of "Central Kurdish".

Description of how this pull request fixes the issue:

Removes windowsPrimaryLCIDsToLocaleNames, instead use LCIDToLocaleName after checking for an internal mapping (eg for "ckb").
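The approach described in that summary could be sketched roughly as follows. This is an assumption-laden illustration rather than the code merged in #13342: `INTERNAL_LCID_MAPPING` and `lcidToLocaleName` are hypothetical names, and the only grounded details are the Windows function LCIDToLocaleName (called here through ctypes) and the 1170 to "ckb" exception mentioned above.

```python
import ctypes
import sys

# Maximum locale name length defined by the Windows API, including the terminator.
LOCALE_NAME_MAX_LENGTH = 85

# Hypothetical internal mapping, consulted before asking Windows.
# 1170 is kept as "ckb" because Windows reports it as "ku-Arab-IQ".
INTERNAL_LCID_MAPPING = {1170: "ckb"}


def lcidToLocaleName(lcid):
    """Return a locale name for an LCID, or None if it cannot be resolved."""
    if lcid in INTERNAL_LCID_MAPPING:
        return INTERNAL_LCID_MAPPING[lcid]
    if sys.platform != "win32":
        return None  # LCIDToLocaleName only exists on Windows
    buffer = ctypes.create_unicode_buffer(LOCALE_NAME_MAX_LENGTH)
    # Returns the number of characters written (including the terminator), or 0 on failure.
    length = ctypes.windll.kernel32.LCIDToLocaleName(
        lcid, buffer, LOCALE_NAME_MAX_LENGTH, 0
    )
    return buffer.value if length else None


print(lcidToLocaleName(1170))
```

Checking the internal mapping first is what preserves "ckb"; any further normalisation of the name Windows returns is left out of this sketch.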

Add Bangla locale to windowsPrimaryLCIDsToLocaleNames

ab624b8

This was referenced Feb 14, 2022

Merge beta to master #13338

Merged

Merge master to beta #13337

Merged

seanbudd marked this pull request as ready for review February 14, 2022 07:17

seanbudd requested a review from a team as a code owner February 14, 2022 07:17

seanbudd requested a review from michaelDCurran February 14, 2022 07:17

seanbudd mentioned this pull request Feb 16, 2022

Remove windowsPrimaryLCIDsToLocaleNames #13342

Merged

11 tasks

seanbudd closed this Feb 16, 2022

seanbudd deleted the addBangla branch February 16, 2022 02:26

