Add Bangla locale to windowsPrimaryLCIDsToLocaleNames#13339

Closed
seanbudd wants to merge 1 commit into master from addBangla

@seanbudd

@seanbudd seanbudd commented Feb 14, 2022

Member

Link to issue number:

None

Summary of the issue:

The latest PR (#13338) to merge beta to master has failed, due to Bangla translations being introduced and NVDA not fully supporting them yet. AppVeyor build failure

Description of how this pull request fixes the issue:

Add Bangla locale to windowsPrimaryLCIDsToLocaleNames

Testing strategy:

Create a try build of beta merged into this PR: Note that both unit and system tests pass

Known issues with pull request:

None

Change log entries:

None

Code Review Checklist:

  • Pull Request description:
    • description is up to date
    • change log entries
  • Testing:
    • Unit tests
    • System (end to end) tests
    • Manual testing
  • API is compatible with existing add-ons.
  • Documentation:
    • User Documentation
    • Developer / Technical Documentation
    • Context sensitive help for GUI changes
  • UX of all users considered:
    • Speech
    • Braille
    • Low Vision
    • Different web browsers
    • Localization in other languages / culture than English

This was referenced Feb 14, 2022
@seanbudd seanbudd marked this pull request as ready for review February 14, 2022 07:17
@seanbudd seanbudd requested a review from a team as a code owner February 14, 2022 07:17
@CyrilleB79

Contributor

Hello

Here are 4 threads on the translators mailing list regarding Bangla translation: thread 1, thread 2, thread 3, thread 4

To summarize, two translators are willing to translate NVDA to Bangla/Bengali and we are continuing discussion privately to help them do the job. For now:

  • I have added files in SRT repo to have a starting point for Bangla translation
  • The Bangla character description file already exists in NVDA so I have copied it to the SRT repo.
  • One translator has done part of the symbol translation but is blocked trying to commit his work with SVN
  • The other translator has expressed the will to translate the interface and the user guide. However, I do not know how far along his work is or whether he has succeeded in starting it: I have had no news from him for some weeks.

For now only a translated symbol file can be added to NVDA 2022.1. But with the news I have today, I doubt that the interface file will be ready.

What do you suggest we do? For now, I think that Bangla has already appeared in the language list of the General settings window in the beta branch (to be confirmed). But the interface will be exclusively in English for now.

Thanks.

@CyrilleB79

Contributor

In addition, here is some review feedback on this PR:

  1. Could you describe in this PR how windowsPrimaryLCIDsToLocaleNames is obtained? The comment says:
    # Generated from: {x&0x3ff:y.split('_')[0] for x,y in locale.windows_locale.iteritems()}
    But:

    • This comment should at least be updated to Python 3 (iteritems -> items)
    • This comment does not seem true anymore since there are values above 1023 (0x3FF) in the dictionary.
  2. It would be interesting to explain in the comments or at least in the PR why adding this line is needed whereas there is already a bn item in the dictionary:
    69: 'bn'
    The same for sr that appears two times (26 and 9242).

  3. Why do values above 1023 appear in this dictionary (1170: 'ckb', 1109: 'my', 1143: 'so')?
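
To make the point about the mask concrete, here is a minimal sketch of the quoted expression updated to Python 3. It only uses the standard library's locale.windows_locale table; the variable names (generated, largestKey, handAdded) are illustrative and not taken from NVDA.

```python
import locale

# Python 3 form of the expression quoted in the comment (iteritems -> items).
# 0x3FF keeps the low 10 bits of an LCID, i.e. the primary language ID.
generated = {
	lcid & 0x3FF: name.split("_")[0]
	for lcid, name in locale.windows_locale.items()
}

# Every generated key fits in 10 bits, so it can never exceed 1023.
largestKey = max(generated)

# Keys above 1023 therefore cannot have come from this expression.
handAdded = [lcid for lcid in (1170, 1109, 1143, 2117, 9242) if lcid not in generated]
```

Masking the larger keys shows how they relate to the primary IDs already mentioned: 2117 & 0x3FF is 69 (the existing bn entry) and 9242 & 0x3FF is 26 (the existing sr entry).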

@josephsl

josephsl commented Feb 14, 2022 via email

Contributor

@lukaszgo1

Contributor

[...]

* I have added files in SRT repo to have a starting point for Bangla translation

I'm not familiar with the translations system (perhaps this was always done as you've described above) so please don't take this as a criticism, but IMO introducing a template .po file into SRT before it is at least partially translated causes more trouble than it is worth. Aside from the issue that there is a new language which exists in preferences but does not contain translated messages, it just pollutes the SVN commit history unnecessarily. I tend to agree with @josephsl that the best course of action for now would be to remove the .po template from SVN and document that only partially translated files should be committed into SVN to avoid similar issues in the future.

@CyrilleB79

Contributor

No problem with criticism. I try to do my best to guide the Bangla translators since no one else had answered their messages. But this is the first time I have done it, so I am open to any remarks that help me improve.

I agree that the Bangla translation should not be included in 2022.1 since it is clear that it will not be mature enough by that time. At the beginning, I had thought that it would be interesting to have a working translation framework but I realize now that I should have waited a bit more. Sorry for the confusion.

We should now determine the best way to exclude the Bangla translation. I think that:

  1. In beta branch, the bn nvda.po should be removed (to avoid Bangla to appear in the language list for now)
    and
  2. An action in the SVN repo should be done to avoid the copy of nvda.po. It can be:
    • removing nvda.po for now from SVN repo
    • another action in the settings file to avoid triggering the copy of the nvda.po again from SVN to beta branch

If nvda.po can remain in the SRT repo, it would be better since it can help the translator to learn to use SVN. If not, no problem, let's remove it for now.

Let's wait for NVAccess comments to decide how to resolve this issue.
Cc @seanbudd, @feerrenrut

@josephsl

Contributor

Hi,

After thinking about this for a while, it appears the script used to check translation progress didn't work as intended. Normally when merging in po files from Subversion (SVN) to Git, a script will ensure that at least 70% of messages were translated (in this case, at least 1805 translated messages). Therefore, it is okay to leave bn/nvda.po in the translations workflow (SVN).

As for the overall purpose of this PR: we add language codes if Windows does not support them (see #8538 for a case for Kurdish). I expect Bengali (bn) is not a supported LCID by default, so it makes sense to proceed with this PR by adding it to the LCID map. However, as I noted earlier, introducing a new language just as we are exchanging data between master and beta branches is risky. This is riskier now because:

  1. There is no word from Bengali translators about adjusting to translations workflow (I did review the threads noted here). If at least one message was translated, then I think it makes sense to proceed with this PR.
  2. Building from item 1, no messages were translated into Bengali at this time.
  3. If it turns out to be an issue with the translations workflow script, then it might be possible that other cases i.e. untranslated languages could emerge.

One thing I have learned as a translator and from managing the translations forum and workflow in general: timing and communication are very important. More importantly, willingness by translators to communicate with the community, learn about the workflow, and keep translations updated and as accurate as possible (not just messages, but also being sensitive to cultural expectations of the local community) is important, especially considering that NVDA is promoted as a multilingual screen reader.

Therefore, in light of the situation here, and since Bengali translators are new to translations workflow and contribution in general, I propose to look at the check translation script to make sure this does not happen again. Practically speaking, I would like to suggest the following actions:

  1. Remove bn/nvda.po from NVDA beta branch.
  2. Check translations workflow script to see what's going on.
  3. Do not proceed with this PR until folks get communication going with Bengali translators. In other words, I propose not setting a milestone for this PR.
  4. In the near term (hopefully when NVDA 2022.1 beta is in circulation), contact translators listed in the addresses database about their commitment to NVDA translations, specifically for languages that haven't seen any activity whatsoever for at least a year. If translators say they are willing to maintain their languages, have a discussion with them so they can work on l10n data between the 2022.1 release and 2022.2 beta. If not, consider deleting l10n data from SVN.

As for item 4, I think it is something that translations list should have a serious discussion about (note that I'm not part of the translations list anymore).

Thanks.

@lukaszgo1

Contributor

After thinking about this for a while, it appears the script used to check translation progress didn't work as intended. Normally when merging in po files from Subversion (SVN) to Git, a script will ensure that at least 70% of messages were translated (in this case, at least 1805 translated messages).

Has this ever worked for translations to NVDA? Looking at scripts in mrConfig this check applies only to the translations for add-ons. I agree extending it to the .po files for NVDA is a right thing to do.
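
As a rough illustration of what such a check amounts to, here is a minimal sketch of the threshold test on its own. The function name and the message totals are hypothetical; the actual mrConfig scripts may count messages and apply the cutoff differently.

```python
def meetsTranslationThreshold(translated: int, total: int, percent: int = 70) -> bool:
	"""Return True when at least `percent`% of `total` messages are translated."""
	if total <= 0:
		# An empty catalog can never satisfy the threshold.
		return False
	# Integer arithmetic avoids floating point rounding at the boundary.
	return translated * 100 >= total * percent


# Hypothetical catalog of 2578 messages, chosen so the cutoff lands at 1805.
untranslatedTemplate = meetsTranslationThreshold(0, 2578)
justEnough = meetsTranslationThreshold(1805, 2578)
oneShort = meetsTranslationThreshold(1804, 2578)
```

Under this check, an untranslated template such as the bn nvda.po discussed here would be rejected before being copied from SVN to Git.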

@josephsl

josephsl commented Feb 14, 2022 via email

Contributor

@seanbudd

Member Author

@CyrilleB79 -

  1. Could you describe in this PR how windowsPrimaryLCIDsToLocaleNames is obtained? The comment says:
    # Generated from: {x&0x3ff:y.split('_')[0] for x,y in locale.windows_locale.iteritems()}
    But:
    • This comment should at least be updated to Python 3 (iteritems -> items)
    • This comment does not seem true anymore since there are values above 1023 (0x3FF) in the dictionary.

That's correct. I think this PR needs to be updated to be more explicit about how this is obtained.

  2. It would be interesting to explain in the comments or at least in the PR why adding this line is needed whereas there is already a bn item in the dictionary:
    69: 'bn'
    The same for sr that appears two times (26 and 9242).

Hopefully this should be covered in the description of the variable. Languages can be mapped to many locales (e.g. en_AU, en_US). This is a map of locale identifiers in Windows to just their language code (with other parts of the locale stripped).
In this case, this locale is used in the translation system, but is not listed by locale.windows_locale.items().

  3. Why do values above 1023 appear in this dictionary (1170: 'ckb', 1109: 'my', 1143: 'so')?

These are additions, as we have these locales in the translation system, but they are not listed by locale.windows_locale.items().

These items should probably be split out into windowsLCIDsToLocaleNames and used when they are needed.

We should probably avoid relying on LCIDs in future: LCID deprecation
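
A minimal sketch of what that split could look like, assuming a lookup that consults the full-LCID table first and falls back to the primary language ID. The helper lcidToLanguage and the trimmed-down primary table are hypothetical; only the dictionary names and the listed entries come from this discussion (the 9: 'en' entry is included purely to illustrate the en_AU case).

```python
from typing import Optional

# Keyed by primary language ID (the low 10 bits of an LCID). Illustrative subset.
windowsPrimaryLCIDsToLocaleNames = {
	9: "en",
	26: "sr",
	69: "bn",
}

# Keyed by full LCID: entries missing from locale.windows_locale.
windowsLCIDsToLocaleNames = {
	1170: "ckb",
	1109: "my",
	1143: "so",
	2117: "bn",
	9242: "sr",
}


def lcidToLanguage(lcid: int) -> Optional[str]:
	"""Hypothetical lookup: exact LCID first, then the primary language ID."""
	if lcid in windowsLCIDsToLocaleNames:
		return windowsLCIDsToLocaleNames[lcid]
	return windowsPrimaryLCIDsToLocaleNames.get(lcid & 0x3FF)
```

With this arrangement, 0x0C09 (en_AU) resolves to 'en' through the primary table, while 1170 resolves to 'ckb' through the full-LCID table without being mistaken for a primary ID.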

@seanbudd

Member Author

The existing work done in characterDescriptions.dic and cldr.dic is valuable, and an improvement to NVDA.
Instead, we can just hide bn from the UX.

@josephsl

josephsl commented Feb 15, 2022 via email

Contributor

@josephsl

josephsl commented Feb 15, 2022 via email

Contributor

@seanbudd

Member Author

@josephsl
For now we propose keeping 'bn' in the UI. We already have several other languages with low completion, there isn't any harm in a user selecting 'bn' (missing translations fall back to English), and we don't want to have to manually re-enable 'bn' later.
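
The fallback behaviour can be seen with Python's standard gettext module. This is only a sketch of the general mechanism, not NVDA's own language-loading code: the "nvda" domain name and the empty temporary locale directory are assumptions for illustration.

```python
import gettext
import tempfile

# With no compiled catalog available for 'bn', fallback=True makes gettext
# return a NullTranslations object instead of raising an error.
with tempfile.TemporaryDirectory() as emptyLocaleDir:
	translations = gettext.translation(
		"nvda", localedir=emptyLocaleDir, languages=["bn"], fallback=True
	)

# NullTranslations returns the original (English) message unchanged.
message = translations.gettext("General settings")
```

A catalog that exists but has empty translations behaves the same way from the user's point of view: gettext returns the original message whenever no translation is found.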

Also, discussions surrounding translations are better suited to a separate issue/discussion or the mailing list.

@josephsl

josephsl commented Feb 15, 2022 via email

Contributor

@CyrilleB79

Contributor

Initially, I asked the bn translators (Fahim FARHAN ISHRAK and Abu Faraj) to continue e-mail exchanges privately to set up SVN and so on, since they needed very basic instructions and I did not want to clutter the list with a lot of messages, the translator mailing list being described as a low-traffic mailing list.

But given our discussion here, I understand the need to keep exchanges with the bn translators public on the translator mailing list. I will send a message to ask them to use the list instead of our private e-mail exchanges from now on.

@lukaszgo1

Contributor

The existing work done in characterDescriptions.dic and cldr.dic is valuable, and an improvement to NVDA. Instead, we can just hide bn from the UX.

These weren't added by the new translations team for Bengali and therefore are out of scope here:

  • cldr.dic was generated automatically and was present in 2021.3
  • character descriptions also aren't new in 2022.1.

For now we propose keeping 'bn' in the UI. We already have several other languages with low completion, there isn't any harm in a user selecting 'bn' (missing translations fall back to English), and we don't want to have to manually re-enable 'bn' later.

Introducing a new language without a single translated message is pretty illogical: an end user who decides to switch their NVDA to Bengali could even perceive it as a bug.

@seanbudd

Member Author

Closing in favour of #13342.
Note that the addition of Bangla to beta is due to the translation system, and introducing it to alpha will happen with a future merge of beta to master.
These PRs are for improving locale support, in preparation of this.

@seanbudd seanbudd closed this Feb 16, 2022
@seanbudd seanbudd deleted the addBangla branch February 16, 2022 02:26
seanbudd added a commit that referenced this pull request Feb 17, 2022
Summary of the issue:
The latest PR (#13338) to merge beta to master has failed, due to Bangla translations being introduced and windowsPrimaryLCIDsToLocaleNames not listing the locale. AppVeyor build failure.

On #13339, @CyrilleB79 raised that this variable is poorly documented and outdated.
The initial variable provided a mapping of LCIDs to language codes, without the rest of the locale.
We now normalize the locale as necessary in normalizeLanguage instead.
On further inspection, it appears that the only additions from locale.windows_locale are as follows:

{
	1170: 'ckb',
	1109: 'my',
	1143: 'so',
+      2117: 'bn',
	9242: 'sr',
}
As locale.windows_locale is incomplete, these were introduced to ensure languages translated in NVDA could be mapped.
Instead, the Windows function LCIDToLocaleName can be used to get each of these locales.
This was suggested in #4203.

However, Windows maps 1170 to "ku-Arab-IQ", not "ckb", and a translation is added for Central Kurdish in localesData.LANG_NAMES_TO_LOCALIZED_DESCS["ckb"]. NVDA may drop "Arab-IQ" from this locale to get the language, losing the locality of "Central Kurdish".

Description of how this pull request fixes the issue:
Removes windowsPrimaryLCIDsToLocaleNames and instead uses LCIDToLocaleName after checking for an internal mapping (e.g. for "ckb").
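
A minimal sketch of that lookup order, assuming a ctypes call to the Windows kernel32 function LCIDToLocaleName. The names internalLCIDsToLocaleNames, windowsLCIDToLocaleName and lcidToLocaleName, and the injectable resolver parameter, are hypothetical and are not the code merged in #13342.

```python
import ctypes
import sys
from typing import Callable, Optional

LOCALE_NAME_MAX_LENGTH = 85  # value defined in the Windows SDK headers

# Hypothetical internal overrides, consulted before asking Windows.
internalLCIDsToLocaleNames = {1170: "ckb"}


def windowsLCIDToLocaleName(lcid: int) -> Optional[str]:
	"""Ask Windows for the locale name of an LCID; None if unavailable."""
	if sys.platform != "win32":
		return None
	buf = ctypes.create_unicode_buffer(LOCALE_NAME_MAX_LENGTH)
	# Returns the number of characters written (including the null), or 0 on failure.
	length = ctypes.windll.kernel32.LCIDToLocaleName(lcid, buf, LOCALE_NAME_MAX_LENGTH, 0)
	return buf.value if length else None


def lcidToLocaleName(
	lcid: int,
	resolver: Callable[[int], Optional[str]] = windowsLCIDToLocaleName,
) -> Optional[str]:
	"""Check the internal mapping first, then defer to the resolver."""
	if lcid in internalLCIDsToLocaleNames:
		return internalLCIDsToLocaleNames[lcid]
	return resolver(lcid)
```

Passing the resolver as a parameter keeps the override logic testable off Windows: with a stand-in resolver that maps 1170 to "ku-Arab-IQ", lcidToLocaleName(1170, resolver) still yields "ckb" because the internal mapping wins.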

4 participants