Inherit symbol level for languages other than English #4
Merged
Conversation
…t from the level below.
Contributor
Author
@seanbudd, @feerrenrut please take this request into account, and let me know if the procedure is OK for this repo. Thanks.
seanbudd
approved these changes
Dec 18, 2022
seanbudd
left a comment
Member
Thanks @CyrilleB79 for fixing this. LGTM
github-actions Bot
pushed a commit
that referenced
this pull request
Dec 18, 2022
Commit message: Use symbolLevel=none only for English; for other languages, inherit it from the level below. (#4)
Fixes nvaccess/nvda#14417
Summary of the issue:
Hindi has no symbol defined in its symbol file, only a copyright header; it seems that the file was prepared for translation but no actual symbol translation took place. But there is a Hindi CLDR file. Thus the symbol level for symbols such as common punctuation (dot, question mark, etc.) is the one from CLDR, i.e. none. This is not appropriate, and it would be better to take advantage of the symbol levels that are defined in the English symbol file.
Description of user facing changes
In NVDA, the locale CLDR dic file inherits symbol levels from the files coming after it, i.e. English symbols and English CLDR. In case the locale symbol file does not define a character's level, this makes it possible to:
Use the level defined for this symbol in the English symbol file, if there is one.
Use "none" (coming from the English CLDR dic file) if the character is not defined in the English symbol file but is defined in CLDR.
Description of development approach
For all languages except English ("en"), generate the cldr.dic file with "-" in the level field, meaning that the level is inherited from previous files.
For the English cldr.dic file, use "none" for the symbol level, as it was already before this PR.
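The development approach in this commit message can be illustrated with a minimal sketch. It is not the code of this repository's build.py, which is not shown here: the function names are hypothetical, and the tab-separated "symbol, description, level" layout of an entry is an assumption made only for the example.

```python
def cldr_level_field(locale: str) -> str:
    """Return the level written for every symbol in a locale's cldr.dic.

    Hypothetical helper: English keeps the explicit "none" level,
    every other locale gets "-" so that the level is inherited.
    """
    return "none" if locale == "en" else "-"


def format_entry(symbol: str, description: str, locale: str) -> str:
    # Assumed entry layout (not confirmed by the source):
    # symbol, description and level separated by tabs.
    return "\t".join((symbol, description, cldr_level_field(locale)))


# Sample entries for an English and a Hindi CLDR file.
english_entry = format_entry("😀", "grinning face", "en")
hindi_entry = format_entry("😀", "grinning face", "hi")
```

With this rule, only the English file states a level of its own; the Hindi entry carries "-" and leaves the decision to the files it inherits from.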
CyrilleB79
added a commit
to CyrilleB79/nvda
that referenced
this pull request
Dec 19, 2022
…evel is set to none. Fixes nvaccess#14417 Update nvda-cldr repository to get the changes implemented in nvaccess/nvda-cldr#4
CyrilleB79
added a commit
to CyrilleB79/nvda
that referenced
this pull request
Dec 19, 2022
… punctuation level Fixes nvaccess#14417 Update nvda-cldr repository to get the changes implemented in nvaccess/nvda-cldr#4
6 tasks
Contributor
Author
Thanks for the merge. Note: due to the GitHub magic word added in the initial description, nvaccess/nvda#14417 was closed a bit too early. Instead, it should be closed when nvaccess/nvda#14459 is merged. @seanbudd I think in the future one should not use GitHub magic words to target PRs that are not in the same repository.
seanbudd
pushed a commit
to nvaccess/nvda
that referenced
this pull request
Dec 21, 2022
… punctuation level (#14459)
Fixes #14417
Summary of the issue:
Hindi has no symbol defined in its symbol file, only a copyright header; it seems that the file was prepared for translation but no actual symbol translation took place. But there is a Hindi CLDR file. Thus the symbol level for symbols such as common punctuation (dot, question mark, etc.) is the one from CLDR, i.e. none. This is not appropriate, and it would be better to take advantage of the symbol levels that are defined in the English symbol file.
Description of user facing changes
CLDR data will be available for languages which had no symbol file (am, et, kk, ne, th, ur) or an empty symbol file (hi). For these languages, since there is no locale symbol file definition, the level defined in the English symbol file will be honoured.
Description of development approach
Update nvda-cldr repository to get the changes implemented in nvaccess/nvda-cldr#4.
6 tasks
seanbudd
pushed a commit
to nvaccess/nvda
that referenced
this pull request
Mar 23, 2023
… punctuation level (2nd attempt) (#14558)
A first PR (#14459) had been merged to fix #14417. Unfortunately an issue was found (see #14473), so it has been reverted in #14477. This PR is a second attempt to fix #14417 without causing #14473. It will remain a draft until I can have more information on #14473 from @OzancanKaratas, as requested in #14473 (comment), or from anyone else able to reproduce.
Link to issue number:
Fixes #14417
Summary of the issue:
Preliminary note for review
Keep in mind the following: in NVDA with CLDR enabled and with no custom user symbol defined, the symbol level for symbol X is defined as follows:
Look at the locale symbol file: if X is defined in this file and a symbol level is defined for X, then this level applies for X. Else, look at the next file.
Look at the locale CLDR file: if X is defined in this file and a symbol level is defined for X, then this level applies for X. Else, look at the next file.
Look at the English symbol file: if X is defined in this file and a symbol level is defined for X, then this level applies for X. Else, look at the next file.
Look at the English CLDR file: if X is defined in this file and a symbol level is defined for X, then this level applies for X. Else, use the default symbol level (don't remember if it is None or All).
Description of the issue
Hindi has no symbol defined in its symbol file, only a copyright header; it seems that the file was prepared for translation but no actual symbol translation took place. But there is a Hindi CLDR file.
Currently, CLDR files are generated with level "None" for all symbols. Usually, in locales with a CLDR file and a normal symbol file, less common characters that are only in CLDR are reported at level None, i.e. whatever the user's punctuation level setting. But common punctuation symbols (dot, question mark, etc.) are added by translators in the locale symbol file, which allows these symbols to be reported at a higher punctuation level.
For Hindi (or any language with no symbol currently translated), all the characters present in the CLDR file are reported at "None" level and above (i.e. at any level), because the level is not redefined in the locale (Hindi) symbol file. In such a situation, using the level of the locale CLDR file (None) is not a good strategy. It would be better to take advantage of the levels defined for the symbols in the English symbol file.
Description of user facing changes
CLDR data will be available for languages which had no symbol file (am, et, kk, ne, th, ur) or an empty symbol file (hi). For these languages, since there is no locale symbol file definition, the level defined in the English symbol file will be honoured.
Description of development approach
Update nvda-cldr repository to get the changes implemented in nvaccess/nvda-cldr#4.
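The lookup order given in the preliminary note of this commit message can be sketched as below. This is only an illustration, not NVDA's actual implementation: resolve_level, the dictionaries and the levels given to the sample symbols are hypothetical, and the default level is a parameter because the note leaves open whether it is None or All.

```python
from typing import Dict, Sequence

INHERIT = "-"  # level field meaning "no level defined here, keep looking"


def resolve_level(symbol: str, files: Sequence[Dict[str, str]], default: str) -> str:
    """Return the level of `symbol` from the first file that defines one.

    `files` must be ordered as in the note: locale symbols, locale CLDR,
    English symbols, English CLDR.
    """
    for definitions in files:
        level = definitions.get(symbol)
        if level is not None and level != INHERIT:
            return level
    return default


# Hypothetical data for a locale with an empty symbol file, such as Hindi.
hindi_symbols: Dict[str, str] = {}
hindi_cldr_before = {".": "none"}  # CLDR generated with "none" for all symbols
hindi_cldr_after = {".": INHERIT}  # CLDR generated with "-" for non-English locales
english_symbols = {".": "some"}    # illustrative level, not taken from the real file
english_cldr = {"😀": "none"}

level_before = resolve_level(
    ".", [hindi_symbols, hindi_cldr_before, english_symbols, english_cldr], default="none"
)
level_after = resolve_level(
    ".", [hindi_symbols, hindi_cldr_after, english_symbols, english_cldr], default="none"
)
```

In this sketch, level_before is the "none" taken from the locale CLDR file, while level_after falls through to the level defined in the English symbol file, which is the behaviour this PR aims for.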
Link to issue number:
Fixes nvaccess/nvda#14417
Summary of the issue:
Hindi has no symbol defined in its symbol file, only a copyright header; it seems that the file was prepared for translation but no actual symbol translation took place.
But there is a Hindi CLDR file. Thus the symbol level for symbols such as common punctuation (dot, question mark, etc.) is the one from CLDR, i.e. none.
This is not appropriate, and it would be better to take advantage of the symbol levels that are defined in the English symbol file.
Description of user facing changes
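In NVDA, the locale CLDR dic file inherits symbol levels from the files coming after it, i.e. English symbols and English CLDR. In case the locale symbol file does not define a character's level, this makes it possible to:
Use the level defined for this symbol in the English symbol file, if there is one.
Use "none" (coming from the English CLDR dic file) if the character is not defined in the English symbol file but is defined in CLDR.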
Description of development approach
For all languages except English ("en"), generate the cldr.dic file with "-" in the level field, meaning that the level is inherited from previous files.
For the English cldr.dic file, use "none" for the symbol level, as it was already before this PR.
Testing strategy:
Manual tests:
cldr.dic files with: py -3.7-32 build.py
Known issues with pull request:
Change log entries:
N/A in this repo.
Code Review Checklist:
It rather applies to nvda's repo, but I keep it here in case these checks are found useful.