
In Hindi, NVDA will not read anymore punctuation symbols whatever the punctuation level#14459

Merged
seanbudd merged 3 commits into
nvaccess:masterfrom
CyrilleB79:updateCLDR
Dec 21, 2022
Conversation

@CyrilleB79

@CyrilleB79 CyrilleB79 commented Dec 19, 2022

Contributor

Link to issue number:

Fixes #14417

Summary of the issue:

Hindi has no symbols defined in its symbol file, only the copyright header; it seems the file was prepared for translation but no actual symbol translation took place.
But there is a Hindi CLDR file. Thus the symbol level for symbols such as common punctuation (dot, question mark, etc.) is the one from CLDR, i.e. none.
This is not appropriate; it would be better to take advantage of the symbol levels that are defined in the English symbol file.

Description of user facing changes

CLDR data will be available for languages which had no symbol file (am, et, kk, ne, th, ur) or an empty symbol file (hi). For these languages, since there is no locale symbol file definition, the level defined in the English symbol file will be honoured.

Description of development approach

Update nvda-cldr repository to get the changes implemented in nvaccess/nvda-cldr#4.

Testing strategy:

  • With the document from issue #14417 (NVDA reads punctuation symbols in Indian languages even if punctuation level is set to none), tested that punctuation is not reported when the symbol level is None
  • For Hindi (hi) and Amharic (am) with eSpeak, tested the following punctuation and emoji: _ ) , 🐍
    • The emoji 🐍 is reported with a non-English word even at level None
    • The symbols _ ) , are reported with non-English words only at higher punctuation levels
      Note: I do not speak these languages, so I just checked that the reported word is not English; in Hindi however, "comma" is pronounced the same way as in English but written with Hindi characters.
      Note 2: Hindi is a language with an existing but empty symbol file; Amharic is a language with no symbol file.

Known issues with pull request:

This is the first time I have made a PR updating a submodule. Also, the expected usage of the nvda-cldr repository seems to be to use the commit generated by the GitHub action in that repo, but I have not found documentation confirming this (it would be worth adding if missing).

So please double-check the process that I have used.

Cc @OzancanKaratas for your experience with CLDR updates.

Change log entries:

Bug fixes

  1. Replace:
    Emojis should now be reported in more languages. (#14433)
    by:
    Emojis should now be reported in more languages. (#14433, #14459)
  2. Add:
    In Hindi, NVDA will not read anymore punctuation symbols whatever the punctuation level (#14417)

Code Review Checklist:

  • Pull Request description:
    • description is up to date
    • change log entries
  • Testing:
    • Unit tests
    • System (end to end) tests
    • Manual testing
  • API is compatible with existing add-ons.
  • Documentation:
    • User Documentation
    • Developer / Technical Documentation
    • Context sensitive help for GUI changes
  • UX of all users considered:
    • Speech
    • Braille
    • Low Vision
    • Different web browsers
    • Localization in other languages / culture than English
  • Security precautions taken.

… punctuation level

Fixes nvaccess#14417

Update nvda-cldr repository to get the changes implemented in nvaccess/nvda-cldr#4

@OzancanKaratas OzancanKaratas left a comment

Collaborator


Thanks @CyrilleB79, this looks good to me!

@CyrilleB79 CyrilleB79 marked this pull request as ready for review December 19, 2022 22:22
@CyrilleB79 CyrilleB79 requested a review from a team as a code owner December 19, 2022 22:22
@seanbudd

Member

This is the first time I have made a PR updating a submodule. Also, the expected usage of the nvda-cldr repository seems to be to use the commit generated by the GitHub action in that repo, but I have not found documentation confirming this (it would be worth adding if missing).

It seems to be missing, though I don't know if there was ever documentation. A cldr.md file, similar to espeak.md, would be helpful to have in the include repo. A new issue should probably be created to track this.

@seanbudd seanbudd merged commit 8746c6f into nvaccess:master Dec 21, 2022
@nvaccessAuto nvaccessAuto added this to the 2023.1 milestone Dec 21, 2022
@OzancanKaratas

OzancanKaratas commented Dec 24, 2022

Collaborator

This pull request may have fixed the issue in Hindi, but it caused an issue in other languages such as Turkish. If the user does not review using object navigation, NVDA cannot read any of the characters specified in CLDR.

I suggest reverting this pull request. @seanbudd, @CyrilleB79: What are your thoughts?

@CyrilleB79

Contributor Author

This pull request may have fixed the issue in Hindi, but it caused an issue in other languages such as Turkish. If the user does not review using object navigation, NVDA cannot read any of the characters specified in CLDR.

I suggest reverting this pull request. @seanbudd, @CyrilleB79: What are your thoughts?

Thanks for reporting this. Would you mind opening a specific issue with detailed steps so that we can try to reproduce?
If confirmed on our side and unless a quick and obvious fix is available, the best solution will probably be to revert this until we fix it again.

seanbudd added a commit that referenced this pull request Dec 27, 2022
…ever the punctuation level (#14459)"

This reverts commit 8746c6f.
seanbudd pushed a commit that referenced this pull request Mar 23, 2023
… punctuation level (2nd attempt) (#14558)

A first PR (#14459) had been merged to fix #14417. Unfortunately an issue was found (see #14473) so it has been reverted in #14477.

This PR is a second attempt to fix #14417 without causing #14473. It will remain a draft until I can have more information on #14473 from @OzancanKaratas, as requested in #14473 (comment), or from anyone else able to reproduce.

Link to issue number:
Fixes #14417

Summary of the issue:
Preliminary note for review
Keep in mind the following: in NVDA with CLDR enabled and with no custom user symbol defined, the symbol level for symbol X is determined as follows:

  1. Look at the locale symbol file:
    If X is defined in this file and a symbol level is defined for X, then this level applies for X. Else, look at the next file.
  2. Look at the locale CLDR file:
    If X is defined in this file and a symbol level is defined for X, then this level applies for X. Else, look at the next file.
  3. Look at the English symbol file:
    If X is defined in this file and a symbol level is defined for X, then this level applies for X. Else, look at the next file.
  4. Look at the English CLDR file:
    If X is defined in this file and a symbol level is defined for X, then this level applies for X. Else, use the default symbol level (don't remember if it is None or All).
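
The lookup order above can be sketched roughly as follows. This is only an illustration and not NVDA's actual code: `resolve_level`, `SymbolFile`, `DEFAULT_LEVEL` and all the sample data are hypothetical names and values invented for this example, and each "file" is modelled as a plain mapping from a symbol to its level (or to `None` when the symbol is listed without a level).

```python
from typing import Mapping, Optional, Sequence

# Hypothetical model: one mapping per file, symbol -> level name,
# or None when the symbol is present but no level is defined for it.
SymbolFile = Mapping[str, Optional[str]]

# Placeholder only: the description above does not say which level is the default.
DEFAULT_LEVEL = "default"


def resolve_level(symbol: str, files_in_priority_order: Sequence[SymbolFile]) -> str:
    """Return the level from the first file that defines one for the symbol."""
    for symbol_file in files_in_priority_order:
        level = symbol_file.get(symbol)
        if level is not None:
            return level
    return DEFAULT_LEVEL


# Illustrative data, not copied from the real files.
hindi_symbols: SymbolFile = {}                      # empty locale symbol file
hindi_cldr: SymbolFile = {",": "none", "🐍": "none"}
english_symbols: SymbolFile = {",": "all"}
english_cldr: SymbolFile = {"🐍": "none"}

chain = [hindi_symbols, hindi_cldr, english_symbols, english_cldr]

# The locale CLDR level wins because the locale symbol file is empty.
comma_level = resolve_level(",", chain)

# If the locale CLDR entry carried no level, the English symbol file would apply.
chain_without_cldr_level = [hindi_symbols, {",": None}, english_symbols, english_cldr]
comma_level_fallback = resolve_level(",", chain_without_cldr_level)
```

With this sample data, `comma_level` is `"none"` while `comma_level_fallback` is `"all"`, which mirrors the difference between the behaviour reported in #14417 and the behaviour this PR aims for.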
Description of the issue
Hindi has no symbols defined in its symbol file, only the copyright header; it seems the file was prepared for translation but no actual symbol translation took place. But there is a Hindi CLDR file.

Currently, CLDR files are generated with level "None" for all symbols.

Usually, in locales with a CLDR file and a normal symbol file, less common characters that are only in CLDR are reported at level None, i.e. whatever the punctuation level setting of the user. But common punctuation symbols (dot, question mark, etc.) are added by translators to the locale symbol file, which allows these symbols to be reported only at a higher punctuation level.

For Hindi (or any language with no symbols currently translated), all the characters present in the CLDR file are reported at level "None" and above (i.e. at any level), because the level is not redefined in the locale (Hindi) symbol file.

In such a situation, using the level of the locale CLDR (None) is not a good strategy. It would be better to take advantage of the levels defined for the symbols in the English symbol file.
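
To make "reported at level None and above" concrete, here is a minimal sketch of the comparison involved. It assumes that a symbol is spoken whenever the user's punctuation level is at least the symbol's level; the `Level` enum, its numeric values and `is_spoken` are hypothetical and are not taken from NVDA's source.

```python
from enum import IntEnum


class Level(IntEnum):
    # Hypothetical ordering; the numeric values are illustrative only.
    NONE = 0
    SOME = 1
    MOST = 2
    ALL = 3


def is_spoken(symbol_level: Level, user_level: Level) -> bool:
    """Assumed rule: speak the symbol if the user's level reaches the symbol's level."""
    return user_level >= symbol_level


# A symbol whose level is NONE is spoken whatever the user's setting.
cldr_symbol_at_user_none = is_spoken(Level.NONE, Level.NONE)

# A symbol whose level is ALL stays silent at lower user settings.
comma_at_user_some = is_spoken(Level.ALL, Level.SOME)
comma_at_user_all = is_spoken(Level.ALL, Level.ALL)
```

Under this assumption, a comma that inherits level None from the locale CLDR file is always spoken, whereas a comma that takes a higher level from the English symbol file is spoken only when the user raises the punctuation level.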

Description of user facing changes
CLDR data will be available for languages which had no symbol file (am, et, kk, ne, th, ur) or an empty symbol file (hi). For these languages, since there is no locale symbol file definition, the level defined in the English symbol file will be honoured.

Description of development approach
Update nvda-cldr repository to get the changes implemented in nvaccess/nvda-cldr#4.

Development

Successfully merging this pull request may close these issues.

NVDA reads punctuation symbols in Indian languages even if punctuation level is set to none
