L10N (possible release blocker): messages with plural forms and/or formatted string literals with translated messages do not translate correctly under some languages #12417

@josephsl

Description

Hi,

I think this is a release blocker (@LeonarddeR, any thoughts on what I describe below?):

Background:

GNU Gettext defines the ngettext function for choosing between the singular and plural forms of a message based on the value stored in a variable. The signature is:

gettext.ngettext(singular, plural, value)

For example, in English:

gettext.ngettext("test", "tests", testValue)

Then:

  • testValue == 1: test
  • testValue == 2: tests

Whereas in languages such as Korean, which use a single form for both singular and plural:

  • testValue == 1: test
  • testValue == 2: test

So far, translation is working.
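
As a minimal sketch of the English behavior described above, the snippet below calls ngettext on Python's `gettext.NullTranslations`, which applies the untranslated fallback (singular when the value is 1, plural otherwise). Using `NullTranslations` rather than an NVDA catalog is an assumption made so the result does not depend on installed .mo files, and `pick_form` is a hypothetical helper name.

```python
import gettext

# NullTranslations has no catalog, so ngettext falls back to the
# English-style rule: msgid1 when n == 1, msgid2 otherwise.
_untranslated = gettext.NullTranslations()


def pick_form(test_value: int) -> str:
    """Return the singular or plural source string for test_value."""
    return _untranslated.ngettext("test", "tests", test_value)


for value in (1, 2):
    print(value, pick_form(value))
```

With a real Korean catalog loaded, both calls would return the same single translated form.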

Problems:

Three major problems:

  1. Translated text is combined with the value in a formatted string literal to build the actual output. If the literal puts a space between the value and the translated string, the text is shown incorrectly in languages that expect no space between them (e.g. Korean). For example, given f"{number} {message}" with number being 2 and message being "tests", the correct output is "2 tests" in English and "2test" in Korean; because the formatted string literal is output as is, Korean gets "2 test", which goes against spacing rules in Korean text.
  2. Some plural text may contain string interpolations. This is a problem for languages with no distinction between singular and plural forms (e.g. Korean), because the translator cannot cover the singular and plural source strings with one translation. The ideal solution is to translate only the plural form. An example is "1 test" (singular) versus "{tests} tests" (plural) in English, whereas in Korean both texts can be translated as "{tests} test".
  3. Some languages define different plural rules as opposed to common ones such as singular versus plural (English) or single text (Korean). A good example is Russian where up to three text forms can be defined to describe singular and plural forms.
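
To make the third point concrete, here is a rough sketch of how the plural form index differs by language. The rules mirror the Plural-Forms expressions commonly listed for these languages in the GNU gettext manual; `plural_index` is a hypothetical helper written for illustration and is not part of NVDA or of the gettext module.

```python
def plural_index(lang: str, n: int) -> int:
    """Return which translated form (0-based) a catalog would select for n."""
    if lang == "ko":
        # nplurals=1; plural=0 (one form for every value)
        return 0
    if lang == "en":
        # nplurals=2; plural=(n != 1)
        return int(n != 1)
    if lang == "ru":
        # nplurals=3: values ending in 1 (not 11), ending in 2-4 (not 12-14), the rest
        if n % 10 == 1 and n % 100 != 11:
            return 0
        if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
            return 1
        return 2
    raise ValueError(f"no rule defined in this sketch for {lang!r}")


for n in (1, 2, 5, 11, 21):
    print(n, plural_index("en", n), plural_index("ko", n), plural_index("ru", n))
```

Code that assumes exactly two forms (singular and plural) cannot represent either the Korean or the Russian case, which is why the choice has to be left to the catalog.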

Steps to reproduce:

  1. Attempt to translate the latest alpha build (recently merged into the beta branch). First, use scons pot to generate the catalog template, then apply it to a po file for a language other than English (try Korean).
  2. Attempt to translate text (for example, "category" and "categories").

Actual behavior:

Text cannot be translated into some languages (e.g. Korean).

Expected behavior:

NVDA announces translated messages.

System configuration

NVDA installed/portable/running from source:

Installed

NVDA version:

alpha-22762,1007f9e2

Windows version:

Windows 10 Version 21H1 (build 19043)

Name and version of other software in use when reproducing the issue:

Poedit 1.8.9

Other information about your system:

NVDA development workstation, also involved in translating messages into Korean.

Other questions

Does the issue still occur after restarting your computer?

Yes

Have you tried any other versions of NVDA? If so, please report their behaviors.

Works correctly in 2020.4, as there is a one-to-one mapping between messages and translations.

If add-ons are disabled, is your problem still occurring?

Yes (this also affects add-ons that call gettext.ngettext).

Does the issue still occur after you run the COM Registration Fixing Tool in NVDA's tools menu?

Not applicable

Impact and possible mitigations:

This may become a release blocker - people who do not speak English may feel new messages were not translated when in fact they were translated.

One possible mitigation is reverting the gettext.ngettext call used when announcing message categories. Another is directing speakers of affected languages to apply specific rules for plural texts, even though those rules might contradict what they already know. Also, if gettext.ngettext is to be used going forward, I advise making the actual output messages translatable to avoid spacing issues.
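
A rough sketch of what making the actual output message translatable could look like: the placeholder lives inside the strings passed to ngettext, so each translation controls both the word form and the spacing. `SingleFormDemo` is a hypothetical stand-in for a catalog with one form and no space (using "test" as the placeholder word, as in the examples above), and `announce` is an illustrative name, not NVDA code.

```python
import gettext


class SingleFormDemo(gettext.NullTranslations):
    """Hypothetical catalog stand-in: one form, no space before the word."""

    def ngettext(self, msgid1, msgid2, n):
        return "{count}test"


def announce(translations, count: int) -> str:
    # The whole message, placeholder included, goes through ngettext;
    # the value is substituted only after the form has been chosen.
    template = translations.ngettext("{count} test", "{count} tests", count)
    return template.format(count=count)


print(announce(gettext.NullTranslations(), 1))
print(announce(gettext.NullTranslations(), 2))
print(announce(SingleFormDemo(), 2))
```

Compared with f"{number} {message}", nothing outside the translated string decides whether a space appears between the value and the text.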

Thanks.

Metadata

Assignees

No one assigned

Labels

component/i18n (existing localisations or internationalisation)
