Skip to content

add pgettext to InternationalizationExtension#1126

Merged
ThiefMaster merged 1 commit intopallets:masterfrom
imomaliev:pgettext-support
Apr 5, 2021
Merged

add pgettext to InternationalizationExtension#1126
ThiefMaster merged 1 commit intopallets:masterfrom
imomaliev:pgettext-support

Conversation

@imomaliev
Copy link
Copy Markdown
Contributor

@imomaliev imomaliev commented Jan 13, 2020

Fixes #441

@ThiefMaster
Copy link
Copy Markdown
Member

I don't think this is useful without adding npgettext (for pluralization) as well.

Comment thread src/jinja/ext.py Outdated
@imomaliev
Copy link
Copy Markdown
Contributor Author

I don't think this is useful without adding npgettext (for pluralization) as well.

Yes also trans should be extended to accept context option

@ThiefMaster
Copy link
Copy Markdown
Member

Agreed! Good luck finding a decent syntax for this :)

@imomaliev imomaliev changed the title add pgettext to InternalizationExtension add pgettext to InternationalizationExtension Jan 13, 2020
@imomaliev
Copy link
Copy Markdown
Contributor Author

Agreed! Good luck finding a decent syntax for this :)

Thanks))

@imomaliev imomaliev marked this pull request as ready for review May 4, 2020 13:53
@imomaliev
Copy link
Copy Markdown
Contributor Author

imomaliev commented May 4, 2020

@ThiefMaster hi! Finally found time to finish this. This already works.

TODO:

  • discuss current syntax
  • docs
  • deprecation warnings for cases when context is used as variable in trans

Current syntax for trans tag

{% trans context "mycontext" %}Text{% endtrans%}
{% trans context "mycontext" num=3 %}One Text{% pluralize %}{{num}} Texts{% endtrans%}

@imomaliev imomaliev requested a review from ThiefMaster May 4, 2020 14:03
@ThiefMaster ThiefMaster removed their request for review May 28, 2020 09:08
Copy link
Copy Markdown

@wo0dyn wo0dyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @imomaliev. 👍 I'm looking forward to this.

@septatrix
Copy link
Copy Markdown
Contributor

Current syntax for trans tag

{% trans context "mycontext" %}Text{% endtrans%}
{% trans context "mycontext" num=3 %}One Text{% pluralize %}{{num}} Texts{% endtrans%}

I think having the context statement and the corresponding context string as two seperate strings seems problematic. It feels a lot more intuitive to have it be an assignment like context="mycontext". However this would make it impossible to use context as a variable inside the translation block. Is it currently possible to assign an expression to trimmed and use that inside the block?

@imomaliev
Copy link
Copy Markdown
Contributor Author

imomaliev commented Aug 13, 2020

@septatrix

Is it currently possible to assign an expression to trimmed and use that inside the block?

Yes it is allowed

jinja/tests/test_ext.py

Lines 378 to 384 in d754c88

def test_trimmed_varname_trimmed(self):
# unlikely variable name, but when used as a variable
# it should not enable trimming
tmpl = i18n_env.from_string(
"{%- trans trimmed = 'world' %} hello\n {{ trimmed }} {% endtrans -%}"
)
assert tmpl.render() == " hello\n world "

But I don't think it is same for context because trimmed is a flag, context should accept value, it could be made same as num where it depends on position

@remram44
Copy link
Copy Markdown

remram44 commented Aug 21, 2020

Since blocks are used for both comments ({# #}) and pluralization ({% pluralize %}), maybe the context could be done that way as well?

{# Some button #}
{% trans count=n %}
    {{ count }} thing
{% pluralize %}
    {{ count }} things
{% endtrans %}
{% trans %}
    Click me
{% context %}
    In annoying popup
{% endtrans %}

@imomaliev
Copy link
Copy Markdown
Contributor Author

@remram44 Context usually is one word, I feel using blocks for this case would be too verbose.

@remram44
Copy link
Copy Markdown

remram44 commented Aug 23, 2020

Possibly. Just offering an alternative, since adding a third different syntax in the tag (variables key="value", flags trimmed, now key "value") seems worse.

@imomaliev
Copy link
Copy Markdown
Contributor Author

@remram44 This could break some staff, but maybe it will be better to deprecate key="value" syntax and use key "value" instead for num=42 as well and other cases as such?

@septatrix
Copy link
Copy Markdown
Contributor

I agree in that blocks feel very verbose and if the whole section needs to be wrapped inside a context it would just boil down to

{% trans %}{% context "greeting" %}Hello World{% endtrans %}

Adding a third syntax into the mix also feels unnecessary though I think it may be the best we can do. Flags (like trimmed) are obviously gonna stay and deprecating key="value" in favor of key "value" seems very bad as it would just change the syntax to an arguably less readable one without any good reason.

IMO we should either just pass a context like any other variable using context="greeting" which would restrict the usage of context as a variable name inside the block (which you can easily get around) - or go for the third syntax as a space seperated value (context "greeting"). As long as the grammar only permits trimmed and context before any variable assignments I vote for the originally proposed syntax using a space separated assignment.

@remram44
Copy link
Copy Markdown

What about an option that doesn't use the keyword context?

  • {% trans "greeting" var="world" %}Hello {{var}}{% endtrans %}
  • {% trans(greeting) var="world" %}Hello {{var}}{% endtrans %}

Apologies for bike-shedding, anything works for me really (and I'm excited to get context support), I just want to make sure we don't get stuck on a non-optimal syntax.

The problem I see with the context "greeting" is that it is error-prone. Users are likely to get confused and do context="greeting" (which will work but silently ignore the context, causing i18n issues) or not understand why var "value" doesn't (at least there'd be an error there).

@ThiefMaster
Copy link
Copy Markdown
Member

@remram44 I really like the {% trans(context) ... %} syntax! Not sure if the Jinja parser supports it though...

@ThiefMaster
Copy link
Copy Markdown
Member

even though {% context "greeting" %} inside the trans block is not bad either. I would just enforce it to be right at the beginning or end and make it a syntax error anywhere else...

@ThiefMaster
Copy link
Copy Markdown
Member

hm, {% context %} is a bit tricky with pluralization. if we allow it at the end, would we also allow it inside the {% pluralize %} part? (having it in multiple places would have to be an error for sure)

@remram44
Copy link
Copy Markdown

remram44 commented Aug 27, 2020

"Inside" the pluralize part? That part doesn't have an end tag, it just ends at {% endtrans %}. The pluralize part would end when the context start of course...

{# Comment here #}
{% trans count=n %}
    {{ count }} message
{% pluralize %}
    {{ count }} messages
{% context %}
    Context information
{% endtrans %}

Whether we want to allow both orders is up to us (I think it's really weird to have context in between though):

{# Comment here #}
{% trans count=n %}
    {{ count }} message
{% context %}
    Context information
{% pluralize %}
    {{ count }} messages
{% endtrans %}

@ThiefMaster
Copy link
Copy Markdown
Member

ThiefMaster commented Aug 27, 2020

Looking at it again (especially after adding indentation and highlighting to your comment) I think it makes perfect sense to allow both orders. The first one with the context a the very end looks much better though.

FWIW, I was initially thinking about something like this:

{% trans count=n %}
    {% context "context information" %}
    {{ count }} message
{% pluralize %}
    {{ count }} messages
{% endtrans %}

which looks very weird now that i see it.

BTW I think when we use {% context %}context here we should always strip the context string, regardless of the trimmed setting of the trans tag itself or whether the whitespace-stripping/preserving jinja modifiers are being used.

@ThiefMaster
Copy link
Copy Markdown
Member

So I guess these two options would be the default (that would also be documented):

{% trans name='bar' %}
    Hello, {{ name }}!
{% context %}
    Context information
{% endtrans %}
{% trans count=n %}
    {{ count }} message
{% pluralize %}
    {{ count }} messages
{% context %}
    Context information
{% endtrans %}

I'd say for context before pluralize we can silently allow it if it just works, and in case supporting it makes the code more complex we explicitly disallow it.

@septatrix
Copy link
Copy Markdown
Contributor

BTW I think when we use {% context %}context here we should always strip the context string, regardless of the trimmed setting of the trans tag itself or whether the whitespace-stripping/preserving jinja modifiers are being used.

This should be consistent with similar tags like pluralize. I do not know how that handles whitespace - if it does not strip whitespace I think making an exception for a single tag will only lead to a confusing user experience (or make it an explicit setting).

I'd say for context before pluralize we can silently allow it if it just works, and in case supporting it makes the code more complex we explicitly disallow it.

I think having the context in between should probably be discouraged anyways as otherwise people may think it only applies to everything before.

Also I think passing the context information as the content of the block after a {% context %} tag may be confusing and lead to misunderstanding. At first I was thinking why you were using an empty context until I realized the context information is inside the block. It is also easy to think that the context information would be retained in the output as most other things outside of tags are (with the exception of if or similar). Finally this may also lead to people thinking that putting an expression inside the context section may be a good idea.
Therefore I am strongly in favour of trans(foobar) or context "foobar" as these will likely lead to less confusion

@imomaliev
Copy link
Copy Markdown
Contributor Author

imomaliev commented Sep 16, 2020

I like trans(foobar) variant it will help deprecate old syntax and make new unified one. This could be reused then for num and other arguments

@septatrix
Copy link
Copy Markdown
Contributor

I do not think deprecating any other current syntax is currently planned. However I think the syntax with {% context %} context here will be very confusing which is why I would prefer to put the context either inside the context {% context "context here" %}[1] or directly inside the trans tag {% trans("context here") %}[2] (with or without quotes I don't really care). The variant [1] is probably more similar to the current constructs in jinja while the variant [2] fells very intuitive in that it looks a partially applied function which it effectively is. Also the second variant feels like it would be easy to extend should there be any new additions in the future.

@septatrix
Copy link
Copy Markdown
Contributor

Most often the context will only be a single word. It is mostly used for single word translation which have similar meanings and you need to specify which one you want. Sentences as the string to be translated are self describing and sentences as a context are never a good idea nor necessary in my experience.

Here is a command which will search your local message catalogues for messages with a context specifier and the results on my pc:

$ locate "*.po" | xargs grep "msgctxt"
/usr/lib/python3/dist-packages/tornado/test/gettext_translations/fr_FR/LC_MESSAGES/tornado_test.po:msgctxt "law"
/usr/lib/python3/dist-packages/tornado/test/gettext_translations/fr_FR/LC_MESSAGES/tornado_test.po:msgctxt "good"
/usr/lib/python3/dist-packages/tornado/test/gettext_translations/fr_FR/LC_MESSAGES/tornado_test.po:msgctxt "organization"
/usr/lib/python3/dist-packages/tornado/test/gettext_translations/fr_FR/LC_MESSAGES/tornado_test.po:msgctxt "stick"

$ locate "*.po" | xargs grep -A 2 "msgctxt"
/usr/lib/python3/dist-packages/tornado/test/gettext_translations/fr_FR/LC_MESSAGES/tornado_test.po:msgctxt "law"
/usr/lib/python3/dist-packages/tornado/test/gettext_translations/fr_FR/LC_MESSAGES/tornado_test.po-msgid "right"
/usr/lib/python3/dist-packages/tornado/test/gettext_translations/fr_FR/LC_MESSAGES/tornado_test.po-msgstr "le droit"
--
/usr/lib/python3/dist-packages/tornado/test/gettext_translations/fr_FR/LC_MESSAGES/tornado_test.po:msgctxt "good"
/usr/lib/python3/dist-packages/tornado/test/gettext_translations/fr_FR/LC_MESSAGES/tornado_test.po-msgid "right"
/usr/lib/python3/dist-packages/tornado/test/gettext_translations/fr_FR/LC_MESSAGES/tornado_test.po-msgstr "le bien"
--
/usr/lib/python3/dist-packages/tornado/test/gettext_translations/fr_FR/LC_MESSAGES/tornado_test.po:msgctxt "organization"
/usr/lib/python3/dist-packages/tornado/test/gettext_translations/fr_FR/LC_MESSAGES/tornado_test.po-msgid "club"
/usr/lib/python3/dist-packages/tornado/test/gettext_translations/fr_FR/LC_MESSAGES/tornado_test.po-msgid_plural "clubs"
--
/usr/lib/python3/dist-packages/tornado/test/gettext_translations/fr_FR/LC_MESSAGES/tornado_test.po:msgctxt "stick"
/usr/lib/python3/dist-packages/tornado/test/gettext_translations/fr_FR/LC_MESSAGES/tornado_test.po-msgid "club"
/usr/lib/python3/dist-packages/tornado/test/gettext_translations/fr_FR/LC_MESSAGES/tornado_test.po-msgid_plural "clubs"

@ThiefMaster
Copy link
Copy Markdown
Member

I don't think a library/framework is a good example - I'd be more interested in full-scale web applications. Does transifex have a good way to quickly scrape many translations of open source projects from their site?

@septatrix
Copy link
Copy Markdown
Contributor

septatrix commented Sep 17, 2020

That was from 1563 catalogues however most of them were empty. I just got a command which unpacks the .mo files and scans them and got a LOT more results...

find /usr/share/locale/de/LC_MESSAGES -name "*.mo" -exec msgunfmt "{}" \; | grep msgctxt
# for less noise:
find /usr/share/locale/de/LC_MESSAGES -name "*.mo" -exec msgunfmt "{}" \; | grep msgctxt | cut -d " " -f 2- | uniq

@remram44
Copy link
Copy Markdown

remram44 commented Oct 3, 2020

Another interesting data point: Django uses a context keyword with no equal sign:

{% blocktranslate with name=user.username context "greeting" %}Hi {{ name }}{% endblocktranslate %}

@pallets pallets deleted a comment from tvb Oct 7, 2020
@mcejp
Copy link
Copy Markdown

mcejp commented Feb 21, 2021

It seems that the patch to add pgettext & npgettext is straightforward, while the best template syntax is more contentious.

Perhaps users would benefit from splitting these two into separate issues and resolving at least the first one?

@mireq
Copy link
Copy Markdown

mireq commented Mar 18, 2021

@mcejp I think that would be the best solution so far. The inability to use context is one of the biggest problems of jinja2.
@imomaliev Would you make a new pull request with only pgettext / npgettext syntax? I think it could be accepted without any problems.

@imomaliev
Copy link
Copy Markdown
Contributor Author

@ThiefMaster Hi, should we separate this PR as @mcejp suggested? If this is ok I could cleanup this PR or open new one with just pgettext and npgettext

@ThiefMaster
Copy link
Copy Markdown
Member

I think that would be a good idea. Usually it's the short strings (likely without any html etc in them) that need contexts the most, so having them available in general would already help a lot there.

@imomaliev
Copy link
Copy Markdown
Contributor Author

imomaliev commented Mar 20, 2021

Updated. This PR now contains only short syntax. docs are failing because sphinx depends on jinja https://github.com/sphinx-doc/sphinx/blob/7ecf6b88aa5ddaed552527d2ef60f1bd35e98ddc/sphinx/jinja2glue.py#L189

@imomaliev
Copy link
Copy Markdown
Contributor Author

cc @ThiefMaster

Comment thread src/jinja2/ext.py Outdated
@imomaliev
Copy link
Copy Markdown
Contributor Author

@ThiefMaster Also added docs

Comment thread tests/test_ext.py Outdated
@imomaliev
Copy link
Copy Markdown
Contributor Author

@ThiefMaster updated

@davidism davidism added this to the 3.0.0 milestone Mar 27, 2021
Copy link
Copy Markdown
Member

@ThiefMaster ThiefMaster left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides some minor documentation changes I think this PR is ready to be merged.

Code looks fine and I tested usage + extraction in my application without any problems.

Comment thread docs/extensions.rst Outdated
Comment thread docs/extensions.rst Outdated
Comment thread docs/extensions.rst
@ThiefMaster
Copy link
Copy Markdown
Member

Oh, one last thing I forgot: Could you please add a changelog entry to CHANGES.rst?

@imomaliev
Copy link
Copy Markdown
Contributor Author

imomaliev commented Apr 5, 2021

@ThiefMaster Added. I am not sure about wording. Let me know if there is a better way describe this.

Comment thread CHANGES.rst Outdated
Comment thread CHANGES.rst Outdated
@davidism
Copy link
Copy Markdown
Member

davidism commented Apr 5, 2021

I'm doing a pass to squash commits and check docs/formatting, don't push or merge yet.

@imomaliev
Copy link
Copy Markdown
Contributor Author

@davidism Note that I updated last commit

@davidism
Copy link
Copy Markdown
Member

davidism commented Apr 5, 2021

Squashed and force pushed. If you need to make further changes:

git fetch
git reset --hard FETCH_HEAD

@ThiefMaster ThiefMaster merged commit 43d4228 into pallets:master Apr 5, 2021
@github-actions github-actions Bot locked as resolved and limited conversation to collaborators Apr 20, 2021
@imomaliev imomaliev deleted the pgettext-support branch July 15, 2022 11:43
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add pgettext function to i18n extension

10 participants