Skip to content

i18n: Traverse through parent paths to find translator comment#1068

Merged
aduth merged 3 commits intomasterfrom
update/string-extraction-tolerance
Jun 12, 2017
Merged

i18n: Traverse through parent paths to find translator comment#1068
aduth merged 3 commits intomasterfrom
update/string-extraction-tolerance

Conversation

@aduth
Copy link
Copy Markdown
Member

@aduth aduth commented Jun 7, 2017

Related: #984 (comment)

This pull request seeks to improve translation string extraction to traverse through parent paths in the AST to find any leading comment which occurs on the same or previous line.

Testing instructions:

Ensure tests pass:

npm test

Run npm run gettext-strings and check that strings observed at #984 (comment) are included in languages/gutenberg.pot output.

cc @njpanderson

@aduth aduth added the Internationalization (i18n) Issues or PRs related to internationalization efforts label Jun 7, 2017
@njpanderson
Copy link
Copy Markdown
Contributor

FWIW This worked for me and loading the POT file in Poedit displayed the relevant comments alongside the translation template for those strings.

return (
line >= _originalNodeLine - 1 &&
line <= _originalNodeLine &&
REGEXP_TRANSLATOR_COMMENT.test( commentNode.value )
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not eliminate the duplicate test and match against this regex?

const commentMatch = find( ... => {
    return (
        ... &&
        commentNode.value.match( REGEXP_TRANSLATOR_COMMENT )
    )
} );

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An earlier iteration did this, but alas I encountered that find returns the entry (i.e. entire node) for which the callback returns true, not the return value itself. I might poke around for alternatives though, or if you have any suggestions.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right, because find takes a predicate function rather than a map-like callback. 👍

}

it( 'should not return translator comment on same line but after call expression', () => {
const comment = getCommentFromString( '__( \'Hello world\' ); // translators: Greeting' );
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a potentially useful variation, or is the intent to match existing behavior in WordPress?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, this was based on a cursory scan of PHP source files always including the translator comment before the translation occurs. I'm not sure if there's in-fact a rule against it, or a limitation of the tooling. I'd have to check.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So after scouring the documentation, the code itself, and its tests, I've come to the conclusions that:

  • The comment must always occur before the translation call (see $latest_comment in find_function_calls token iteration)
  • The PHP parsing is very non-particular about when that comment occurs. In fact, it looks like it could be many many lines prior the translate call, even if unrelated statements occur in-between. I'm not sure we care to preserve this behavior. It helps that the AST parsing is at least very liberal with considering whitespace in leadingComments.
  • In the PHP implementation the translators: is case-insensitive, which we should probably preserve here. I'll make that change shortly.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, we're already using the i flag for the RegExp matching. Should be fine there.

@aduth
Copy link
Copy Markdown
Member Author

aduth commented Jun 9, 2017

It helps that the AST parsing is at least very liberal with considering whitespace in leadingComments.

I guess this doesn't really help, as we test the comment to be occurring on the same or previous line. We might be able to drop this condition for iterating leadingComments, but would need something for traversing up through parent paths (so we don't e.g. consider leading comments on the wrapping function). Seems like it'd be nicer to receive an array of tokens like in the PHP implementation so we know the order in which comments occur, rather than traversing up the tree.

@nylen
Copy link
Copy Markdown
Member

nylen commented Jun 9, 2017

The PHP parsing is very non-particular about when that comment occurs. In fact, it looks like it could be many many lines prior the translate call, even if unrelated statements occur in-between. I'm not sure we care to preserve this behavior.

I agree.

It helps that the AST parsing is at least very liberal with considering whitespace in leadingComments. [...] I guess this doesn't really help, as we test the comment to be occurring on the same or previous line.

I think this is fine - it seems a bit strange at the very least to put a blank line in between the translator comment and the (directly relevant) translation call.

@aduth aduth merged commit c6d745d into master Jun 12, 2017
@aduth aduth deleted the update/string-extraction-tolerance branch June 12, 2017 13:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Internationalization (i18n) Issues or PRs related to internationalization efforts

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants