Skip to content

[BEAM-2958] Adding a top-level user agent string to PipelineOptions#3915

Closed
youngoli wants to merge 8 commits intoapache:masterfrom
youngoli:bugfix-2958
Closed

[BEAM-2958] Adding a top-level user agent string to PipelineOptions#3915
youngoli wants to merge 8 commits intoapache:masterfrom
youngoli:bugfix-2958

Conversation

@youngoli
Copy link
Copy Markdown
Contributor

Follow this checklist to help us incorporate your contribution quickly and easily:

  • Make sure there is a JIRA issue filed for the change (usually before you start working on it). Trivial changes like typos do not require a JIRA issue. Your pull request should address just this issue, without pulling in other changes.
  • Each commit in the pull request should have a meaningful subject line and body.
  • Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue.
  • Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • Run mvn clean verify to make sure basic checks pass. A more thorough check will be performed on your pull request automatically.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

Adding a string to PipelineOptions that can be used for setting a user agent string with information about the distribution of Beam being used. That string can be sent to external services too.

This PR is probably not completely done yet. While the actual user agent string is in and working, more might have to be done to replace BigTableIO and DataflowRunner usages of user agents to use this new one. I'm still working on that right now.

@youngoli
Copy link
Copy Markdown
Contributor Author

R: @lukecwik

@lukecwik
Copy link
Copy Markdown
Member

retest this please

Copy link
Copy Markdown
Member

@lukecwik lukecwik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments, LGTM after comments and passing Jenkins maven clean install.

Do you plan to make the BQIO/Dataflow Runner changes in separate PRs?

*/
@Description("A user agent string describing the pipeline to external services. "
+ "The default name is \"[name]/[version]\""
+ " where name and version are properties of the Beam release.")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

of the Beam release. -> of the Apache Beam release.


/**
* Returns a user agent string constructed from {@link ReleaseInfo#getName()} and
* {@link ReleaseInfo#getVersion()}, in the format "[name]/[version]".
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"[name]/[version]" -> {@code [name]/[version]}

Copy link
Copy Markdown
Member

@lukecwik lukecwik Sep 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"[name]/[version]" -> {@code [name]/[version]}

@Override
public String create(PipelineOptions options) {
ReleaseInfo info = ReleaseInfo.getReleaseInfo();
return
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: whitespace around return looks awkward

@coveralls
Copy link
Copy Markdown

Coverage Status

Coverage decreased (-0.0003%) to 69.572% when pulling d716696 on youngoli:bugfix-2958 into da531b7 on apache:master.

For Bigtable I modified it to set the user agent in BigtableOptions
from the user agent in PipelineOptions, rather than directly from
ReleaseInfo.

For DataflowRunner I overwrite the default user agent when setting up
the DataflowOptions, using the name and version from DataflowRunnerInfo.
Currently that should result in the same user agent as the default
because DataflowRunnerInfo just inherits the name and version from
ReleaseInfo. But if that name and version ever changes for Dataflow
it will reflect in the user agent.
@Override
public String create(PipelineOptions options) {
ReleaseInfo info = ReleaseInfo.getReleaseInfo();
return String.format("%s/%s", info.getName(), info.getVersion()).replace(" ", "_");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why we replace " " with "_"?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's how BigtableIO originally had it when reading from ReleaseInfo. Why it did that replacement I'm not sure.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out they are following https://www.ietf.org/rfc/rfc2616.txt where:

user agent         = 1*(product | comment)
product            = token ["/" product-version]
product-version    = token

Add to the user agent comment that this field should represent a user agent as per RFC2616 and give the BNF form that I describe above. A token is limited to any characters not including a separator.

@youngoli
Copy link
Copy Markdown
Contributor Author

retest this please

Copy link
Copy Markdown
Member

@lukecwik lukecwik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments.

* product = token ["/" product-version]
* product-version = token
* </code></pre>
* Where a token is a series of without a separator.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awkward sentence: Where a token is a series of without a separator.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, forgot a word.

* </code></pre>
* Where a token is a series of without a separator.
*
* <p>The string defaults to {@code [name]/[version]} based on the properties of the Beam release.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beam -> Apache Beam


/**
* Returns a user agent string constructed from {@link ReleaseInfo#getName()} and
* {@link ReleaseInfo#getVersion()}, in the format "[name]/[version]".
Copy link
Copy Markdown
Member

@lukecwik lukecwik Sep 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"[name]/[version]" -> {@code [name]/[version]}

@lukecwik
Copy link
Copy Markdown
Member

lukecwik commented Oct 3, 2017

Failed in known Gearpump test issue. LGTM, merging

@asfgit asfgit closed this in ba5bee6 Oct 3, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants