[BEAM-2958] Adding a top-level user agent string to PipelineOptions#3915
[BEAM-2958] Adding a top-level user agent string to PipelineOptions#3915youngoli wants to merge 8 commits intoapache:masterfrom
Conversation
|
R: @lukecwik |
|
retest this please |
lukecwik
left a comment
There was a problem hiding this comment.
Minor comments, LGTM after comments and passing Jenkins maven clean install.
Do you plan to make the BQIO/Dataflow Runner changes in separate PRs?
| */ | ||
| @Description("A user agent string describing the pipeline to external services. " | ||
| + "The default name is \"[name]/[version]\"" | ||
| + " where name and version are properties of the Beam release.") |
There was a problem hiding this comment.
of the Beam release. -> of the Apache Beam release.
|
|
||
| /** | ||
| * Returns a user agent string constructed from {@link ReleaseInfo#getName()} and | ||
| * {@link ReleaseInfo#getVersion()}, in the format "[name]/[version]". |
There was a problem hiding this comment.
"[name]/[version]" -> {@code [name]/[version]}
| @Override | ||
| public String create(PipelineOptions options) { | ||
| ReleaseInfo info = ReleaseInfo.getReleaseInfo(); | ||
| return |
There was a problem hiding this comment.
nit: whitespace around return looks awkward
For Bigtable I modified it to set the user agent in BigtableOptions from the user agent in PipelineOptions, rather than directly from ReleaseInfo. For DataflowRunner I overwrite the default user agent when setting up the DataflowOptions, using the name and version from DataflowRunnerInfo. Currently that should result in the same user agent as the default because DataflowRunnerInfo just inherits the name and version from ReleaseInfo. But if that name and version ever changes for Dataflow it will reflect in the user agent.
| @Override | ||
| public String create(PipelineOptions options) { | ||
| ReleaseInfo info = ReleaseInfo.getReleaseInfo(); | ||
| return String.format("%s/%s", info.getName(), info.getVersion()).replace(" ", "_"); |
There was a problem hiding this comment.
Is there a reason why we replace " " with "_"?
There was a problem hiding this comment.
That's how BigtableIO originally had it when reading from ReleaseInfo. Why it did that replacement I'm not sure.
There was a problem hiding this comment.
It turns out they are following https://www.ietf.org/rfc/rfc2616.txt where:
user agent = 1*(product | comment)
product = token ["/" product-version]
product-version = token
Add to the user agent comment that this field should represent a user agent as per RFC2616 and give the BNF form that I describe above. A token is limited to any characters not including a separator.
Accidentally used a @code tag in a string instead of javadoc.
|
retest this please |
| * product = token ["/" product-version] | ||
| * product-version = token | ||
| * </code></pre> | ||
| * Where a token is a series of without a separator. |
There was a problem hiding this comment.
Awkward sentence: Where a token is a series of without a separator.
There was a problem hiding this comment.
Whoops, forgot a word.
| * </code></pre> | ||
| * Where a token is a series of without a separator. | ||
| * | ||
| * <p>The string defaults to {@code [name]/[version]} based on the properties of the Beam release. |
|
|
||
| /** | ||
| * Returns a user agent string constructed from {@link ReleaseInfo#getName()} and | ||
| * {@link ReleaseInfo#getVersion()}, in the format "[name]/[version]". |
|
Failed in known Gearpump test issue. LGTM, merging |
Follow this checklist to help us incorporate your contribution quickly and easily:
[BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replaceBEAM-XXXwith the appropriate JIRA issue.mvn clean verifyto make sure basic checks pass. A more thorough check will be performed on your pull request automatically.Adding a string to PipelineOptions that can be used for setting a user agent string with information about the distribution of Beam being used. That string can be sent to external services too.
This PR is probably not completely done yet. While the actual user agent string is in and working, more might have to be done to replace BigTableIO and DataflowRunner usages of user agents to use this new one. I'm still working on that right now.