Correct warning header to be compliant#23275
Conversation
The warning header used by Elasticsearch for delivering deprecation
warnings has a specific format (RFC 7234, section 5.5). The format
specifies that the warning header should be of the form
warn-code warn-agent warn-text [warn-date]
Here, the warn-code is a three-digit code which communicates various
meanings. The warn-agent is a string used to identify the source of the
warning (either a host:port combination, or some other identifier). The
warn-text is quoted string which conveys the semantic meaning of the
warning. The warn-date is an optional quoted date that can be in a few
different formats.
This commit corrects the warning header within Elasticsearch to follow
this specification. We use the warn-code 299 which means a
"miscellaneous persistent warning." For the warn-agent, we use the
version of Elasticsearch that produced the warning. The warn-text is
unchanged from what we deliver today, but is wrapped in quotes as
specified (this is important as a problem that exists today is that
multiple warnings can not be split by comma to obtain the individual
warnings as the warnings might themselves contain commas). For the
warn-date, we use the RFC 1123 format.
| WARNING_FORMAT, | ||
| formattedMessage, | ||
| DateTimeFormatter.RFC_1123_DATE_TIME.format(ZonedDateTime.now(GMT))); | ||
| assert WARNING_HEADER_PATTERN.matcher(warningValue).matches(); |
There was a problem hiding this comment.
we also test that the format is correct in our test infra, maybe this assertion is not needed then?
There was a problem hiding this comment.
Given that it caught an issue I fixed in 0d5310e, I'd prefer that it remain. The issue was the double quotes in the deprecation message.
There was a problem hiding this comment.
I am curious on how that came up. I guess through an IT test rather than a unit test?
There was a problem hiding this comment.
+1 to keep it as we rely on it for deduping. stronger yet - I think we should verify we extract the value
9ce55b2 to
08e6abd
Compare
bleskes
left a comment
There was a problem hiding this comment.
I think the basics are good (and it's always fun to go read the spec of HTTP related stuff). I think we missed encoding / decoding slashes and quotes in the message text.
|
|
||
| /* | ||
| * RFC7234 specifies the warning format as warn-code <space> warn-agent <space> "warn-text" [<space> "warn-date"]. Here, warn-code is a | ||
| * three-digit number with various standard warn codes specified. The warn code 299 is apt for our purposes as it represents a |
There was a problem hiding this comment.
why did you choose 299 over 199?
There was a problem hiding this comment.
To indicate that this is a persistent warning as opposed to a transient warning (e.g., a warning that could expire after time elapses, or on cache invalidation).
| public static Pattern WARNING_HEADER_PATTERN = Pattern.compile( | ||
| "299 " + // warn code | ||
| "Elasticsearch-\\d+\\.\\d+\\.\\d+(?:-(?:alpha|beta|rc)\\d+)?(?:-SNAPSHOT)?/(?:[a-f0-9]{7}|Unknown) " + // warn agent | ||
| "\"([^\"]*)\" " + // quoted warning value, captured |
There was a problem hiding this comment.
this doesn't account for slash escaping.
| WARNING_FORMAT, | ||
| formattedMessage, | ||
| DateTimeFormatter.RFC_1123_DATE_TIME.format(ZonedDateTime.now(GMT))); | ||
| assert WARNING_HEADER_PATTERN.matcher(warningValue).matches(); |
There was a problem hiding this comment.
+1 to keep it as we rely on it for deduping. stronger yet - I think we should verify we extract the value
| final Iterator<ThreadContext> iterator = threadContexts.iterator(); | ||
|
|
||
| if (iterator.hasNext()) { | ||
| final String formattedMessage = LoggerMessageFormat.format(message, params); |
There was a problem hiding this comment.
we also need to escape quotes and slashes in the text, no?
There was a problem hiding this comment.
I pushed a commit that does this.
|
|
||
| if (existingValues != null) { | ||
| if (existingValues.contains(value)) { | ||
| final Set<String> existingUniqueValues = existingValues.stream().map(uniqueValue).collect(Collectors.toSet()); |
There was a problem hiding this comment.
assert that existingUniqueValues have the same length as existingValues? we should use deduping consistently...
There was a problem hiding this comment.
I pushed a commit that does this.
| } | ||
|
|
||
| final String now = DateTimeFormatter.RFC_1123_DATE_TIME.format(ZonedDateTime.now(ZoneId.of("GMT"))); | ||
| final Function<String, String> deduplicator = v -> { |
There was a problem hiding this comment.
should we make this a static value somewhere in DeprecationLogger to make sure we test it as well?
There was a problem hiding this comment.
I pushed a commit that does this, and added tests.
| final List<String> actualWarnings = threadContext.getResponseHeaders().get(DeprecationLogger.WARNING_HEADER); | ||
| final List<String> actualWarnings = threadContext.getResponseHeaders().get("Warning"); | ||
| for (String msg : expectedWarnings) { | ||
| assertThat(actualWarnings, hasItem(containsString(msg))); |
There was a problem hiding this comment.
we should do the extraction song and dance (+encoding/decoding) here too, no?
There was a problem hiding this comment.
Agree, I pushed a commit that does this.
* master: (54 commits) Keep the pipeline handler queue small initially Do not create String instances in 'Strings' methods accepting StringBuilder (elastic#22907) Tests: fix AwsS3ServiceImplTests Remove abstract InternalMetricsAggregation class (elastic#23326) Add BulkRequest support to High Level Rest client (elastic#23312) Wrap getCredentials() in a doPrivileged() block (elastic#23297) Respect promises on pipelined responses Align REST specs for HEAD requests Remove unnecessary result sorting in SearchPhaseController (elastic#23321) Fix SamplerAggregatorTests to have stable and predictable docIds Tests: Ensure multi node integ tests wait on first node Relocate a comment in HttpPipeliningHandler Add comments to HttpPipeliningHandler [TEST] Fix incorrect test cluster name in cluster health doc tests Build: Change location in zip of license and notice inclusion for plugins (elastic#23316) Script: Fix value of `ctx._now` to be current epoch time in milliseconds (elastic#23175) Build: Rework integ test setup and shutdown to ensure stop runs when desired (elastic#23304) Handle long overflow when adding paths' totals Don't set local node on cluster state used for node join validation (elastic#23311) Ensure that releasing listener is called ...
* master: (26 commits) CLI: Fix prompting for yes/no to handle console returning null (elastic#23320) Tests: Fix reproduce line for packagingTest (elastic#23365) Build: Remove extra copies of netty license (elastic#23361) [TEST] Removes timeout based wait_for_active_shards REST test (elastic#23360) [TEST] increase timeout slightly in wait_for_active_shards test to allow for index creation cluster state update to be processed before ensuring the wait times out Handle snapshot repository's missing index.latest Adding equals/hashCode to MainResponse (elastic#23352) Always restore the ThreadContext for operations delayed due to a block (elastic#23349) Add support for named xcontent parsers to high level REST client (elastic#23328) Add unit tests for ParentToChildAggregator (elastic#23305) Fix after last merge with master and apply last comments [INGEST] Lazy load the geoip databases. disable BWC tests for the highlighters, need a new 5.x build to make it work Expose WordDelimiterGraphTokenFilter (elastic#23327) Test that buildCredentials returns correct clazz (elastic#23334) Add BreakIteratorBoundaryScanner support for FVH (elastic#23248) Prioritize listing index-N blobs over index.latest in reading snapshots (elastic#23333) Test: Fix hdfs test fixture setup on windows delete and index tests can share some part of the code Remove createSampleDocument method and use the sync'ed index method ...
bleskes
left a comment
There was a problem hiding this comment.
LGTM. Left some minor comments that I don't feel require another cycle.
| public static Pattern WARNING_HEADER_PATTERN = Pattern.compile( | ||
| "299 " + // warn code | ||
| "Elasticsearch-\\d+\\.\\d+\\.\\d+(?:-(?:alpha|beta|rc)\\d+)?(?:-SNAPSHOT)?-(?:[a-f0-9]{7}|Unknown) " + // warn agent | ||
| "\"((?:\t| |!|[\\x23-\\x5b]|[\\x5d-\\x7e]|[\\x80-\\xff]|\\\\|\")*)\" " + // quoted warning value, captured |
There was a problem hiding this comment.
I'm not sure this is right (although it doesn't matter since we only parse our own stuff, for which it works). Currently it will match 299 Elasticsearch-6.0.0-alpha1-SNAPSHOT-Unknown """ "Sun, 26 Feb 2017 20:13:14 GMT" . I think |\")*) should be |\\\")*)
There was a problem hiding this comment.
I'm not sure this is right
You're correct that it's not right.
I think
|\")*)should be|\\\")*)
I think it should be |\\\\\")*).
| final String formattedMessage = LoggerMessageFormat.format(message, params); | ||
|
|
||
| final String warningValue = formatWarning(formattedMessage); | ||
| assert WARNING_HEADER_PATTERN.matcher(warningValue).matches(); |
There was a problem hiding this comment.
should we extractWarningValueFromWarningHeader to see we got formattedMessage back?
| * @return the escaped string | ||
| */ | ||
| public static String escape(String s) { | ||
| return s.replaceAll("(\\\\|\")", "\\\\$1"); |
There was a problem hiding this comment.
java really does need literal strings..
| final String s = randomAsciiOfLength(16); | ||
| final String first = DeprecationLogger.formatWarning(s); | ||
| // force the clock forward by a second so the dates on the warning values are different | ||
| Thread.sleep(1000); |
There was a problem hiding this comment.
:( . Maybe add a variant of formatWarning where you can supply the time? alternatively we can make sure what we extract is the value of s which is probably a better test anyway (we want the original, not that we consistently break things). It seems we don't use escaped chars anyway
| assertThat(DeprecationLogger.escape("\\\""), equalTo("\\\\\\\"")); | ||
| assertThat(DeprecationLogger.escape("\"foo\\bar\""),equalTo("\\\"foo\\\\bar\\\"")); | ||
| // test that characters other than '\' and '"' are left unchanged | ||
| final String s = DeprecationLogger.escape(randomAsciiOfLength(16).replace("\\", "").replace("\"", "")); |
There was a problem hiding this comment.
it doesn't seem that randomAsciiOfLength will give back those chars, ever.
There was a problem hiding this comment.
Oh, it's poorly named then. 😦
| } | ||
| } | ||
|
|
||
| protected final void assertSettingDeprecations(Setting... settings) { |
| if (false == expected.isEmpty()) { | ||
| if (failureMessage == null) { | ||
| failureMessage = new StringBuilder(); | ||
| final Set<String> expected = new LinkedHashSet<>(expectedWarningHeaders); |
There was a problem hiding this comment.
we should escape expectedWarningHeaders, or alternatively unescape when we read.
* master: [TEST] make headers case-insensitive when running yaml tests [TEST] randomize request content_type between all of the supported formats [TEST] add support for binary responses to REST tests infra [TEST] don't check exact size in mapper-size yaml test [TEST] move test for binary field to specific test file that sets Content-Type header explicitly [TEST] move filters aggs wrapper query builder rewriting test to integ tests [TEST] create HttpEntity earlier in REST tests [TEST] Remove content type auto-detection while parsing request body in REST tests Factor out filling of TopDocs in SearchPhaseController (elastic#23380) Add info method to High Level Rest client (elastic#23350) Prevent negative `from` parameter in SearchSourceBuilder (elastic#23358) reduce the number of iterations in testPrimaryRelocationWhileIndexing and flush every 5 rollback unneeded change in testNotifyOnDisconnect disable sampling in testNotifyOnDisconnect
The warning header used by Elasticsearch for delivering deprecation
warnings has a specific format (RFC 7234, section 5.5). The format
specifies that the warning header should be of the form
warn-code warn-agent warn-text [warn-date]
Here, the warn-code is a three-digit code which communicates various
meanings. The warn-agent is a string used to identify the source of the
warning (either a host:port combination, or some other identifier). The
warn-text is quoted string which conveys the semantic meaning of the
warning. The warn-date is an optional quoted date that can be in a few
different formats.
This commit corrects the warning header within Elasticsearch to follow
this specification. We use the warn-code 299 which means a
"miscellaneous persistent warning." For the warn-agent, we use the
version of Elasticsearch that produced the warning. The warn-text is
unchanged from what we deliver today, but is wrapped in quotes as
specified (this is important as a problem that exists today is that
multiple warnings can not be split by comma to obtain the individual
warnings as the warnings might themselves contain commas). For the
warn-date, we use the RFC 1123 format.
Relates #23275
The warning header used by Elasticsearch for delivering deprecation
warnings has a specific format (RFC 7234, section 5.5). The format
specifies that the warning header should be of the form
warn-code warn-agent warn-text [warn-date]
Here, the warn-code is a three-digit code which communicates various
meanings. The warn-agent is a string used to identify the source of the
warning (either a host:port combination, or some other identifier). The
warn-text is quoted string which conveys the semantic meaning of the
warning. The warn-date is an optional quoted date that can be in a few
different formats.
This commit corrects the warning header within Elasticsearch to follow
this specification. We use the warn-code 299 which means a
"miscellaneous persistent warning." For the warn-agent, we use the
version of Elasticsearch that produced the warning. The warn-text is
unchanged from what we deliver today, but is wrapped in quotes as
specified (this is important as a problem that exists today is that
multiple warnings can not be split by comma to obtain the individual
warnings as the warnings might themselves contain commas). For the
warn-date, we use the RFC 1123 format.
Relates #23275
|
Thanks for the thorough review @bleskes. |
The warning header used by Elasticsearch for delivering deprecation warnings has a specific format (RFC 7234, section 5.5). The format specifies that the warning header should be of the form
Here, the warn-code is a three-digit code which communicates various meanings. The warn-agent is a string used to identify the source of the warning (either a host:port combination, or some other identifier). The warn-text is quoted string which conveys the semantic meaning of the warning. The warn-date is an optional quoted date that can be in a few different formats.
This commit corrects the warning header within Elasticsearch to follow this specification. We use the warn-code 299 which means a "miscellaneous persistent warning." For the warn-agent, we use the version of Elasticsearch that produced the warning. The warn-text is unchanged from what we deliver today, but is wrapped in quotes as specified (this is important as a problem that exists today is that multiple warnings can not be split by comma to obtain the individual warnings as the warnings might themselves contain commas). For the warn-date, we use the RFC 1123 format.
Closes #22986