Remove lenient URL parameter parsing#20722
Remove lenient URL parameter parsing#20722jasontedor merged 17 commits intoelastic:masterfrom jasontedor:strict-rest-params
Conversation
Today when parsing a request, Elasticsearch silently ignores incorrect (including parameters with typos) or unused parameters. This is bad as it leads to requests having unintended behavior (e.g., if a user hits the _analyze API and misspell the "tokenizer" then Elasticsearch will just use the standard analyzer, completely against intentions). This commit removes lenient URL parameter parsing. The strategy is simple: when a request is handled and a parameter is touched, we mark it as such. Before the request is actually executed, we check to ensure that all parameters have been consumed. If there are remaining parameters yet to be consumed, we fail the request with a list of the unconsumed parameters. An exception has to be made for parameters that format the response (as opposed to controlling the request); for this case, handlers are able to provide a list of parameters that should be excluded from tripping the unconsumed parameters check because those parameters will be used in formatting the response. Additionally, some inconsistencies between the parameters in the code and in the docs are corrected.
|
|
||
| // validate unconsumed params, but we must exclude params used to format the response | ||
| // we copy because we do not want to modify the request params | ||
| final Set<String> unconsumedParams = new HashSet<>(request.unconsumedParams()); |
There was a problem hiding this comment.
Maybe this should be a list instead?
| return params; | ||
| } | ||
|
|
||
| Set<String> unconsumedParams() { |
There was a problem hiding this comment.
If you document that this returns a copy then there should be no need to copy a second time.
| */ | ||
| protected abstract Runnable doRequest(RestRequest request, RestChannel channel, NodeClient client) throws Exception; | ||
|
|
||
| protected Set<String> responseParams() { |
There was a problem hiding this comment.
Sure, I can do that. There was a comment on the Javadoc for BaseRestHandler#doRequest in this regard, but I agree that it would be nice to have it here too.
| // error_trace cannot be used when we disable detailed errors | ||
| if (channel.detailedErrorsEnabled() == false && request.paramAsBoolean("error_trace", false)) { | ||
| // we consume the error_trace parameter first to ensure that it is always consumed | ||
| if (request.paramAsBoolean("error_trace", false) && channel.detailedErrorsEnabled() == false) { |
There was a problem hiding this comment.
Nice! Short circuiting if now causes problems with unconsumed request parameters!
| builder.startObject("indices"); | ||
| for (Map.Entry<String, Map<String, FieldStats>> entry1 : | ||
| response.getIndicesMergedFieldStats().entrySet()) { | ||
| response.getIndicesMergedFieldStats().entrySet()) { |
| import java.util.Arrays; | ||
| import java.util.Collections; | ||
| import java.util.HashSet; | ||
| import java.util.Set; |
There was a problem hiding this comment.
Are these leftovers from a previous iteration?
| validateQueryRequest.query( | ||
| RestActions.getQueryContent(RestActions.getRestContent(request), indicesQueriesRegistry, parseFieldMatcher)); | ||
| } catch (ParsingException e) { | ||
| return () -> handleException(channel, validateQueryRequest, e.getDetailedMessage()); |
There was a problem hiding this comment.
I think it should be fine to just throw the exception here. So long as we haven't gone async this should be safe. We do it in other places. I guess the only difference is whether we report errors in the query first or extra parameters first which isn't really a big deal either way to me.
This does get me to thinking - maybe we should return a Consumer<RestChannel> instead of a Runanble? Maybe doRequest shouldn't have the variable it needs to send a response so it is never tempted?
Also the name doRequest. Maybe prepare or something? I'm much better at saying "I don't like that name" than I am at making a better name.
There was a problem hiding this comment.
To rename doRequest to prepareRequest I pushed 5a2e83f. I'm still thinking about the consumer suggestion.
There was a problem hiding this comment.
Regarding just letting the exception bubble up, I don't think so because we format the response nicely; see RestValidateQueryAction#buildErrorResponse.
| private static final Set<String> RESPONSE_PARAMS; | ||
|
|
||
| static { | ||
| final Set<String> responseParams = new HashSet<>(Arrays.asList("local", "health")); |
There was a problem hiding this comment.
This is like the one place where I miss Guava. It was really nice and fluent to make these.
There was a problem hiding this comment.
Java 9 will fix this: http://openjdk.java.net/jeps/269
| public Runnable doRequest(final RestRequest request, final RestChannel channel, final NodeClient client) throws Exception { | ||
| boolean helpWanted = request.paramAsBoolean("help", false); | ||
| if (helpWanted) { | ||
| Table table = getTableWithHeader(request); |
There was a problem hiding this comment.
It feels like this if block out to be inside the Runnable.
| channel.sendResponse(RestTable.buildResponse(table, channel)); | ||
| } catch (Exception e) { | ||
| try { | ||
| channel.sendResponse(new BytesRestResponse(channel, e)); |
There was a problem hiding this comment.
Maybe it is ok to just through from here? That might mean declaring a new functional interface?
This commit adds unit tests for strict URL parameter parsing, including test cases for response parameters.
This commit adds Javadocs for BaseRequestHandler#responseParams to clarify the intent of classes overriding this method.
This commit refactors RestRequest#unconsumedParams to return a list, for simplicity, and clarifies that callers are free to modify the returned list.
This commit removes some unused imports from o.e.r.a.a.i.RestGetSettingsAction.
This commit addresses a formatting issue in o.e.r.a.a.i.RestGetIndicesAction that arose from a method's parameters being indented at the same level as the method's body.
This commit renames the method BaseRequestHandler#doRequest to BRH#prepareRequest as this method is merely preparing the request for execution.
This commit refactors BaseRequestHandler#prepareRequest to return a RestChannelConsumer instead of a runnable. This removes the need for REST request handlers to take a RestChannel parameter. Additionally, the checked exception is tightened to just be an IOException instead of the top-level Exception.
s1monw
left a comment
There was a problem hiding this comment.
it looks great, one thing that I am concerned about is that if we push this into 5.x we will cause things to fail on the client side when users upgrade. The other option is to push it only to 6.0 but then it will take way to long to make it to the user. I am very torn since it's a big change to push it 5.0 either. :/
| final Set<String> responseParams = responseParams(); | ||
| final Iterator<String> it = unconsumedParams.iterator(); | ||
|
|
||
| // this has to use an iterator lest a ConcurrentModificationException will arise |
There was a problem hiding this comment.
not sure I get this comment maybe we shouldn't modify the params at all in that way but create a copy set of the unconsumed ones?
* master: Optimized LatLon sorting does not work in the descending order. IndicesAliasesRequest should not implement CompositeIndicesRequest (#20726) Upgrade microbenchmarks to JMH 1.15 Build: Use more general archivesBaseName setting, to ensure pom uses matching file name Update favicon (#20727) Skip prereleases for restore bwc tests too Skip prereleases in static bwc tests
This commit fixes an issue in the validate query handler where an exception during parsing the query would still trigger attempting to validate the query. These two cases should be mutually exclusive.
| action.accept(channel); | ||
| } | ||
|
|
||
| @FunctionalInterface |
There was a problem hiding this comment.
Nit: I think this belongs under the javadoc.
There was a problem hiding this comment.
I thought this would violate the Javadoc linter? I will verify.
There was a problem hiding this comment.
It doesn't make the linter mad, so I pushed a1f065e.
| return handlePost(request, client); | ||
| } | ||
| return () -> {}; | ||
| return channel -> {}; |
There was a problem hiding this comment.
Maybe better to just throw an UnsupportedOperationException here.
There was a problem hiding this comment.
Rather an IllegalArgumentException; UnsupportedOperationException means "there are no parameters you can pass here to make this method execute" but IllegalArgumentException says "there are parameters you can pass here to make this method execute, just not the parameters that you gave me". I get your point though, and will push a commit. (Also, this is in should-never-happen territory, since the handler will not be invoked without the method being GET or POST.)
| throw new UncheckedIOException(e); | ||
| } | ||
| private void handleException( | ||
| final ValidateQueryRequest validateQueryRequest, |
There was a problem hiding this comment.
I pushed 309fca4. Well, I didn't indent like you wanted, just made the problem go away. 😇
|
I'm as torn as @s1monw on versioning. I suppose getting this into 5.1 violates semver because garbage on the URL would be identified. I'd love to have it in 5.0 but I think it violates the spirit of code freeze to try and do it. It is super tempting though. |
This commit simplifies the computation of the unconsumed parameters remaining when handling a REST request.
This commit moves a FunctionalInterface annotation under the Javadoc on BaseRestHandler$RestChannelConsumer because it looks better that way.
The /_upgrade REST handler accepts GET and POST requests, but would silently ignore invocations for other HTTP verbs. This commit modifies this handler so that an illegal argument exception is thrown instead.
This commit fixes some obnoxious indentation on the method parameters for RestValidateQueryAction#handleException.
|
After conversation with @nik9000 and @s1monw, we have decided the leniency here is big enough a bug that we want to ship this code as early as possible. The options on the table were:
We felt that 5.1.0 is not a viable option, it is too breaking of a change during a minor release, and waiting until 6.0.0 is waiting too long to fix this bug so we will ship this in 5.0.0. |
Today when parsing a request, Elasticsearch silently ignores incorrect (including parameters with typos) or unused parameters. This is bad as it leads to requests having unintended behavior (e.g., if a user hits the _analyze API and misspell the "tokenizer" then Elasticsearch will just use the standard analyzer, completely against intentions). This commit removes lenient URL parameter parsing. The strategy is simple: when a request is handled and a parameter is touched, we mark it as such. Before the request is actually executed, we check to ensure that all parameters have been consumed. If there are remaining parameters yet to be consumed, we fail the request with a list of the unconsumed parameters. An exception has to be made for parameters that format the response (as opposed to controlling the request); for this case, handlers are able to provide a list of parameters that should be excluded from tripping the unconsumed parameters check because those parameters will be used in formatting the response. Additionally, some inconsistencies between the parameters in the code and in the docs are corrected. Relates #20722
Today when parsing a request, Elasticsearch silently ignores incorrect (including parameters with typos) or unused parameters. This is bad as it leads to requests having unintended behavior (e.g., if a user hits the _analyze API and misspell the "tokenizer" then Elasticsearch will just use the standard analyzer, completely against intentions). This commit removes lenient URL parameter parsing. The strategy is simple: when a request is handled and a parameter is touched, we mark it as such. Before the request is actually executed, we check to ensure that all parameters have been consumed. If there are remaining parameters yet to be consumed, we fail the request with a list of the unconsumed parameters. An exception has to be made for parameters that format the response (as opposed to controlling the request); for this case, handlers are able to provide a list of parameters that should be excluded from tripping the unconsumed parameters check because those parameters will be used in formatting the response. Additionally, some inconsistencies between the parameters in the code and in the docs are corrected. Relates #20722
Today when parsing a request, Elasticsearch silently ignores incorrect
(including parameters with typos) or unused parameters. This is bad as
it leads to requests having unintended behavior (e.g., if a user hits
the _analyze API and misspell the "tokenizer" then Elasticsearch will
just use the standard analyzer, completely against intentions).
This commit removes lenient URL parameter parsing. The strategy is
simple: when a request is handled and a parameter is touched, we mark it
as such. Before the request is actually executed, we check to ensure
that all parameters have been consumed. If there are remaining
parameters yet to be consumed, we fail the request with a list of the
unconsumed parameters. An exception has to be made for parameters that
format the response (as opposed to controlling the request); for this
case, handlers are able to provide a list of parameters that should be
excluded from tripping the unconsumed parameters check because those
parameters will be used in formatting the response.
Additionally, some inconsistencies between the parameters in the code
and in the docs are corrected.
Closes #14719