SQL: Introduce an async querying mode for SQL#73267
SQL: Introduce an async querying mode for SQL#73267bpintea merged 4 commits intoelastic:feat/sql-asyncfrom
Conversation
This adds an async query mode to SQL. It (re)uses the same request and response async-specific EQL object parameters. Also similar to EQL, the running search task can have its state monitored and canceled and its results stored and deleted, with intermediary responses not supported (the entire result is available once search finished). The initial query and subsequent pagination/scrolling requests will both be started in the async mode.
|
@elasticmachine ok to test |
|
Note: text formatting not yet supported, that will be applied with a subsequent PR. |
|
Pinging @elastic/es-ql (Team:QL) |
imotov
left a comment
There was a problem hiding this comment.
LGTM in general. I think we can move some classes that we moved to xpack.ql.async all the way to xpack.core.async since ML expressed desire to also use them.
| "cluster:monitor/xpack/searchable_snapshots/stats", | ||
| "cluster:monitor/xpack/security/saml/metadata", | ||
| "cluster:monitor/xpack/spatial/stats", | ||
| SQL_ASYNC_GET_STATUS_ACTION_NAME, |
There was a problem hiding this comment.
The main reason is that the list is generated when the test is first introduced, i.e. the test reports all missing names when this list is empty. Also a few other reasons:
- Many names are not available to this test, e.g. autoscaling, enrich, grok processor, reindex
- Some names are computed, e.g. all the xpack usage actions and shard level actions (
[s]) and quite a few others - I find a sorted list of consistent literal strings easier to read and useful to scan through
- Sometimes it is useful to repeat the literal string in tests which can itself act as a test.
There was a problem hiding this comment.
So, it sounds like we have several good enough reasons to continue using strings here instead of constants, and we should probably continue doing it in this case.
There was a problem hiding this comment.
Replaced the var with the string (and added the var location as a comment).
| */ | ||
|
|
||
| package org.elasticsearch.xpack.eql.async; | ||
| package org.elasticsearch.xpack.ql.async; |
There was a problem hiding this comment.
I would keep this generic and move it into org.elasticsearch.xpack.core.async so ML could use it. The same for StoredAsyncTask and
There was a problem hiding this comment.
Thanks, @imotov!
Will wait one more review and depending on the amount of changes required will move the classes either part of this, or of a next PR that I have to open anyways.
There was a problem hiding this comment.
The two classes have been moved.
| taskManager.cancelTaskAndDescendants(this, reason, true, ActionListener.wrap(runnable)); | ||
| } | ||
|
|
||
| public static QlStatusResponse getStatusResponse(StoredAsyncTask<?> asyncTask) { |
There was a problem hiding this comment.
I would introduce it one level above so we can keep this task generic and usable by ML.
matriv
left a comment
There was a problem hiding this comment.
LGTM, Nice work! Left a couple of minor comments.
| Long startTimeMillis, | ||
| long expirationTimeMillis, | ||
| RestStatus completionStatus) { | ||
| public interface AsyncStatus { |
There was a problem hiding this comment.
Imho, I would extract this to its own file.
There was a problem hiding this comment.
I find the interface tightly bound to the outer class (i.e. no independent applicability), that's why I was also using it as a prefix wherever the interface is used. But if you think it'd be better practice to extract it, I can still rename it and do it.
(Thanks for this long-PR review!)
There was a problem hiding this comment.
I'll leave it up to you, not insisting, it's just that from the name AsyncStatus it seems like a generic iface and could also belong to a separate file.
| testCase("user2", "user1"); | ||
| } | ||
|
|
||
| private void testCase(String user, String other) throws Exception { |
There was a problem hiding this comment.
nit, to make it more clear:
| private void testCase(String user, String other) throws Exception { | |
| private void testCase(String user, String otherUser) throws Exception { |
There was a problem hiding this comment.
Generally I've tried to keep as close to the original EQL tests as possible (since many of them can't be deduplicated EQL-SQL). But this is small, I'll change it.
- move StoredAsyncResponse and StoredAsyncTask classes from ql to core.async packages. - rename function parameter.
Replace constant with hardcoded string.
|
jenkins re-test this please |
astefan
left a comment
There was a problem hiding this comment.
LGTM with some minor comments.
| builder.field("id", id); | ||
| builder.field("is_running", isRunning); | ||
| builder.field("is_partial", isPartial); | ||
| if (startTimeMillis != null) { // start time is available only for a running eql search |
| builder.timeField("start_time_in_millis", "start_time", startTimeMillis); | ||
| } | ||
| builder.timeField("expiration_time_in_millis", "expiration_time", expirationTimeMillis); | ||
| if (isRunning == false) { // completion status is available only for a completed eql search |
There was a problem hiding this comment.
eql -> eql/sql?
There are, also, other eql mentions in the file.
There was a problem hiding this comment.
Thanks, will fix them with the following PR.
| l = new ScrollActionListener(listener, client, cfg, output, query); | ||
| } | ||
|
|
||
| if (cfg.task() != null && cfg.task().isCancelled()) { |
There was a problem hiding this comment.
Why checking for a cancelled task here (before the actual search and after the listener flavor is created) and not at the start of the method? Does it matter what kind of listener has its onFailure method called for a TaskCancelledException?
There was a problem hiding this comment.
There are a couple of places where the task is checked against being canceled within SQL, both placed before "leaving" SQL:
- before resolving the index;
- here, before running the search.
It makes sense to do it there because (1) compared to the distributed part, the SQL code will run quick; and (2) there's anyways a race between the cancelation and task execution -- one could add more checkpoints throughout SQL, but it won't practically improve much (like reactivity to the cancelation).
This adds an async query mode to SQL.
It (re)uses the same request and response async-specific EQL object
parameters.
Also similar to EQL, the running search task can have its state
monitored and canceled and its results stored and deleted, with
intermediary responses not supported (the entire result is available
once search finished).
The initial query and subsequent pagination/scrolling requests will both
be started in the async mode.