SQL: Introduce an async querying mode for SQL by bpintea · Pull Request #73267 · elastic/elasticsearch

bpintea · 2021-05-20T10:29:23Z

This adds an async query mode to SQL.
It (re)uses the same request and response async-specific EQL object
parameters.

Also similar to EQL, the running search task can have its state
monitored and canceled, and its results stored and deleted. Intermediary
responses are not supported: the entire result becomes available only
once the search has finished.

The initial query and subsequent pagination/scrolling requests will both
be started in the async mode.
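As a rough illustration of the lifecycle described above (submit, monitor, fetch the full result once finished, delete), here is a minimal in-memory sketch. Every name in it (`AsyncQueryLifecycleSketch`, `submit`, `isRunning`, `result`, `await`, `delete`) is hypothetical and chosen for illustration; it is not the API this PR adds, and it uses a plain `CompletableFuture` where the PR uses Elasticsearch tasks and a results index.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch: models only the lifecycle, not the real SQL plugin.
final class AsyncQueryLifecycleSketch {
    private final Map<String, CompletableFuture<String>> tasks = new ConcurrentHashMap<>();

    /** Starts the query in the background and returns an id to refer to it later. */
    String submit(Supplier<String> query) {
        String id = UUID.randomUUID().toString();
        tasks.put(id, CompletableFuture.supplyAsync(query));
        return id;
    }

    /** Status check: true while the query has not finished. */
    boolean isRunning(String id) {
        return !require(id).isDone();
    }

    /** No intermediary responses: empty until the whole result is available. */
    Optional<String> result(String id) {
        CompletableFuture<String> future = require(id);
        if (future.isDone() && !future.isCompletedExceptionally()) {
            return Optional.of(future.join());
        }
        return Optional.empty();
    }

    /** Blocks until the query has finished and returns its result. */
    String await(String id) {
        return require(id).join();
    }

    /** Forgets the stored result; marks the future cancelled if still running. */
    void delete(String id) {
        CompletableFuture<String> future = tasks.remove(id);
        if (future != null) {
            // Note: this does not interrupt a supplier that is already executing.
            future.cancel(true);
        }
    }

    private CompletableFuture<String> require(String id) {
        CompletableFuture<String> future = tasks.get(id);
        if (future == null) {
            throw new IllegalArgumentException("unknown async id: " + id);
        }
        return future;
    }
}
```

The point of the sketch is the contract rather than the mechanics: a caller holding the id can poll the status, but gets either nothing or the complete result, never a partial one.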

bpintea · 2021-05-20T10:37:28Z

@elasticmachine ok to test

bpintea · 2021-05-20T10:41:31Z

Note: text formatting is not yet supported; it will be added in a subsequent PR.

elasticmachine · 2021-05-20T10:43:25Z

Pinging @elastic/es-ql (Team:QL)

imotov

LGTM in general. I think we can move some classes that we moved to xpack.ql.async all the way to xpack.core.async since ML expressed desire to also use them.

imotov · 2021-05-26T19:20:30Z

...vileges-tests/src/javaRestTest/java/org/elasticsearch/xpack/security/operator/Constants.java

        "cluster:monitor/xpack/searchable_snapshots/stats",
        "cluster:monitor/xpack/security/saml/metadata",
        "cluster:monitor/xpack/spatial/stats",
+        SQL_ASYNC_GET_STATUS_ACTION_NAME,


I am really curious why we don't use constants for all other actions here. @ywangd or @tvernum could you clarify?

The main reason is that the list is generated when the test is first introduced, i.e. the test reports all missing names when this list is empty. Also a few other reasons:

Many names are not available to this test, e.g. autoscaling, enrich, grok processor, reindex

Some names are computed, e.g. all the xpack usage actions and shard level actions ([s]) and quite a few others

I find a sorted list of consistent literal strings easier to read and useful to scan through

Sometimes it is useful to repeat the literal string in tests which can itself act as a test.

So, it sounds like we have several good enough reasons to continue using strings here instead of constants, and we should probably continue doing it in this case.

Replaced the var with the string (and added the var location as a comment).
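The "test reports all missing names" behaviour described in this thread can be sketched as a simple set difference between the registered action names and the literal list. This is only an illustration of the idea, not the actual test: `ActionNameListSketch` and `missingFrom` are hypothetical names, and the action names used in the usage note are the ones visible in the diff above.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a "literal list vs. registered actions" check.
final class ActionNameListSketch {

    /**
     * Returns, in sorted order, every registered action name that is absent
     * from the literal list. With an empty literal list this is simply the
     * full sorted set of registered names, which is how such a list could be
     * generated when the test is first introduced.
     */
    static Set<String> missingFrom(List<String> literalList, Set<String> registered) {
        Set<String> missing = new TreeSet<>(registered);
        missing.removeAll(literalList);
        return missing;
    }
}
```

For example, calling `missingFrom(List.of(), Set.of("cluster:monitor/xpack/spatial/stats", "cluster:monitor/xpack/security/saml/metadata"))` yields both names in sorted order, ready to be pasted into the list as literals.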

imotov · 2021-05-26T19:33:23Z

x-pack/plugin/ql/src/main/java/org/elasticsearch/xpack/ql/async/StoredAsyncResponse.java

 */

-package org.elasticsearch.xpack.eql.async;
+package org.elasticsearch.xpack.ql.async;


I would keep this generic and move it into org.elasticsearch.xpack.core.async so ML could use it. The same for StoredAsyncTask and

Thanks, @imotov!
I'll wait for one more review and, depending on the amount of changes required, will move the classes either as part of this PR or as part of a follow-up PR that I have to open anyway.

The two classes have been moved.

imotov · 2021-05-26T19:33:54Z

x-pack/plugin/ql/src/main/java/org/elasticsearch/xpack/ql/async/StoredAsyncTask.java

        taskManager.cancelTaskAndDescendants(this, reason, true, ActionListener.wrap(runnable));
    }
+
+    public static QlStatusResponse getStatusResponse(StoredAsyncTask<?> asyncTask) {


I would introduce it one level above so we can keep this task generic and usable by ML.

matriv

LGTM, Nice work! Left a couple of minor comments.

matriv · 2021-06-02T10:07:25Z

x-pack/plugin/ql/src/main/java/org/elasticsearch/xpack/ql/async/QlStatusResponse.java

-            Long startTimeMillis,
-            long expirationTimeMillis,
-            RestStatus completionStatus) {
+    public interface AsyncStatus {


Imho, I would extract this to its own file.

I find the interface tightly bound to the outer class (i.e. no independent applicability), that's why I was also using it as a prefix wherever the interface is used. But if you think it'd be better practice to extract it, I can still rename it and do it.

(Thanks for this long-PR review!)

I'll leave it up to you, not insisting, it's just that from the name AsyncStatus it seems like a generic iface and could also belong to a separate file.

matriv · 2021-06-02T10:11:35Z

...r/security/src/test/java/org/elasticsearch/xpack/sql/qa/security/RestSqlSecurityAsyncIT.java

+        testCase("user2", "user1");
+    }
+
+    private void testCase(String user, String other) throws Exception {


nit, to make it clearer:

Suggested change

-    private void testCase(String user, String other) throws Exception {
+    private void testCase(String user, String otherUser) throws Exception {

Generally I've tried to keep as close to the original EQL tests as possible (since many of them can't be deduplicated EQL-SQL). But this is small, I'll change it.

mark-vieira · 2021-06-02T20:23:41Z

jenkins re-test this please

astefan

LGTM with some minor comments.

astefan · 2021-06-02T16:36:20Z

x-pack/plugin/ql/src/main/java/org/elasticsearch/xpack/ql/async/QlStatusResponse.java

+            builder.field("id", id);
+            builder.field("is_running", isRunning);
+            builder.field("is_partial", isPartial);
+            if (startTimeMillis != null) { // start time is available only for a running eql search


eql -> eql/sql?

astefan · 2021-06-02T16:36:29Z

x-pack/plugin/ql/src/main/java/org/elasticsearch/xpack/ql/async/QlStatusResponse.java

+                builder.timeField("start_time_in_millis", "start_time", startTimeMillis);
+            }
+            builder.timeField("expiration_time_in_millis", "expiration_time", expirationTimeMillis);
+            if (isRunning == false) { // completion status is available only for a completed eql search


eql -> eql/sql?
There are, also, other eql mentions in the file.

Thanks, will fix them with the following PR.
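To make the conditional fields in the two snippets above easier to follow, here is a minimal sketch that builds the same status payload as a plain map. It is not the real `QlStatusResponse`: the class and method names are hypothetical, the `completion_status` key is an assumption (the diff shows only the comment, not the field name), and the human-readable `start_time` / `expiration_time` variants that `timeField` can emit are left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; field presence follows the comments in the diff above.
final class StatusResponseSketch {

    static Map<String, Object> toMap(String id,
                                     boolean isRunning,
                                     boolean isPartial,
                                     Long startTimeMillis,
                                     long expirationTimeMillis,
                                     Integer completionStatus) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("id", id);
        fields.put("is_running", isRunning);
        fields.put("is_partial", isPartial);
        if (startTimeMillis != null) {
            // Start time is available only for a running search.
            fields.put("start_time_in_millis", startTimeMillis);
        }
        fields.put("expiration_time_in_millis", expirationTimeMillis);
        if (isRunning == false) {
            // Completion status is available only for a completed search.
            // "completion_status" is an assumed key name.
            fields.put("completion_status", completionStatus);
        }
        return fields;
    }
}
```

Nothing in this logic is specific to EQL or SQL, which is consistent with the reviewer's request to update the comments.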

astefan · 2021-06-02T19:26:57Z

x-pack/plugin/sql/src/main/java/org/elasticsearch/xpack/sql/execution/search/Querier.java

            l = new ScrollActionListener(listener, client, cfg, output, query);
        }

+        if (cfg.task() != null && cfg.task().isCancelled()) {


Why check for a cancelled task here (before the actual search and after the listener flavor is created) and not at the start of the method? Does it matter which kind of listener has its onFailure method called for a TaskCancelledException?

There are a couple of places where the task is checked against being canceled within SQL, both placed before "leaving" SQL:

before resolving the index;

here, before running the search.

It makes sense to do it there because (1) compared to the distributed part, the SQL code runs quickly; and (2) there is a race between the cancelation and the task execution anyway: one could add more checkpoints throughout SQL, but that wouldn't improve much in practice (for instance, reactivity to the cancelation).
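The two checkpoints described in this reply can be sketched as follows. This is a simplified model, not the code from `Querier.java`: `Task`, `Listener`, `TaskCancelledException` and `execute` are hypothetical stand-ins defined locally, and the "remote" steps are just recorded in a list so the effect of each checkpoint is visible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-ins; none of these are the real Elasticsearch classes.
final class CancellationCheckpointSketch {

    /** Minimal listener contract with the usual onResponse/onFailure pair. */
    interface Listener<T> {
        void onResponse(T value);
        void onFailure(Exception e);
    }

    /** A task that only knows whether it has been cancelled. */
    static final class Task {
        private final AtomicBoolean cancelled = new AtomicBoolean();
        void cancel() { cancelled.set(true); }
        boolean isCancelled() { return cancelled.get(); }
    }

    static final class TaskCancelledException extends RuntimeException {
        TaskCancelledException(String message) { super(message); }
    }

    /** Records which "remote" steps were actually started. */
    final List<String> remoteCalls = new ArrayList<>();

    /** Fails the listener and returns true if the task was cancelled. */
    private static boolean failIfCancelled(Task task, Listener<?> listener) {
        if (task != null && task.isCancelled()) {
            listener.onFailure(new TaskCancelledException("cancelled task"));
            return true;
        }
        return false;
    }

    void execute(Task task, String sql, Listener<String> listener) {
        // Checkpoint 1: before resolving the index.
        if (failIfCancelled(task, listener)) {
            return;
        }
        remoteCalls.add("resolve-index");

        // Local planning would happen here; it is assumed to be fast.

        // Checkpoint 2: before running the search.
        if (failIfCancelled(task, listener)) {
            return;
        }
        remoteCalls.add("search");
        listener.onResponse("rows for: " + sql);
    }
}
```

A cancelation that arrives between the two checkpoints is still caught before the search is started; one that arrives after the second checkpoint has to be handled by the search itself, which is the race mentioned above.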

bpintea mentioned this pull request May 20, 2021

SQL: Introduce an async querying mode for SQL #72166

Closed

bpintea marked this pull request as ready for review May 20, 2021 10:38

bpintea added the :Analytics/SQL SQL querying label May 20, 2021

elasticmachine added the Team:QL (Deprecated) Meta label for query languages team label May 20, 2021

bpintea requested review from astefan, costin, imotov and matriv May 20, 2021 10:43

imotov approved these changes May 26, 2021

matriv reviewed Jun 2, 2021

matriv approved these changes Jun 2, 2021

bpintea added 3 commits June 2, 2021 20:34

Merge remote-tracking branch 'upstream/feat/sql-async' into feat/ql_a…

61ad073

…sync

Address review comments

3002e4e

- move StoredAsyncResponse and StoredAsyncTask classes from ql to core.async packages. - rename function parameter.

Address review comment

76f9dd1

Replace constant with hardcoded string.

astefan approved these changes Jun 2, 2021

bpintea merged commit 45a4ec8 into elastic:feat/sql-async Jun 3, 2021

bpintea deleted the feat/ql_async branch June 3, 2021 16:02


7 participants