Skip to content

LOGICAL_ERROR: Next task callback is not set for query#84753

Closed
CheSema wants to merge 23 commits intomasterfrom
chesema-task-callback
Closed

LOGICAL_ERROR: Next task callback is not set for query#84753
CheSema wants to merge 23 commits intomasterfrom
chesema-task-callback

Conversation

@CheSema
Copy link
Copy Markdown
Member

@CheSema CheSema commented Jul 30, 2025

a new test 03579_create_table_populate_from_s3 catches
DB::Exception: Next task callback is not set for query .
Stateless tests (amd_binary, old analyzer, s3 storage, DatabaseReplicated, parallel)

It turned out that all *Cluster functions were broken with replicated DDL.

/// QueryKind is overused.
/// It is used to distinguish queries that are executed on worker nodes in the following cases:
/// 1. When a query is executed on a worker node in a distributed query with parallel *MergeTree replicas
/// 2. When a query is executed on a worker node in *Cluster functions like S3Cluster, etc.
/// 3. When a query is executed on a worker node in ON CLUSTER operations
/// As a result, it is difficult or impossible to run a query with parallel *MergeTree replicas that uses *Cluster functions,
/// because the query is marked as QueryKind::SECONDARY_QUERY but it is not clear for what purpose: parallel replicas or *Cluster functions.
/// In contrast to ON CLUSTER queries, when queries are executed on worker nodes in DDL replication, they are not marked as QueryKind::SECONDARY_QUERY.
/// Related to that case, there was a bug when a query in DDL replication had *Cluster functions; *Cluster functions expect a distributed file iterator which is not set in DDL replication.
/// When a query is marked as QueryKind::SECONDARY_QUERY, parsing and execution of the query is different.
/// Not all optimizations are applied, not all checks are made. The query is considered as already well-prepared and safe to execute.

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fixing exception LOGICAL_ERROR: Next task callback is not set for query

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Jul 30, 2025

Workflow [PR], commit [c585f41]

Summary:

job_name test_name status info comment
Stateless tests (amd_binary, old analyzer, s3 storage, DatabaseReplicated, sequential) failure
01154_move_partition_long FAIL
Bugfix validation (functional tests) failure
Stress test (amd_ubsan) failure
Server died FAIL
Hung check failed, possible deadlock found (see hung_check.log) FAIL
Killed by signal (in clickhouse-server.log) FAIL
Fatal message in clickhouse-server.log (see fatal_messages.txt) FAIL
Killed by signal (output files) FAIL
Found signal in gdb.log FAIL

@clickhouse-gh clickhouse-gh bot added the pr-not-for-changelog This PR should not be mentioned in the changelog label Jul 30, 2025
@CheSema CheSema changed the title add a test when create table as select from s3 function LOGICAL_ERROR: Next task callback is not set for query Jul 31, 2025
@CheSema
Copy link
Copy Markdown
Member Author

CheSema commented Jul 31, 2025

It seems to be relevant to the feature #70659

@clickhouse-gh clickhouse-gh bot added pr-bugfix Pull request with bugfix, not backported by default and removed pr-not-for-changelog This PR should not be mentioned in the changelog labels Jul 31, 2025
@thevar1able thevar1able self-assigned this Jul 31, 2025
{
auto query_context = DDLTaskBase::makeQueryContext(from_context, zookeeper);
query_context->setQueryKind(ClientInfo::QueryKind::SECONDARY_QUERY);
// Do not set here SECONDARY_QUERY
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is the root cause of the issue.

@CheSema
Copy link
Copy Markdown
Member Author

CheSema commented Aug 1, 2025

Integration tests (asan, old analyzer, 5/6)required
flaky tests test_restore_replica
#84855

@CheSema CheSema marked this pull request as ready for review August 1, 2025 21:08
@devcrafter
Copy link
Copy Markdown
Member

Let's clarify the context of the problem somewhere in the comments, - the conflict on query_kind in context between DDL and DML parts of the query with CREATE AS SELECT. Can't we in general create another context for SELECT part?

@CheSema
Copy link
Copy Markdown
Member Author

CheSema commented Aug 7, 2025

Can't we in general create another context for SELECT part?

I think, yes, we can create different context for different parts of query.

@CheSema CheSema requested a review from devcrafter August 7, 2025 12:30
@CheSema CheSema added the pr-must-backport Pull request should be backported intentionally. Use this label with great care! label Aug 7, 2025
@devcrafter
Copy link
Copy Markdown
Member

Can't we in general create another context for SELECT part?

I think, yes, we can create different context for different parts of query.

Then wouldn't it be easier to create new context for select and use update the query kind there?

@CheSema
Copy link
Copy Markdown
Member Author

CheSema commented Aug 7, 2025

Can't we in general create another context for SELECT part?

I think, yes, we can create different context for different parts of query.

Then wouldn't it be easier to create new context for select and use update the query kind there?

I do not know how. My response was more theoretical. That could be good approach, but I do not know how to implement it. Research is required.

@devcrafter
Copy link
Copy Markdown
Member

Can't we in general create another context for SELECT part?

I think, yes, we can create different context for different parts of query.

Then wouldn't it be easier to create new context for select and use update the query kind there?

I do not know how. My response was more theoretical. That could be good approach, but I do not know how to implement it. Research is required.

I assume this is the place where we can update the context and run insert select with it

return InterpreterInsertQuery(
insert,
getContext(),
getContext()->getSettingsRef()[Setting::insert_allow_materialized_columns],
/* no_squash */ false,
/* no_destination */ false,
/* async_isnert */ false)
.execute();

@CheSema CheSema requested a review from devcrafter August 11, 2025 14:25
@CheSema
Copy link
Copy Markdown
Member Author

CheSema commented Aug 12, 2025

Can't we in general create another context for SELECT part?

I think, yes, we can create different context for different parts of query.

Then wouldn't it be easier to create new context for select and use update the query kind there?

I do not know how. My response was more theoretical. That could be good approach, but I do not know how to implement it. Research is required.

I assume this is the place where we can update the context and run insert select with it

return InterpreterInsertQuery(
insert,
getContext(),
getContext()->getSettingsRef()[Setting::insert_allow_materialized_columns],
/* no_squash */ false,
/* no_destination */ false,
/* async_isnert */ false)
.execute();

It does not work.

https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=84753&sha=e10634b9148e2bc1ab809184cd994c4b147e360a&name_0=PR&name_1=Stateless+tests+%28amd_binary%2C+old+analyzer%2C+s3+storage%2C+DatabaseReplicated%2C+parallel%29

at QueryAnalyzer::resolveTableFunction
table function is executed scope_context->getQueryContext()->executeTableFunction
with context from QueryAnalyzer auto & scope_context = scope.context;.
In that context query kind is set as SECONDARY_QUERY.

SCOPE_EXIT({
getContext()->setQueryKind(previous_query_kind);
});
getContext()->setQueryKind(ClientInfo::QueryKind::INITIAL_QUERY);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check CREATE AS SELECT queries (w/o REPLACE). AFAIS, it's different code path and doesn't use Context::setQueryContext()

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not understand what do you mean.

InterpreterCreateQuery::doCreateOrReplaceTable also calls fillTableIfNeeded.

@CheSema
Copy link
Copy Markdown
Member Author

CheSema commented Aug 14, 2025

We got in dead end here.

I think that the version on f8e3e07ec248e8fe01336b0bffc9247fc1c32ae6 commit it the right way to go.
But Igor does not have a strong opinion here and could not approve that version of code.
In other hand I could not agree with the current way. This is a tricky way, if some thing goes wrong, I do not know how it should be transformed further.

The alternative way is forbid to run create as select with *Cluster functions.

@devcrafter
Copy link
Copy Markdown
Member

The alternative way is forbid to run create as select with *Cluster functions.

Let's first, simply prevent replacing s3 with s3cluster in DDL queries. cc @thevar1able

thevar1able added a commit that referenced this pull request Aug 16, 2025
@thevar1able
Copy link
Copy Markdown
Member

Test 03579_create_table_populate_from_s3 is green on master.
Closing in favor of #85734

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-bugfix Pull request with bugfix, not backported by default pr-must-backport Pull request should be backported intentionally. Use this label with great care!

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants