Fix inconsistent AST formatting for EXPLAIN AST INSERT with trailing output options#100174
Fix inconsistent AST formatting for EXPLAIN AST INSERT with trailing output options#100174alexey-milovidov merged 4 commits intomasterfrom
EXPLAIN AST INSERT with trailing output options#100174Conversation
…g output options When `EXPLAIN AST INSERT INTO ... SELECT ... FORMAT Null` is parsed, the `INTO OUTFILE`/`FORMAT`/`SETTINGS` tokens end up on the EXPLAIN query (not on the INSERT's SELECT), because `ParserInsertQuery` does not consume them. During formatting, `ASTExplainQuery::formatQueryImpl` was wrapping the INSERT in parentheses (because INSERT is not an `ASTQueryWithOutput`), producing `EXPLAIN AST (INSERT INTO ...)` which cannot be parsed back — `(INSERT ...)` is not valid syntax. INSERT queries don't need wrapping because the INSERT parser does not consume FORMAT/SETTINGS/INTO OUTFILE at its boundary, so these tokens remain safely on the EXPLAIN during re-parsing. Closes #100131 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Workflow [PR], commit [1655b5e] Summary: ✅ AI ReviewSummaryThis PR fixes ClickHouse Rules
Final Verdict
|
| -- on the EXPLAIN level formats correctly without wrapping INSERT in unparseable parens. | ||
| -- https://github.com/ClickHouse/ClickHouse/issues/100131 | ||
|
|
||
| SELECT formatQuerySingleLine('EXPLAIN AST INSERT INTO t1 SELECT 1 FORMAT Null'); |
There was a problem hiding this comment.
FORMAT case, but the change description and code comment also claim safety for trailing SETTINGS and INTO OUTFILE on EXPLAIN.
Please add at least one formatQuerySingleLine case for ... SETTINGS ... and one for ... INTO OUTFILE ... so we lock in the full intended behavior and avoid partial regression coverage.
src/Parsers/ASTExplainQuery.h
Outdated
| @@ -1,5 +1,6 @@ | |||
| #pragma once | |||
|
|
|||
| #include <Parsers/ASTInsertQuery.h> | |||
There was a problem hiding this comment.
#include <Parsers/ASTInsertQuery.h> to a header and uses dynamic_cast only to detect INSERT.
Because ASTInsertQuery.h pulls heavier transitive dependencies (StorageID, buffers, etc.), this can increase rebuild cost across all TUs including ASTExplainQuery.h.
Please prefer a lightweight check that avoids including ASTInsertQuery.h in this header, e.g. branch on query->getQueryKind() == QueryKind::Insert (and QueryKind::AsyncInsertFlush if relevant), then keep ASTExplainQuery.h free of heavy includes.
…RT detection, add test coverage for `INTO OUTFILE` and `SETTINGS` Replace `dynamic_cast<const ASTInsertQuery *>` with `getQueryKind` checks for `QueryKind::Insert` and `QueryKind::AsyncInsertFlush`, removing the heavyweight `#include <Parsers/ASTInsertQuery.h>` from `ASTExplainQuery.h`. Add `formatQuerySingleLine` test cases for `EXPLAIN AST INSERT ... INTO OUTFILE` and `EXPLAIN AST INSERT ... SETTINGS` to cover all trailing output option variants. #100174 (comment) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…XPLAIN AST INSERT` The INSERT parser consumes `SETTINGS` (and pushes it down to the inner SELECT), and `INTO OUTFILE` formatting escapes quotes differently. These test cases were testing pre-existing behavior unrelated to the original fix for `FORMAT`. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
LLVM Coverage Report
PR changed lines: PR changed-lines coverage: 100.00% (7/7, 0 noise lines excluded) |
…overage [iter 8]
The dynamic score floor `effective_min = min(MAX_EFFECTIVE_MIN, top_score/1000)`
was previously capped at 1e-4. This filtered out tests that cover changed `.h`
files via header expansion — specifically tests that cover UNCHANGED lines in
the same header (non-overlapping regions, assigned SIBLING_DIR_WIDTH=10000).
For such tests: score = PASS_WEIGHT_DIRECT / (SIBLING_DIR_WIDTH × rc)
= 1.0 / (10000 × rc)
At rc=7: score = 1.4e-5 (above 1e-5, below old 1e-4 → was filtered!)
At rc=17: score = 5.9e-6 (still filtered, but adds to the total width_score)
Example: PR #100174 changes `ASTExplainQuery.h` at line 130 (hunk 127–132).
The file has no CIDB coverage at the changed lines, but 20 tests cover the file
at lines 35 (rc=7), 125 (rc=17), 119 (rc=17) etc. Their total width_score
≈ 2.6e-5 was just below the old 1e-4 floor and got silently filtered.
With MAX_EFFECTIVE_MIN=1e-5, these tests pass the filter.
The old cap comment said "top_score above 1/10 treated as 1/10"; now it reads
"top_score above 1/100 treated as 1/100". This is still a conservative cap:
even at top_score=1.0 (rc=1 direct hit), effective_min is only 1e-5, which is
100× above MIN_SCORE=1e-8 and safely filters out true noise.
Evaluation: 70.7% → 74.7% recall (+4.0pp) on 20-PR ground-truth set.
PR 100174 (ASTExplainQuery.h, 20 tests): 25% → 80%.
No regressions across all 20 eval PRs.
When
EXPLAIN AST INSERT INTO ... SELECT ... FORMAT Nullis parsed, theINTO OUTFILE/FORMAT/SETTINGStokens end up on the EXPLAIN query (not on the INSERT's SELECT), becauseParserInsertQuerydoes not consume them. During formatting,ASTExplainQuery::formatQueryImplwas wrapping the INSERT in parentheses (because INSERT is not anASTQueryWithOutput), producingEXPLAIN AST (INSERT INTO ...)which cannot be parsed back —(INSERT ...)is not valid syntax.INSERT queries don't need wrapping because the INSERT parser does not consume FORMAT/SETTINGS/INTO OUTFILE at its boundary, so these tokens remain safely on the EXPLAIN during re-parsing.
Closes #100131
Changelog category (leave one):