Bring the existing `QueryFuzzer` (used in clickhouse-client) to the server. Two new experimental settings control it:
- `ast_fuzzer_runs` (Float, default 0): 0 disables, 0<v<1 is probability, >=1 is the number of fuzzed queries per normal query.
- `ast_fuzzer_any_query` (Bool, default false): when false only read-only queries are fuzzed; when true all query types are fuzzed.
A single global `QueryFuzzer` instance accumulates AST fragments from all queries across all sessions, producing increasingly interesting mutations over time. Fuzzed queries are executed internally with results discarded; failures are logged at TRACE level and fed back via `notifyQueryFailed`.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a non-deterministic SQL function `fuzzQuery` that parses a query string, applies random AST mutations via the global `QueryFuzzer`, and returns the fuzzed query as a string. Guarded by `allow_fuzz_query_functions` setting. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add `.introduced_in = {26, 2}` to `fuzzQuery` function documentation
to fix the `02415_all_new_functions_must_have_version_information` test.
- Add `allow_fuzz_query_functions` to `enableAllExperimentalSettings.cpp`
to fix the style check.
#97568
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…zzed queries Fuzzed queries are expected to fail with errors. Set `send_logs_level = 'fatal'` to prevent those `<Error>` messages from appearing in stderr, which causes the flaky check to report "having stderror". https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=97568&sha=f7b86ffd9e6d3cca1f8d3b1beba038f13aa119af&name_0=PR&name_1=Stateless%20tests%20%28amd_tsan%2C%20flaky%20check%29 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add ProfileEvent for the number of fuzzed queries attempted and CurrentMetric on the current size of the fuzzer's accumulated state.
- Fix zero byte handling in `fuzzQuery`: exclude trailing zero bytes from `ColumnString` data when computing the `end` pointer for the parser, and remove unused `single_offset` variable.
- Remove `ast_fuzzer_any_query` test case that could replace `SELECT` with other statements and break other tests.
- Generalize Bernoulli distribution for fractional `ast_fuzzer_runs` values to work with both integer and fractional parts.
- Add `ASTFuzzerAccumulatedFragments` metric and `ASTFuzzerQueries` profile event for observability.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
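One way to read the "integer and fractional parts" generalization is sketched below. This is an assumption about the intent, not the PR's code: the helper name `numFuzzerRuns` is hypothetical, and the rule it applies (run the integer part unconditionally, then one extra run with probability equal to the fractional part) is inferred from the commit message. It also assumes the setting value is small enough to fit in `size_t`.

```cpp
#include <cmath>
#include <cstddef>
#include <random>

// Hypothetical helper (not from the PR): map a fractional ast_fuzzer_runs
// value to a concrete number of fuzzed runs for one query.
size_t numFuzzerRuns(double runs, std::mt19937_64 & rng)
{
    // Zero, negative and NaN values are all treated as "disabled" here.
    if (!(runs > 0))
        return 0;

    const double whole = std::floor(runs);
    const double fraction = runs - whole;

    // The integer part always runs.
    size_t result = static_cast<size_t>(whole);

    // The fractional part is the probability of one additional run.
    if (fraction > 0 && std::bernoulli_distribution(fraction)(rng))
        ++result;

    return result;
}
```

Under this reading, `0.25` yields one run for roughly a quarter of queries, `3` always yields three runs, and `2.5` yields two or three runs with equal probability.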
The review comment about zero bytes referred to the output buffer management, not the input parsing. The `end` pointer should include the full range from `ColumnString`. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…r::fuzzMain` Instead of setting the metric at each call site after calling `fuzzMain`, set it once inside the fuzzer itself. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
af59287 to 9785d6e
This breaks stress tests: https://pastila.nl/?0028818a/843ace35c256b494621eda86a3adbb60#dqCrkutAiLssIPA7vVG1HA==GCM
Fix: #97835
…uzzer Add server-side AST fuzzer
std::pair<std::shared_ptr<QueryFuzzer>, std::unique_lock<std::mutex>> getGlobalASTFuzzer()
{
    static std::mutex mutex;
    static std::shared_ptr<QueryFuzzer> fuzzer = std::make_shared<QueryFuzzer>(randomSeed());
Is the main reason for a global `QueryFuzzer` (instead of, for example, one per `FunctionFuzzQuery`) that it works better the more queries are fed into it? Maybe we could still store it in the global context?
The original motivation was to make sure that queries from .sh tests are also used for fuzzing.
But now the main motivation is combining it with the Stress test and BuzzHouse. It appears to be super powerful: #98138
The goal is: if some bug can be found - increase the probability of finding it :)
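For readers following the thread, the accessor pattern under review (a function-local static instance returned together with a held lock) can be sketched in isolation. This is a minimal illustration only: `FragmentPool` and `getGlobalPool` are hypothetical stand-ins for `QueryFuzzer` and `getGlobalASTFuzzer`, and the real fuzzer's interface is not reproduced here.

```cpp
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for QueryFuzzer: it only accumulates fragments.
struct FragmentPool
{
    std::vector<std::string> fragments;
    void add(std::string fragment) { fragments.push_back(std::move(fragment)); }
};

// Returns the process-wide pool together with a lock that is already held.
// The caller has exclusive access until the returned lock goes out of scope.
std::pair<std::shared_ptr<FragmentPool>, std::unique_lock<std::mutex>> getGlobalPool()
{
    // Function-local statics are initialized once, thread-safely (C++11 and later).
    static std::mutex mutex;
    static std::shared_ptr<FragmentPool> pool = std::make_shared<FragmentPool>();
    return {pool, std::unique_lock<std::mutex>(mutex)};
}

// Example caller: every session feeds the same shared state.
void recordQuery(const std::string & query)
{
    auto [pool, lock] = getGlobalPool();
    pool->add(query);
}   // lock is released here
```

Returning the lock alongside the pointer makes it hard to touch the shared state without holding the mutex, at the cost of serializing all callers. A caller must not call the accessor again while still holding the first lock, because `std::mutex` is not recursive.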
Closes #28107.
Motivation: We can enable it in Stress tests.
Also, we can combine BuzzHouse with ASTFuzzer.
Summary
- Bring `QueryFuzzer` (from clickhouse-client) to the server side
- Two new settings: `ast_fuzzer_runs` (number/probability of fuzzed queries per normal query) and `ast_fuzzer_any_query` (whether to fuzz DDL/INSERT or only read-only queries)
- A single global `QueryFuzzer` instance accumulates AST fragments across all sessions, producing increasingly interesting mutations over time

Test plan
- `SET ast_fuzzer_runs = 3; SELECT 1` produces fuzzed queries visible in server TRACE logs
- `SET ast_fuzzer_any_query = 1`
- `03833_server_ast_fuzzer` passes

Changelog category (leave one):

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Add server-side AST fuzzer controlled by `ast_fuzzer_runs` and `ast_fuzzer_any_query` settings. When enabled, the server runs randomized mutations of each query after its normal execution, discarding the results.

Documentation entry for user-facing changes
New experimental settings:
- `ast_fuzzer_runs` (Float, default 0): Controls the server-side AST fuzzer. 0 = disabled, 0 < value < 1 = probability of one fuzzed run, >= 1 = number of fuzzed runs per query.
- `ast_fuzzer_any_query` (Bool, default false): When false, only read-only queries (SELECT, EXPLAIN, SHOW, DESCRIBE, EXISTS) are fuzzed. When true, all query types are fuzzed.

Example:
`SET ast_fuzzer_runs = 3; SELECT number FROM numbers(10) WHERE number > 5;`

🤖 Generated with Claude Code