Hopelessly fix test_userspace_page_cache flakiness #93653
alexey-milovidov merged 12 commits into master
I don't understand the output in CI. https://s3.amazonaws.com/clickhouse-test-reports/PRs/93653/1322a027cf43343fdaec5765054d63bf127fd9e4//integration_tests_amd_asan_targeted/job.log says
Why skipped? What causes the error?
On my machine
@maxknv does this look familiar by any chance?
Skipped because of this
This job basically just starts the cluster at the test suite setup and stops it at teardown. Stopping the cluster hangs for some reason, so the failure is not exactly in the test case. No problems in the server logs? You can try running the job locally: `python -m ci.praktika run 'Integration tests (amd_asan, targeted)' --test test_userspace_page_cache`
@al13n321, is there some hope?
Don't create `node_smol` container in sanitizer builds at all.

Previously the test was skipped inside the test body, but the container was still started and had to be shut down. Under ASan the leak checker at exit is very slow with high `max_server_memory_usage`, causing `docker compose stop` to hang. Detect sanitizer builds early using `clickhouse local` (no server needed) and conditionally skip adding the instance before `cluster.start`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
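The idea in this commit, deciding before `cluster.start` whether the memory-constrained node exists at all, can be sketched as follows. This is a minimal illustration, not the code from the PR: `FakeCluster` is a hypothetical stand-in for the integration-test cluster object, and the instance name `node` is an assumption (only `node_smol` is named in the commit message).

```python
from dataclasses import dataclass, field


@dataclass
class FakeCluster:
    """Hypothetical stand-in for the integration-test cluster object."""

    instances: dict = field(default_factory=dict)
    started: bool = False

    def add_instance(self, name, **kwargs):
        # Instances can only be registered before the cluster is started.
        if self.started:
            raise RuntimeError("instances must be added before start()")
        self.instances[name] = kwargs
        return name

    def start(self):
        self.started = True


def build_cluster(is_sanitizer_build: bool) -> FakeCluster:
    cluster = FakeCluster()
    cluster.add_instance("node")  # assumed name for the regular node
    # Under sanitizers, never register node_smol: a container that was never
    # created does not have to be stopped at teardown, so the slow leak check
    # at exit cannot stall the shutdown.
    if not is_sanitizer_build:
        cluster.add_instance("node_smol")
    cluster.start()
    return cluster
```

Skipping the test inside its body is not enough in this scenario, because the container lifecycle is tied to suite setup and teardown rather than to any single test case.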
The `_is_sanitizer_build` function used the non-existent `hasAddressSanitizer()` function, causing `clickhouse local` to fail and the check to always return False. This meant sanitizer builds were never detected, so `node_smol` was always created and `test_size_adjustment` was never skipped in ASan/TSan runs.

Fix by querying `system.build_options` for `CXX_FLAGS` containing `-fsanitize=`, which is the same approach used by the integration test framework's `is_built_with_sanitizer` method.

Also add proper cleanup: drop the table at test start (to handle previous failed runs in the flaky runner) and use try/finally for cleanup to prevent cascading "Table already exists" failures.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
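A rough sketch of the two fixes described in this commit, under stated assumptions. The helper names (`flags_indicate_sanitizer`, `is_sanitizer_build`, `run_with_cleanup`), the binary name `clickhouse`, and the table schema are hypothetical and not taken from the PR; only the `system.build_options` / `CXX_FLAGS` / `-fsanitize=` check and the drop-then-try/finally pattern come from the commit message.

```python
import subprocess

# Query taken from the approach described above: read the compiler flags
# recorded in the binary and look for a sanitizer option.
BUILD_FLAGS_QUERY = "SELECT value FROM system.build_options WHERE name = 'CXX_FLAGS'"


def flags_indicate_sanitizer(cxx_flags: str) -> bool:
    """Return True if the compiler flags contain any -fsanitize= option."""
    return "-fsanitize=" in cxx_flags


def is_sanitizer_build(binary: str = "clickhouse") -> bool:
    """Ask `clickhouse local` for the build flags; no running server needed.

    check=True makes a failing query raise instead of being silently read as
    "not a sanitizer build", which is the failure mode described above.
    """
    result = subprocess.run(
        [binary, "local", "--query", BUILD_FLAGS_QUERY],
        capture_output=True,
        text=True,
        check=True,
    )
    return flags_indicate_sanitizer(result.stdout)


def run_with_cleanup(query, table: str, body) -> None:
    """Run `body` against a fresh table and always drop the table afterwards.

    `query` is any callable that executes SQL (hypothetical; in a real test it
    would be bound to a node). The schema below is only a placeholder.
    """
    # Drop leftovers from a previous failed run before creating the table.
    query(f"DROP TABLE IF EXISTS {table} SYNC")
    try:
        query(f"CREATE TABLE {table} (x UInt64) ENGINE = MergeTree ORDER BY x")
        body()
    finally:
        # Runs even if body() raises, so the next run starts clean.
        query(f"DROP TABLE IF EXISTS {table} SYNC")
```

Keeping the string check separate from the subprocess call makes the detection logic testable without a ClickHouse binary.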
…hing type

When `FunctionVariantAdaptor` executes a function on a Variant column that contains a single variant (e.g., `Array(Nothing)` from an empty array literal `[]`), the nested function can return a result of type `Nothing`. The code only checked for `Nullable(Nothing)` via `onlyNull` but missed plain `Nothing`, causing a failed cast to the expected `Variant(...)` result type and a `LOGICAL_ERROR` exception.

Add `isNothing` checks alongside existing `onlyNull` checks in all three execution paths (single variant no NULLs, single variant with NULLs, multiple variants) to treat `Nothing` results as defaults/NULLs.

https://s3.amazonaws.com/clickhouse-test-reports/json.html?REF=master&sha=42be5daa2cfd617b45ee36eeec6d72fd405fba41&name_0=MasterCI&name_1=AST%20fuzzer%20%28amd_debug%29

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…rom test

The setting is obsolete (always true). The test already has `SET enable_analyzer = 1` which is needed for Variant type inference in UNION ALL.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Closes #92761