Skip to content

Fix upgrade check to respect no-{sanitizer} tags#95652

Closed
nikitamikhaylov wants to merge 1 commit intomasterfrom
fix-upgrade-check-sanitizer-tag-filtering
Closed

Fix upgrade check to respect no-{sanitizer} tags#95652
nikitamikhaylov wants to merge 1 commit intomasterfrom
fix-upgrade-check-sanitizer-tag-filtering

Conversation

@nikitamikhaylov
Copy link
Copy Markdown
Member

Upgrade check failure example: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=95611&sha=254a0211695d3925b5c924650b098045831546e9&name_0=PR&name_1=Upgrade%20check%20%28amd_msan%29&name_1=Upgrade%20check%20%28amd_msan%29

Changelog category (leave one):

  • CI Fix or Improvement (changelog entry is not required)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Respect the skip tags (no-msan, no-asan) for the Upgrade check with a respective sanitizer. Otherwise, it leads to a false positive failure when the previous version (release build) creates a table that is simply not supported.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

The upgrade check was failing with false positives when the new binary
was built with sanitizers (e.g., MSAN) but the old version created tables
with engines that are not supported in sanitizer builds (e.g., `DeltaLakeLocal`).

The issue was that:
1. Tag filtering (`no-msan`, `no-asan`, etc.) was skipped during stress tests
2. Build flags were detected from the running (old) server, not the new binary

This fix:
- Adds `--extra-build-flags` argument to `clickhouse-test` to pass explicit
  sanitizer flags
- Modifies tag filtering to apply during upgrade checks even though they
  are stress tests
- Detects sanitizers from the new packages in `upgrade_runner.sh` before
  running tests on the old version
- Adds `no-msan` tag to `03784_bad_base_backup.sh` which uses `DeltaLakeLocal`

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Jan 30, 2026

Workflow [PR], commit [61b0e4f]

Summary:

job_name test_name status info comment
Stateless tests (amd_binary, old analyzer, s3 storage, DatabaseReplicated, parallel) failure
03820_plain_rewritable_over_another_disk_with_same_path FAIL cidb
01623_constraints_column_swap FAIL cidb
00674_join_on_syntax FAIL cidb
03712_tuple_inside_nullable_implicit_cast_for_in_function_2 FAIL cidb
01338_long_select_and_alter_zookeeper FAIL cidb
00719_parallel_ddl_table FAIL cidb
02292_hash_array_tuples FAIL cidb
02900_add_subtract_interval_with_string_date FAIL cidb
03258_dynamic_in_functions_weak_ptr_exception FAIL cidb
01675_data_type_coroutine_2 FAIL cidb
108 more test cases not shown
Upgrade check (amd_msan) failure
Error message in clickhouse-server.log (see upgrade_error_messages.txt) FAIL cidb

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes the upgrade check to properly respect test skip tags (like no-msan, no-asan) when running with sanitizer builds. The upgrade check previously failed when old release versions created tables with engines not supported in sanitizer builds, causing false positive test failures.

Changes:

  • Detects the sanitizer type from new ClickHouse packages before installing old ones
  • Passes the detected sanitizer flags to clickhouse-test via a new --extra-build-flags parameter
  • Updates test filtering logic to apply no-{sanitizer} tags during upgrade checks
  • Adds no-msan tag to a test that uses DeltaLake (unsupported in MSAN builds)

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
tests/docker_scripts/upgrade_runner.sh Adds sanitizer detection logic by extracting and querying new packages, then passes detected sanitizer to test runner
tests/clickhouse-test Adds --extra-build-flags argument and merges these flags into build_flags; updates test filtering to apply no-{sanitizer} tags during upgrade checks
tests/queries/0_stateless/03784_bad_base_backup.sh Adds no-msan tag with comment explaining DeltaLakeLocal is unsupported in MSAN builds

@ClickHouse ClickHouse deleted a comment from Copilot AI Jan 30, 2026
@azat azat self-assigned this Feb 1, 2026
@azat azat added the 🍃 green ci 🌿 Fixing flaky tests in CI label Feb 1, 2026
# Detect sanitizer from new packages before installing old ones.
# This is needed to skip tests with no-{sanitizer} tags during upgrade check,
# because tests may create tables with engines that are not supported in sanitizer builds.
echo "Detecting sanitizer from new packages..."
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is likely not fast, we can detect it from the check name

@@ -1,4 +1,6 @@
#!/usr/bin/env bash
# Tags: no-msan
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How did it pass the CI initially?

# because tests may create tables with engines that are not supported in sanitizer builds.
echo "Detecting sanitizer from new packages..."
NEW_BINARY_SANITIZER=""
dpkg -x $PACKAGES_DIR/clickhouse-common-static_*.deb /tmp/new_clickhouse_extract
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also there is a better way to extract one binary, I remember I had this oneliner - ar p *.deb data.tar.gz | tar -xzf- ./usr/bin/clickhouse, and you can even strip this prefix

@fm4v
Copy link
Copy Markdown
Member

fm4v commented Feb 2, 2026

This seems like a real upgrade issue, but I can't reproduce it locally yet.

2026.01.30 13:55:13.297916 [ 1607071 ] {} <Error> void DB::AsyncLoader::worker(Pool &): Code: 56. DB::Exception: Unknown table engine DeltaLakeLocal: Cannot attach table `d2_test_amgrptno`.`t9` from metadata file store/e14/e147ae76-bada-439b-9c4b-d3fa98a22b18/t9.sql from query ATTACH TABLE d2_test_amgrptno.t9 UUID \'958afc37-128b-4a8b-85df-a7544f1bd679\' (`c0` String COMMENT \'😉\', `c1` String, `c2` Nullable(UInt256), `c3` Nullable(JSON(max_dynamic_types = 21))) ENGINE = DeltaLakeLocal(\'/var/lib/clickhouse/user_files/03784_bad_base_backup_test_amgrptno_45\', \'Parquet\') SETTINGS iceberg_recent_metadata_file_by_last_updated_ms_field = 0. (UNKNOWN_STORAGE), Stack trace (when copying this message, always include the lines below):

I don't understand why sanitizers are related since the test don't have such tag. We should use a debug build for this check instead if it's actually related. The goal of upgrade check is to find upgrade bugs, not memory errors

@azat
Copy link
Copy Markdown
Member

azat commented Feb 2, 2026

BTW I have a general question, why do we even need upgrade check with MSan, we have stress tests for this. Let's simply leave only upgrade check with release?

@azat
Copy link
Copy Markdown
Member

azat commented Feb 3, 2026

Something like this #95884

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

🍃 green ci 🌿 Fixing flaky tests in CI pr-ci

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants