no-msan in 00804_test_alter_compression_codecs #83421

Merged
alexey-milovidov merged 1 commit into master from omnomsan on Jul 15, 2025

@al13n321 al13n321 commented Jul 8, 2025

Changelog category (leave one):

  • CI Fix or Improvement (changelog entry is not required)

Saw it fail twice with msan on seemingly unrelated PRs:

It normally takes anywhere between 50s and 270s: https://play.clickhouse.com/play?user=play#U0VMRUNUIGNoZWNrX3N0YXJ0X3RpbWUsIGhlYWRfcmVmLCB0ZXN0X2R1cmF0aW9uX21zLCB0ZXN0X3N0YXR1cywgY2hlY2tfc3RhdHVzLCByZXBvcnRfdXJsCkZST00gY2hlY2tzCldIRVJFIDEKICAgIEFORCBjaGVja19zdGFydF90aW1lID49IG5vdygpIC0gSU5URVJWQUwgMjQwIEhPVVIKICAgIEFORCB0ZXN0X3N0YXR1cyAhPSAnU0tJUFBFRCcKICAgIC0tQU5EICh0ZXN0X3N0YXR1cyBMSUtFICdGJScgT1IgdGVzdF9zdGF0dXMgTElLRSAnRSUnKSAKICAgIC0tQU5EIGNoZWNrX3N0YXR1cyAhPSAnc3VjY2VzcycKICAgIGFuZCBjaGVja19uYW1lID0gJ1N0YXRlbGVzcyB0ZXN0cyAoYW1kX21zYW4sIDEvNCknCiAgICBhbmQgdGVzdF9uYW1lID0gJzAwODA0X3Rlc3RfYWx0ZXJfY29tcHJlc3Npb25fY29kZWNzJwpPUkRFUiBCWSBjaGVja19uYW1lLCB0ZXN0X25hbWUsIGNoZWNrX3N0YXJ0X3RpbWU=
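The fragment of the link above appears to carry a base64-encoded SQL query (it begins with `U0VMRUNU`, which is base64 for `SELECT`). For readers who want to see the query without opening the playground, here is a minimal sketch of decoding it. The helper name `decode_play_query` is hypothetical, and the assumption that the fragment is plain base64 of UTF-8 text is inferred from the URL itself, not from any documented API.

```python
import base64
from urllib.parse import urlsplit


def decode_play_query(url: str) -> str:
    """Return the text stored in the '#...' fragment of a play URL.

    Assumption: the fragment is standard base64 of UTF-8 SQL text.
    """
    fragment = urlsplit(url).fragment
    # Restore '=' padding in case it was dropped when the link was copied.
    padded = fragment + "=" * (-len(fragment) % 4)
    return base64.b64decode(padded).decode("utf-8")


if __name__ == "__main__":
    # Round-trip a tiny query to show the shape of the URL.
    encoded = base64.b64encode(b"SELECT 1").decode("ascii")
    print(decode_play_query("https://play.clickhouse.com/play?user=play#" + encoded))
```

Passing the full link from this comment to `decode_play_query` should give the query used to collect the test durations.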

But it occasionally times out after roughly 350s.

The slow part is a simple MergeTree insert with 300k rows:

INSERT INTO large_alter_table_00804 SELECT toDate('2019-01-01'), number, toString(number + rand()) FROM system.numbers LIMIT 300000;

In the failed runs the query completes, seemingly normally, and the server log doesn't have anything interesting for this query.

I don't know why it's so slow and inconsistent in the MSAN build. This PR just disables the test under msan. There may or may not be more to investigate here: maybe the server was overloaded by too many tests running in parallel and we need to decrease parallelism for asan tests, or maybe it was overloaded by one particular bad test running in parallel with this one, or maybe there's some performance bug that gets much worse under msan. I didn't investigate.
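The diff itself is not shown in this conversation, so the following is only a hypothetical sketch of how a per-build exclusion like this can work. It assumes the test file starts with a `-- Tags:` comment line and that a tag of the form `no-<build>` means "skip on that build". `parse_tags` and `should_skip` are illustrative names, not the real test runner's API.

```python
from typing import Set


def parse_tags(first_line: str) -> Set[str]:
    """Extract tags from a line such as '-- Tags: long, no-msan' (assumed format)."""
    prefix = "-- Tags:"
    if not first_line.startswith(prefix):
        return set()
    return {tag.strip() for tag in first_line[len(prefix):].split(",") if tag.strip()}


def should_skip(tags: Set[str], build: str) -> bool:
    """A test tagged 'no-<build>' is skipped on that build (assumed convention)."""
    return f"no-{build}" in tags


if __name__ == "__main__":
    tags = parse_tags("-- Tags: no-msan")
    print(should_skip(tags, "msan"))  # skipped under msan
    print(should_skip(tags, "asan"))  # still runs under asan
```

Under these assumptions the test keeps running on every other build, which matches the intent stated above: the slowness is only sidestepped for msan, not explained.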

@al13n321 al13n321 added the 🍃 green ci 🌿 Fixing flaky tests in CI label Jul 8, 2025

clickhouse-gh bot commented Jul 8, 2025

Workflow [PR], commit [4878fde2]

Summary:

| job_name | test_name | status | info | comment |
|---|---|---|---|---|
| Stateless tests (amd_binary) | | failure | | |
| | 00133_long_shard_memory_tracker_and_exception_safety | FAIL | | |
| Stress test (amd_msan) | | failure | | |
| | Server died | FAIL | | |
| | Hung check failed, possible deadlock found (see hung_check.log) | FAIL | | |
| | Cannot start clickhouse-server | FAIL | | |
| | Server failed to start (see application_errors.txt and clickhouse-server.clean.log) | FAIL | | |
| | Killed by signal (in clickhouse-server.log) | FAIL | | |
| | Fatal message in clickhouse-server.log (see fatal_messages.txt) | FAIL | | |
| | Killed by signal (output files) | FAIL | | |
| | Found signal in gdb.log | FAIL | | |

@alexey-milovidov alexey-milovidov merged commit 0f71659 into master Jul 15, 2025
117 of 123 checks passed
@alexey-milovidov alexey-milovidov deleted the omnomsan branch July 15, 2025 00:54
@robot-ch-test-poll3 robot-ch-test-poll3 added the pr-synced-to-cloud The PR is synced to the cloud repo label Jul 15, 2025