Skip to content

Fix flaky test 02377_extend_protocol_with_query_parameters#101437

Open
groeneai wants to merge 1 commit intoClickHouse:masterfrom
groeneai:fix/flaky-02377-map-key-order
Open

Fix flaky test 02377_extend_protocol_with_query_parameters#101437
groeneai wants to merge 1 commit intoClickHouse:masterfrom
groeneai:fix/flaky-02377-map-key-order

Conversation

@groeneai
Copy link
Copy Markdown
Contributor

When CI randomizes map_serialization_version and/or map_serialization_version_for_zero_level_parts to with_buckets, Map keys get distributed into hash buckets which changes their iteration order on readback. The test inserts data into a MergeTree table with Map columns and expects a specific key order in the output — but bucketed serialization doesn't preserve insertion order.

This caused 10 failures across 7 unrelated PRs in the last 30 days (amd_tsan, amd_asan_ubsan, amd_msan, amd_debug, arm_binary, amd_llvm_coverage).

The fix pins map_serialization_version = 'basic' and map_serialization_version_for_zero_level_parts = 'basic' in the test's CREATE TABLE SETTINGS. This is a targeted pin — the test validates query parameter passing with complex types, not Map serialization behavior.

Verified: 50/50 passes with full randomization.

Changelog category (leave one):

  • CI Fix or Improvement (changelog entry is not required)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

...

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Pin map_serialization_version and map_serialization_version_for_zero_level_parts
to 'basic' in the test's CREATE TABLE SETTINGS. When CI randomizes these to
'with_buckets', Map keys get distributed into hash buckets which changes their
iteration order on readback, causing non-deterministic output. This test validates
query parameter passing, not Map serialization — key order is irrelevant.

10 failures across 7 PRs in 30 days (amd_tsan, amd_asan_ubsan, amd_msan,
amd_debug, arm_binary, amd_llvm_coverage).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@groeneai
Copy link
Copy Markdown
Contributor Author

Pre-PR Validation Gate

Session: cron:clickhouse-ci-task-worker:20260331-221500

a) Deterministic repro?

clickhouse-client --allow_merge_tree_settings \
  --map_serialization_version with_buckets \
  --map_serialization_version_for_zero_level_parts with_buckets \
  --max_buckets_in_map 4 --map_buckets_strategy constant --map_buckets_min_avg_size 1 \
  -q "CREATE TABLE t (id Int64, map_arr Map(UInt8, Array(UInt8))) ENGINE=MergeTree ORDER BY id;
      INSERT INTO t VALUES (42, {10:[11,12], 13:[14,15]});
      SELECT * FROM t;  -- outputs {13:[14,15],10:[11,12]} instead of {10:[11,12],13:[14,15]}
      DROP TABLE t;"

b) Root cause explained?
When map_serialization_version=with_buckets, Map keys are distributed into hash buckets. The bucket hash of UInt8 key 13 maps to a lower bucket than key 10 (with 4 buckets), so keys are read back in bucket order rather than insertion order. Similarly for String keys with 31 buckets. The test expects insertion-order output but bucketed serialization doesn't preserve it.

c) Fix matches root cause?
Pinning map_serialization_version='basic' at the table level ensures keys are stored in insertion order, directly addressing the bucketed serialization root cause.

d) Test intent preserved?
The test validates query parameter parsing ({name:Type}) with complex types. Map key ordering is orthogonal — it's a data layout concern, not a parameter passing concern.

e) Both directions demonstrated?
Without fix + bucketed settings: {13:[14,15],10:[11,12]} (FAIL). With fix: {10:[11,12],13:[14,15]} (PASS). 50/50 passes under full randomization.

@groeneai
Copy link
Copy Markdown
Contributor Author

cc @alexey-milovidov — could you review this? Test-only fix pinning map serialization settings to prevent non-deterministic Map key ordering in the query parameters test.

@azat azat added the can be tested Allows running workflows for external contributors label Apr 1, 2026
@azat azat self-assigned this Apr 1, 2026
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Apr 1, 2026

Workflow [PR], commit [551b63d]

Summary:

job_name test_name status info comment
Stress test (arm_release) failure
Server died FAIL cidb
Logical error: Shard number is greater than shard count: shard_num=A shard_count=B cluster=C (STID: 5066-457d) FAIL cidb
Stress test (arm_msan) failure
Server died FAIL cidb
MemorySanitizer: use-of-uninitialized-value (STID: 1003-358c) FAIL cidb, issue

AI Review

Summary

This PR fixes flakiness in 02377_extend_protocol_with_query_parameters by pinning map_serialization_version and map_serialization_version_for_zero_level_parts to basic in the test table settings so Map key iteration order remains deterministic. The change is narrowly scoped to test setup, aligns with the stated motivation, and does not introduce correctness, safety, or performance risks in production code.

ClickHouse Rules
Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsChangesHistory.cpp
PR metadata quality
Safe rollout
Compilation time
Final Verdict
  • Status: ✅ Approve

@clickhouse-gh clickhouse-gh bot added the pr-ci label Apr 1, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-ci

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants