
Revert "Remove test_storage_nats"#92155

Merged
antaljanosbenjamin merged 11 commits into master from revert-92140-remove-test-nats
Jan 7, 2026

Conversation

@nikitamikhaylov
Member

@nikitamikhaylov nikitamikhaylov commented Dec 15, 2025

@clickhouse-gh
Contributor

clickhouse-gh bot commented Dec 15, 2025

Workflow [PR], commit [af6aa05]

Summary:

@clickhouse-gh clickhouse-gh bot added the pr-not-for-changelog This PR should not be mentioned in the changelog label Dec 15, 2025
@antaljanosbenjamin
Member

A small explanation: based on the extra debug logs that were logged with the failed assertion, it is clear that the messages were produced twice (each message is in the view 10 times, while only 5 threads were inserting). Checking the INSERT queries, we can see that 10 insert queries were run:

2025.12.13 00:08:59.151625 [ 788 ] {a50c584f-1ee2-44ef-b480-7bdc739a29e4} <Debug> executeQuery: (from 172.16.3.1:63204) (query 1, line 1) INSERT INTO test.nats_overload VALUES  (stage: Complete)
2025.12.13 00:08:59.633606 [ 14 ] {dd043ec4-5bdb-4958-8365-b54471266af5} <Debug> executeQuery: (from 172.16.3.1:63218) (query 1, line 1) INSERT INTO test.nats_overload VALUES  (stage: Complete)
2025.12.13 00:09:00.237751 [ 15 ] {a19dcf3c-be04-4e06-8497-c80518245a1c} <Debug> executeQuery: (from 172.16.3.1:63220) (query 1, line 1) INSERT INTO test.nats_overload VALUES  (stage: Complete)
2025.12.13 00:09:00.352751 [ 16 ] {9b015d88-4a50-4a51-b161-6342a723ff80} <Debug> executeQuery: (from 172.16.3.1:63228) (query 1, line 1) INSERT INTO test.nats_overload VALUES  (stage: Complete)
2025.12.13 00:09:01.115297 [ 777 ] {46d54681-6943-461a-a777-79a1d5f38b3f} <Debug> executeQuery: (from 172.16.3.1:63426) (query 1, line 1) INSERT INTO test.nats_overload VALUES  (stage: Complete)
2025.12.13 00:14:01.380516 [ 775 ] {de65ea26-79b0-42a8-af31-a5914a559054} <Debug> executeQuery: (from 172.16.3.1:60700) (query 1, line 1) INSERT INTO test.nats_overload VALUES  (stage: Complete)
2025.12.13 00:14:02.072654 [ 805 ] {45b0dcbc-ed52-40ba-8e6b-bfc7fb3f07d4} <Debug> executeQuery: (from 172.16.3.1:60716) (query 1, line 1) INSERT INTO test.nats_overload VALUES  (stage: Complete)
2025.12.13 00:14:02.594949 [ 808 ] {a6018490-0db6-4574-9637-8877714004a4} <Debug> executeQuery: (from 172.16.3.1:60720) (query 1, line 1) INSERT INTO test.nats_overload VALUES  (stage: Complete)
2025.12.13 00:14:02.760327 [ 809 ] {b9baf69b-2c4b-4caf-b11d-77fca9a55fa7} <Debug> executeQuery: (from 172.16.3.1:60728) (query 1, line 1) INSERT INTO test.nats_overload VALUES  (stage: Complete)
2025.12.13 00:14:03.429156 [ 811 ] {6a61c6a2-2c87-4f55-81f3-383638d4060a} <Debug> executeQuery: (from 172.16.3.1:60754) (query 1, line 1) INSERT INTO test.nats_overload VALUES  (stage: Complete)
2025.12.13 00:14:35.835617 [ 788 ] {a50c584f-1ee2-44ef-b480-7bdc739a29e4} <Error> executeQuery: Code: 210. DB::NetException: I/O error: Broken pipe, while writing to socket (172.16.3.3:9000 -> 172.16.3.1:63204). (NETWORK_ERROR) (version 25.12.1.494) (from 172.16.3.1:63204) (query 1, line 1) (in query: INSERT INTO test.nats_overload VALUES ), Stack trace (when copying this message, always include the lines below):
2025.12.13 00:14:41.899860 [ 14 ] {dd043ec4-5bdb-4958-8365-b54471266af5} <Error> executeQuery: Code: 210. DB::NetException: I/O error: Broken pipe, while writing to socket (172.16.3.3:9000 -> 172.16.3.1:63218). (NETWORK_ERROR) (version 25.12.1.494) (from 172.16.3.1:63218) (query 1, line 1) (in query: INSERT INTO test.nats_overload VALUES ), Stack trace (when copying this message, always include the lines below):
2025.12.13 00:14:44.580705 [ 15 ] {a19dcf3c-be04-4e06-8497-c80518245a1c} <Error> executeQuery: Code: 210. DB::NetException: I/O error: Broken pipe, while writing to socket (172.16.3.3:9000 -> 172.16.3.1:63220). (NETWORK_ERROR) (version 25.12.1.494) (from 172.16.3.1:63220) (query 1, line 1) (in query: INSERT INTO test.nats_overload VALUES ), Stack trace (when copying this message, always include the lines below):
2025.12.13 00:14:44.835810 [ 16 ] {9b015d88-4a50-4a51-b161-6342a723ff80} <Error> executeQuery: Code: 210. DB::NetException: I/O error: Broken pipe, while writing to socket (172.16.3.3:9000 -> 172.16.3.1:63228). (NETWORK_ERROR) (version 25.12.1.494) (from 172.16.3.1:63228) (query 1, line 1) (in query: INSERT INTO test.nats_overload VALUES ), Stack trace (when copying this message, always include the lines below):
2025.12.13 00:14:46.501383 [ 777 ] {46d54681-6943-461a-a777-79a1d5f38b3f} <Error> executeQuery: Code: 210. DB::NetException: I/O error: Broken pipe, while writing to socket (172.16.3.3:9000 -> 172.16.3.1:63426). (NETWORK_ERROR) (version 25.12.1.494) (from 172.16.3.1:63426) (query 1, line 1) (in query: INSERT INTO test.nats_overload VALUES ), Stack trace (when copying this message, always include the lines below):

The issue is that the inserts took longer than the default receive_timeout, which is 300 seconds:

2025.12.13 00:14:35.671746 [ 788 ] {a50c584f-1ee2-44ef-b480-7bdc739a29e4} <Debug> TCPHandler: Processed in 336.520696743 sec.

By increasing receive_timeout we should avoid unnecessary retries.
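
As a rough cross-check of the explanation above, the sketch below computes the gap between the first five INSERTs and the second five, using the timestamps copied from the log lines above. Pairing them in order of appearance is an assumption (the logs do not say which retry belongs to which original query), and the names `RECEIVE_TIMEOUT_SEC`, `first_attempts`, `retries`, and `retry_gaps` are made up for this illustration.

```python
from datetime import datetime

# Default receive_timeout in seconds, as stated in the comment above.
RECEIVE_TIMEOUT_SEC = 300

# Timestamp format used by the server log lines quoted above.
LOG_TIME_FORMAT = "%Y.%m.%d %H:%M:%S.%f"

# Start times of the first five INSERT queries (copied from the logs).
first_attempts = [
    "2025.12.13 00:08:59.151625",
    "2025.12.13 00:08:59.633606",
    "2025.12.13 00:09:00.237751",
    "2025.12.13 00:09:00.352751",
    "2025.12.13 00:09:01.115297",
]

# Start times of the second five INSERT queries (copied from the logs).
retries = [
    "2025.12.13 00:14:01.380516",
    "2025.12.13 00:14:02.072654",
    "2025.12.13 00:14:02.594949",
    "2025.12.13 00:14:02.760327",
    "2025.12.13 00:14:03.429156",
]

# Processing time reported by TCPHandler for the first query.
processed_sec = 336.520696743


def retry_gaps(starts, restarts):
    """Return the seconds between each start and its assumed retry.

    Assumption: the i-th retry corresponds to the i-th original query.
    """
    gaps = []
    for start, restart in zip(starts, restarts):
        begin = datetime.strptime(start, LOG_TIME_FORMAT)
        again = datetime.strptime(restart, LOG_TIME_FORMAT)
        gaps.append((again - begin).total_seconds())
    return gaps


gaps = retry_gaps(first_attempts, retries)
for gap in gaps:
    print(f"retry started {gap:.3f} s after the original INSERT")

print("processing exceeded timeout:", processed_sec > RECEIVE_TIMEOUT_SEC)
```

Under that pairing assumption, every second INSERT starts a little over 300 seconds after the first one, which fits the theory that the client gave up at receive_timeout and retried while the original query was still running on the server.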

@antaljanosbenjamin
Member

@nikitamikhaylov I will rerun the flaky test check, but I think the only problematic test is test_storage_nats/test_nats_jet_stream.py::test_nats_overloaded_insert. If you check the CI DB, you can see that the other tests failed only because of:

  1. CI infra issues that caused mass failures
  2. Issues with our CI between 2025-11-28 and 2025-12-01, when a lot of integration tests failed en masse

Therefore I think that after rerunning the flaky test check once, we should be good to go. In my opinion the explanation above is correct and it is about the flaky tests, so there should be no issue merging it.

@antaljanosbenjamin
Member

Restarted the whole CI to run the NATS tests again.

@antaljanosbenjamin antaljanosbenjamin marked this pull request as ready for review January 7, 2026 09:57
@antaljanosbenjamin antaljanosbenjamin added this pull request to the merge queue Jan 7, 2026
Merged via the queue into master with commit 8c18398 Jan 7, 2026
131 checks passed
@antaljanosbenjamin antaljanosbenjamin deleted the revert-92140-remove-test-nats branch January 7, 2026 10:01
@robot-clickhouse robot-clickhouse added the pr-synced-to-cloud The PR is synced to the cloud repo label Jan 7, 2026

Labels

pr-not-for-changelog: This PR should not be mentioned in the changelog
pr-synced-to-cloud: The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Flaky test_storage_nats/test_nats_jet_stream.py::test_nats_overloaded_insert

3 participants