Improve waiting for specific line in logs in integration tests (fixes some flakiness on CI)#86030
Merged
azat merged 1 commit intoClickHouse:masterfrom Aug 23, 2025
Merged
Conversation
Recently CI found the failure of test_postgresql_replica_database_engine
[1], but, according to logs:
- the grep executed **before** the log entry appeared in logs
- the log entry appeared **before** timeout expired
<details>
<summary>logs</summary>
```
2025-08-21 16:45:14.632000 [ 683 ] DEBUG : Command:[docker exec roottestpostgresqlreplicadatabaseengine0-gw0-instance-1 bash -c timeout 30 tail -Fn100 "/var/log/clickhouse-server/clickhouse-server.log" | grep -Em 1 'Table postgresql_replica_2 is skipped from replication stream'] (cluster.py:159, run_and_check)
2025-08-21 16:45:44.679000 [ 683 ] DEBUG : Exitcode:1 (cluster.py:187, run_and_check)
```
```
2025.08.21 16:45:14.638793 [ 962 ] {BgSchPool::8d13df61-833b-4141-83e2-0e8e1c30d1c9} <Information> PostgreSQLReplicaConsumer(postgres_database): Table structure of the table postgresql_replica_2 changed (column value2 was not found), will mark it as skipped from replication. Please perform manual DETACH and ATTACH of the table to bring it back
2025.08.21 16:45:14.638813 [ 962 ] {BgSchPool::8d13df61-833b-4141-83e2-0e8e1c30d1c9} <Warning> PostgreSQLReplicaConsumer(postgres_database): Table postgresql_replica_2 is skipped from replication stream because its structure has changes. Please detach this table and reattach to resume the replication (relation id: 16583)
```
</details>
But the test was not succeeded in locating this line.
[1]: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=85765&sha=d9c83add6087386351df45010de71b027af87a76&name_0=PR&name_1=Integration%20tests%20%28amd_binary%2C%204%2F5%29
I don't see any other reason except for line buffer for this to happen,
so let's disable it with `stdbuf`.
Also I've added some introspection with `tee /dev/stderr`, so that now
we will see all entries that the `grep` received.
Contributor
bharatnc
approved these changes
Aug 23, 2025
azat
added a commit
to azat/ClickHouse
that referenced
this pull request
Sep 26, 2025
… way) CI found failuer again [1]: [1]: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=87584&sha=97e95fa30f48eb6414c18ffc59e2c70c38a437d1&name_0=PR&name_1=Integration%20tests%20%28amd_binary%2C%204%2F5%29 And after added ClickHouse#86030, I can see that logs contains rows starting from: 2025.09.26 16:33:49.268044 [ 965 ] {} <Trace> test_database.postgresql_replica_4 (938f7e82-5e0e-4f88-ad88-a5ac82b289c3): Trying to reserve 1.00 MiB using storage policy from min volume index 0 While the haystack was slightly earlier: 2025.09.26 16:33:49.267221 [ 965 ] {BgSchPool::2fe592b6-5cb1-45f2-becc-27ceea4a4e98} <Warning> PostgreSQLReplicaConsumer(postgres_database): Table postgresql_replica_1 is skipped from replication stream because its structure has changes. Please detach this table and reattach to resume the replication (relation id: 16619) This is a typical problem with test logging, let's simply increase look_behind_lines, but not in the test, but globally, to avoid any flakiness further, 100 or 10000 is not a big deal for `grep -F`
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Recently CI found the failure of test_postgresql_replica_database_engine 1, but, according to logs:
logs
But the test was not succeeded in locating this line.
I don't see any other reason except for line buffer for this to happen, so let's disable it with
stdbuf.Also I've added some introspection with
tee /dev/stderr, so that now we will see all entries that thegrepreceived.Changelog category (leave one):
Fixes: #86008