
Commit fbbec70

Merge remote-tracking branch 'upstream/master' into dot-issue
2 parents c46efbb + 06d048d commit fbbec70

437 files changed

Lines changed: 21015 additions & 11893 deletions


.claude/instructions.md

Lines changed: 50 additions & 0 deletions
New file (50 lines added):

# ClickHouse Development Instructions

## Running Stateless Tests

Stateless tests are located in `tests/queries/0_stateless/`.

### Prerequisites

1. Build ClickHouse: `cd build && ninja clickhouse-server`
2. Start the server: `./build/programs/clickhouse server --config-file ./programs/server/config.xml`
3. Wait for server to be ready: `./build/programs/clickhouse client -q "SELECT 1"`

### Running Tests

Run tests with the correct port environment variables (default config uses TCP=9000, HTTP=8123):

```bash
CLICKHOUSE_PORT_TCP=9000 CLICKHOUSE_PORT_HTTP=8123 ./tests/clickhouse-test <test_name>
```

### Useful Flags

- `--no-random-settings` - Disable settings randomization (useful for deterministic debugging)
- `--no-random-merge-tree-settings` - Disable MergeTree settings randomization
- `--record` - Automatically update `.reference` files when stdout differs

### Test File Extensions

- `.sql` - SQL test (most common)
- `.sql.j2` - Jinja2-templated SQL test
- `.sh` - Shell script test
- `.py` - Python test
- `.expect` - Expect script test
- `.reference` - Expected output (compared against stdout)
- `.gen.reference` - Generated reference for `.j2` tests

### Database Name Normalization

The test runner creates a temporary database with a random name (e.g., `test_abc123`) for each test.
After test execution, the random database name is replaced with `default` in stdout/stderr files before comparison with `.reference`.
This means `.reference` files should use `default` for database names, NOT `${CLICKHOUSE_DATABASE}` or the actual random name.

### Test Tags

Tests can have tags in the first line as a comment: `-- Tags: no-fasttest, no-parallel`
Common tags: `disabled`, `no-fasttest`, `no-parallel`, `no-random-settings`, `no-random-merge-tree-settings`, `long`

### Random Settings Limits

Tests can specify limits for randomized settings: `-- Random settings limits: max_threads=(1, 4); ...`

### Stopping the Server

Find and kill the server process:

```bash
pgrep -f "clickhouse server" # Get PIDs
kill <pid1> <pid2> # Stop processes
```
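The database-name normalization step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and the plain substring replacement are mine, not the actual test-runner implementation.

```python
def normalize_database_name(output: str, random_db: str) -> str:
    """Replace the per-test random database name (e.g. test_abc123)
    with 'default' so that stdout can be compared against .reference."""
    return output.replace(random_db, "default")


stdout = "CREATE TABLE test_abc123.events (x UInt8)"
print(normalize_database_name(stdout, "test_abc123"))
# CREATE TABLE default.events (x UInt8)
```

This is why `.reference` files must contain `default` rather than `${CLICKHOUSE_DATABASE}`: the comparison happens after the substitution.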

.github/copilot-instructions.md

Lines changed: 46 additions & 0 deletions
@@ -211,3 +211,49 @@ STYLE & CONDUCT
 - Avoid changing scope: review what’s in the PR; suggest follow-ups separately.
 - If you are not reasonably confident a finding is a real issue or meaningful risk, **do not mention it**.
 - When performing a code review, **ignore `/.github/workflows/*` files**.
+
+RUNNING STATELESS TESTS
+
+Stateless tests are located in `tests/queries/0_stateless/`.
+
+**Prerequisites:**
+1. Build ClickHouse: `cd build && ninja clickhouse-server`
+2. Start the server: `./build/programs/clickhouse server --config-file ./programs/server/config.xml`
+3. Wait for server to be ready: `./build/programs/clickhouse client -q "SELECT 1"`
+
+**Running tests** (default config uses TCP=9000, HTTP=8123):
+```bash
+CLICKHOUSE_PORT_TCP=9000 CLICKHOUSE_PORT_HTTP=8123 ./tests/clickhouse-test <test_name>
+```
+
+**Useful flags:**
+- `--no-random-settings` - Disable settings randomization (useful for deterministic debugging)
+- `--no-random-merge-tree-settings` - Disable MergeTree settings randomization
+- `--record` - Automatically update `.reference` files when stdout differs
+
+**Test file extensions:**
+- `.sql` - SQL test (most common)
+- `.sql.j2` - Jinja2-templated SQL test
+- `.sh` - Shell script test
+- `.py` - Python test
+- `.expect` - Expect script test
+- `.reference` - Expected output (compared against stdout)
+- `.gen.reference` - Generated reference for `.j2` tests
+
+**Database name normalization:**
+The test runner creates a temporary database with a random name (e.g., `test_abc123`) for each test.
+After test execution, the random database name is replaced with `default` in stdout/stderr files before comparison with `.reference`.
+This means `.reference` files should use `default` for database names, NOT `${CLICKHOUSE_DATABASE}` or the actual random name.
+
+**Test tags:**
+Tests can have tags in the first line as a comment: `-- Tags: no-fasttest, no-parallel`
+Common tags: `disabled`, `no-fasttest`, `no-parallel`, `no-random-settings`, `no-random-merge-tree-settings`, `long`
+
+**Random settings limits:**
+Tests can specify limits for randomized settings: `-- Random settings limits: max_threads=(1, 4); ...`
+
+**Stopping the server:**
+```bash
+pgrep -f "clickhouse server" # Get PIDs
+kill <pid1> <pid2> # Stop processes
+```

base/poco/Net/src/TCPServerDispatcher.cpp

Lines changed: 2 additions & 2 deletions
@@ -111,8 +111,8 @@ void TCPServerDispatcher::run()
         if (!_stopped)
         {
             std::unique_ptr<TCPServerConnection> pConnection(_pConnectionFactory->createConnection(pCNf->socket()));
-            poco_check_ptr(pConnection.get());
-            pConnection->start();
+            if (pConnection)
+                pConnection->start();
         }
         /// endConnection() should be called after destroying TCPServerConnection,
         /// otherwise currentConnections() could become zero while some connections are yet still alive.

ci/jobs/functional_tests.py

Lines changed: 30 additions & 20 deletions
@@ -284,7 +284,9 @@ def main():
        stages.remove(JobStages.COLLECT_COVERAGE)
    else:
        stages.remove(JobStages.COLLECT_LOGS)
-    if is_coverage or info.is_local_run:
+    if is_coverage or info.is_local_run or is_bugfix_validation:
+        # For bugfix validation, we intentionally skip the check error stage (checks FATAL messages):
+        # regular test failures are assumed to be sufficient to validate the test
        stages.remove(JobStages.CHECK_ERRORS)
    if info.is_local_run:
        if JobStages.COLLECT_LOGS in stages:
@@ -542,7 +544,11 @@ def start():

    if JobStages.RETRIES in stages and test_result and test_result.is_failure():
        # retry all failed tests and mark original failed either as success on retry or failed on retry
-        failed_tests = [t.name for t in test_result.results if t.is_failure()]
+        failed_tests = [
+            t.name
+            for t in test_result.results
+            if t.is_failure() and t.name and t.name[0].isdigit()
+        ]
        if len(failed_tests) > 10:
            results.append(
                Result(
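The tightened retry filter above keeps only entries whose names look like real stateless tests, which start with a digit (e.g. `00001_select_1`), and skips synthetic entries. A minimal sketch with a hypothetical stand-in result type (`FakeResult` is mine, not the CI `Result` class):

```python
from dataclasses import dataclass


@dataclass
class FakeResult:
    """Hypothetical stand-in for a CI test-result entry."""
    name: str
    failed: bool

    def is_failure(self) -> bool:
        return self.failed


results = [
    FakeResult("00001_select_1", True),     # real test name -> retried
    FakeResult("Check errors", True),       # helper step -> not retried
    FakeResult("02500_json_parse", False),  # passed -> not retried
]

failed_tests = [
    r.name for r in results if r.is_failure() and r.name and r.name[0].isdigit()
]
print(failed_tests)
# ['00001_select_1']
```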
@@ -620,24 +626,28 @@ def start():
            test_result.extend_sub_results(results[-1].results)
            results[-1].results = []

-        # invert result status for bugfix validation
-        if is_bugfix_validation:
-            has_failure = False
-            for r in results[-1].results:
-                r.set_label("xfail")
-                if r.status == Result.StatusExtended.FAIL:
-                    r.status = Result.StatusExtended.OK
-                    has_failure = True
-                elif r.status == Result.StatusExtended.OK:
-                    r.status = Result.StatusExtended.FAIL
-            if not has_failure:
-                print("Failed to reproduce the bug")
-                results[-1].set_failed().set_info("Failed to reproduce the bug")
-            else:
-                results[-1].set_success()
-
-        if not results[-1].is_ok():
-            results[-1].set_info("Found errors added into Tests results")
+        # invert result status for bugfix validation
+        if is_bugfix_validation and test_result:
+            has_failure = False
+            for r in test_result.results:
+                r.set_label("xfail")
+                if r.status == Result.StatusExtended.FAIL:
+                    r.status = Result.StatusExtended.OK
+                    has_failure = True
+                elif r.status == Result.StatusExtended.OK:
+                    r.status = Result.StatusExtended.FAIL
+            if not has_failure:
+                print("Failed to reproduce the bug")
+                test_result.set_failed().set_info("Failed to reproduce the bug")
+            else:
+                # For bugfix validation, the expected behavior is:
+                # - At least one test must fail (bug reproduced)
+                # - The overall Tests result is treated as success in that case
+                test_result.set_success()
+
+            # For bugfix validation, "Check errors" (latest in the list) is only a helper step and
+            # must not affect the overall job result.
+            results[-1].set_success()

    if JobStages.COLLECT_LOGS in stages:
        print("Collect logs")
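The status-inversion logic for bugfix validation can be modeled in isolation. This sketch uses plain strings instead of `Result.StatusExtended` (an assumption for illustration, not the CI code):

```python
OK, FAIL = "OK", "FAIL"


def invert_for_bugfix_validation(statuses):
    """A FAIL proves the bug was reproduced, so it becomes OK; an
    unexpected OK becomes FAIL. Returns the inverted statuses and
    whether at least one original failure (a reproduction) was seen."""
    inverted = [OK if s == FAIL else FAIL if s == OK else s for s in statuses]
    reproduced = FAIL in statuses
    return inverted, reproduced


inverted, reproduced = invert_for_bugfix_validation([OK, FAIL, OK])
print(inverted, reproduced)
# ['FAIL', 'OK', 'FAIL'] True
```

If no test fails, `reproduced` is false and the job reports "Failed to reproduce the bug", matching the diff above.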

ci/jobs/integration_test_job.py

Lines changed: 1 addition & 1 deletion
@@ -445,7 +445,7 @@ def main():
            has_error = True
            error_info.append(test_result_sequential.info)

-    # Collect logs before rerun
+    # Collect logs before re-run
    attached_files = []
    if not info.is_local_run:
        failed_suits = []

ci/jobs/scripts/workflow_hooks/filter_job.py

Lines changed: 1 addition & 0 deletions
@@ -56,6 +56,7 @@ def should_skip_job(job_name):
    global _info_cache
    if _info_cache is None:
        _info_cache = Info()
+        print(f"INFO: PR labels: {_info_cache.pr_labels}")

    changed_files = _info_cache.get_kv_data("changed_files")
    if not changed_files:
Lines changed: 18 additions & 0 deletions
New file (18 lines added):

import re
from ci.praktika.info import Info

if __name__ == "__main__":
    info = Info()
    if info.pr_number == 0:
        # Extract original PR number from backport merge commits
        # Example: "Merge pull request #92596 from ClickHouse/backport/25.12/92538" -> extract 92538
        commit_message = info.commit_message
        match = re.search(r"backport/[^/]+/(\d+)", commit_message)
        if match:
            try:
                pr_number = int(match.group(1))
                info.set_parent_pr_number(pr_number)
            except ValueError as e:
                print(
                    f"Failed to get PR number from commit message [{commit_message}]: {e}"
                )
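The regular expression in the script above can be exercised directly; the example commit message is the one given in the script's own comment:

```python
import re

# The backport branch pattern from the new CI helper script
pattern = re.compile(r"backport/[^/]+/(\d+)")

msg = "Merge pull request #92596 from ClickHouse/backport/25.12/92538"
match = pattern.search(msg)
print(match.group(1) if match else None)
# 92538

# A non-backport merge commit yields no match, so no parent PR number is set
assert pattern.search("Merge branch 'master' into dot-issue") is None
```

`[^/]+` absorbs the release branch component (`25.12` here), and the capture group takes the trailing digits, the original PR number.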

ci/praktika/_environment.py

Lines changed: 20 additions & 16 deletions
@@ -43,7 +43,6 @@ class _Environment(MetaClasses.Serializable):
    JOB_CONFIG: Optional[Job.Config] = None
    TRACEBACKS: List[str] = dataclasses.field(default_factory=list)
    WORKFLOW_JOB_DATA: Dict[str, Any] = dataclasses.field(default_factory=dict)
-    WORKFLOW_STATUS_DATA: Dict[str, Any] = dataclasses.field(default_factory=dict)
    JOB_KV_DATA: Dict[str, Any] = dataclasses.field(default_factory=dict)
    COMMIT_AUTHORS: List[str] = dataclasses.field(default_factory=list)
    WORKFLOW_CONFIG: Optional[Dict[str, Any]] = None
@@ -72,17 +71,14 @@ def from_env(cls) -> "_Environment":
        EVENT_TIME = ""
        COMMIT_MESSAGE = ""

-        assert Path(
-            Settings.WORKFLOW_JOB_FILE
-        ).is_file(), f"File not found: {Settings.WORKFLOW_JOB_FILE}"
-        with open(Settings.WORKFLOW_JOB_FILE, "r", encoding="utf8") as f:
-            WORKFLOW_JOB_DATA = json.load(f)
-
-        assert Path(
-            Settings.WORKFLOW_STATUS_FILE
-        ).is_file(), f"File not found: {Settings.WORKFLOW_STATUS_FILE}"
-        with open(Settings.WORKFLOW_STATUS_FILE, "r", encoding="utf8") as f:
-            WORKFLOW_STATUS_DATA = json.load(f)
+        if Path(Settings.WORKFLOW_JOB_FILE).is_file():
+            with open(Settings.WORKFLOW_JOB_FILE, "r", encoding="utf8") as f:
+                WORKFLOW_JOB_DATA = json.load(f)
+        else:
+            print(
+                f"NOTE: Workflow job file [{Settings.WORKFLOW_JOB_FILE}] does not exist"
+            )
+            WORKFLOW_JOB_DATA = {}

        if EVENT_FILE_PATH:
            with open(EVENT_FILE_PATH, "r", encoding="utf-8") as f:
@@ -238,7 +234,6 @@ def from_env(cls) -> "_Environment":
                "parent_pr_number": LINKED_PR_NUMBER,
            },
            WORKFLOW_JOB_DATA=WORKFLOW_JOB_DATA,
-            WORKFLOW_STATUS_DATA=WORKFLOW_STATUS_DATA,
            WORKFLOW_CONFIG=None,
        )

@@ -281,9 +276,18 @@ def get(cls):
        if Path(cls.file_name_static()).is_file():
            return cls.from_fs("environment")
        else:
-            env = cls.from_workflow_data()
-            env.dump()
-            return env
+            try:
+                env = cls.from_workflow_data()
+                env.dump()
+                return env
+            except FileNotFoundError as e:
+                # For workflows without Config job
+                print(
+                    f"NOTE: Workflow context file [{Settings.WORKFLOW_STATUS_FILE}] does not exist - read context from GH event"
+                )
+                env = cls.from_env()
+                env.dump()
+                return env

    def set_job_name(self, job_name):
        self.JOB_NAME = job_name
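The change above replaces hard `assert ... is_file()` checks with a lenient fallback. The same pattern in a self-contained sketch (the function name is mine, not the praktika API):

```python
import json
from pathlib import Path


def load_workflow_job_data(path: str) -> dict:
    """Return the parsed JSON if the file exists, otherwise an empty
    dict - mirroring the change from an assertion to a guarded read."""
    p = Path(path)
    if p.is_file():
        with open(p, "r", encoding="utf8") as f:
            return json.load(f)
    print(f"NOTE: Workflow job file [{path}] does not exist")
    return {}


print(load_workflow_job_data("/nonexistent/workflow_job.json"))
# NOTE: Workflow job file [/nonexistent/workflow_job.json] does not exist
# {}
```

Downstream code must then tolerate an empty dict instead of assuming the key is present, which is exactly what the `get_job_url` guard in `ci/praktika/info.py` below does.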

ci/praktika/info.py

Lines changed: 2 additions & 0 deletions
@@ -163,6 +163,8 @@ def get_secret(self, name):
        return self.workflow.get_secret(name)

    def get_job_url(self):
+        if not self.env.WORKFLOW_JOB_DATA:
+            return ""
        return f"{self.env.RUN_URL}/job/{self.env.WORKFLOW_JOB_DATA['check_run_id']}"

    def get_job_report_url(self, latest=False):
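The guard can be illustrated standalone; this is a hypothetical free-function version of the method, with a made-up URL:

```python
def get_job_url(run_url: str, workflow_job_data: dict) -> str:
    """Without workflow job data there is no check_run_id, so return
    an empty URL instead of raising KeyError on the missing key."""
    if not workflow_job_data:
        return ""
    return f"{run_url}/job/{workflow_job_data['check_run_id']}"


print(get_job_url("https://example.com/run/1", {"check_run_id": 42}))
# https://example.com/run/1/job/42
```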

ci/praktika/native_jobs.py

Lines changed: 19 additions & 15 deletions
@@ -304,6 +304,25 @@ def _check_db(workflow):
        )
    env.dump()

+    _GH_Auth(workflow)
+
+    # refresh PR data
+    if env.PR_NUMBER > 0:
+        title, body, labels = GH.get_pr_title_body_labels()
+        print(f"NOTE: PR title: {title}")
+        print(f"NOTE: PR labels: {labels}")
+        if title:
+            if title != env.PR_TITLE:
+                print("PR title has been changed")
+                env.PR_TITLE = title
+            if env.PR_BODY != body:
+                print("PR body has been changed")
+                env.PR_BODY = body
+            if env.PR_LABELS != labels:
+                print("PR labels have been changed")
+                env.PR_LABELS = labels
+            env.dump()
+
    if workflow.enable_report:
        print("Push pending CI report")
        HtmlRunnerHooks.push_pending_ci_report(workflow)
@@ -511,21 +530,6 @@ def check_affected_jobs():
            )
        )

-    # refresh PR data
-    if env.PR_NUMBER > 0:
-        title, body, labels = GH.get_pr_title_body_labels()
-        if title:
-            if title != env.PR_TITLE:
-                print("PR title has been changed")
-                env.PR_TITLE = title
-            if env.PR_BODY != body:
-                print("PR body has been changed")
-                env.PR_BODY = body
-            if env.PR_LABELS != labels:
-                print("PR labels have been changed")
-                env.PR_LABELS = labels
-            env.dump()
-
    if workflow.enable_slack_feed:
        if env.PR_NUMBER:
            commit_authors = set()
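The relocated refresh block compares cached PR metadata against the values just fetched from GitHub and overwrites what changed. A hypothetical sketch using a plain dict in place of the env object (names are mine, for illustration only):

```python
def refresh_pr_metadata(env: dict, title: str, body: str, labels: list) -> list:
    """Overwrite cached PR metadata with freshly fetched values and
    report which fields changed. An empty title is treated as a failed
    fetch, so the cached data is kept untouched."""
    changed = []
    if title:
        if title != env["PR_TITLE"]:
            env["PR_TITLE"] = title
            changed.append("title")
        if body != env["PR_BODY"]:
            env["PR_BODY"] = body
            changed.append("body")
        if labels != env["PR_LABELS"]:
            env["PR_LABELS"] = labels
            changed.append("labels")
    return changed


env = {"PR_TITLE": "old", "PR_BODY": "b", "PR_LABELS": ["ci"]}
print(refresh_pr_metadata(env, "new title", "b", ["ci", "backport"]))
# ['title', 'labels']
```

Moving this refresh into `_check_db` (run earlier in the workflow) means later jobs see up-to-date PR labels, which is what the label-based job filtering relies on.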
