Commit efe536a

Merge master into fix-filter-pushdown-join-use-nulls-legacy
Parents: 26ba23c + c004e83

37 files changed

Lines changed: 553 additions & 236 deletions

.claude/skills/build/SKILL.md

Lines changed: 8 additions & 11 deletions
@@ -3,7 +3,7 @@ name: build
 description: Build ClickHouse with various configurations (Release, Debug, ASAN, TSAN, etc.). Use when the user wants to compile ClickHouse.
 argument-hint: "[build-type] [target] [options]"
 disable-model-invocation: false
-allowed-tools: Task, Bash(ninja:*), Bash(cd:*), Bash(ls:*), Bash(pgrep:*), Bash(ps:*), Bash(pkill:*), Bash(mktemp:*), Bash(sleep:*)
+allowed-tools: Task, Bash(ninja:*), Bash(cd:*), Bash(ls:*), Bash(pgrep:*), Bash(ps:*), Bash(pkill:*), Bash(sleep:*)
 ---

 # ClickHouse Build Skill
@@ -77,22 +77,19 @@ Build ClickHouse in `build` or `build_debug`, `build_asan`, `build_tsan`, `build

 2. **Create log file and start the build:**

-   **Step 2a: Create temporary log file first:**
-   ```bash
-   mktemp /tmp/build_clickhouse_XXXXXX.log
-   ```
-   - This will print the log file path
+   **Step 2a: Determine log file path:**
+   - Use `[build_directory]/build_output.log` as the log file path
    - **IMMEDIATELY report to the user:**
-     - "Build logs will be written to: [log file path]"
+     - "Build logs will be written to: `[build_directory]/build_output.log`"
    - Then display in a copyable code block:
    ```bash
-   tail -f [log file path]
+   tail -f [build_directory]/build_output.log
    ```
    - Example: "You can monitor the build in real-time with:" followed by the tail command in a code block

   **Step 2b: Start the ninja build:**
   ```bash
-  cd [build_directory] && ninja [target] > [log file path] 2>&1
+  cd [build_directory] && ninja [target] > build_output.log 2>&1
   ```
   Where `[build_directory]` is the path found in step 1a.

@@ -116,7 +113,7 @@ Build ClickHouse in `build` or `build_debug`, `build_asan`, `build_tsan`, `build
 **ALWAYS use Task tool to analyze results** (both success and failure):
 - Use Task tool with `subagent_type=general-purpose` to analyze the build output
 - **Pass the log file path from step 2a** to the Task agent - let it read the file directly
-- Example Task prompt: "Read and analyze the build output from: /tmp/build_clickhouse_abc123.log"
+- Example Task prompt: "Read and analyze the build output from: [build_directory]/build_output.log"
 - The Task agent should read the file and provide:

 **If build succeeds:**
@@ -246,5 +243,5 @@ Build ClickHouse in `build` or `build_debug`, `build_asan`, `build_tsan`, `build
 - **MANDATORY:** After successful builds, this skill MUST check for running ClickHouse servers and ask the user if they want to stop them to use the new build
 - **MANDATORY:** ALL build output (success or failure) MUST be analyzed by a Task agent with `subagent_type=general-purpose`
 - **MANDATORY:** ALWAYS provide a final summary to the user at the end of the skill execution (step 6)
-- **CRITICAL:** Build output is redirected to a unique log file created with `mktemp`. The log file path is reported to the user in a copyable format BEFORE starting the build, allowing real-time monitoring with `tail -f`. The log file path is saved from step 2a and passed to the Task agent for analysis. This keeps large build logs out of the main context.
+- **CRITICAL:** Build output is redirected to `build_output.log` inside the build directory. The log file path is reported to the user in a copyable format BEFORE starting the build, allowing real-time monitoring with `tail -f`. The log file path is saved from step 2a and passed to the Task agent for analysis. This keeps large build logs out of the main context.
 - **Subagents available:** Task tool is used to analyze all build output (by reading from output file) and provide concise summaries. Additional agents (Explore or general-purpose) can be used for deeper investigation of complex build errors

.claude/skills/test/SKILL.md

Lines changed: 14 additions & 20 deletions
@@ -3,7 +3,7 @@ name: test
 description: Run ClickHouse stateless or integration tests. Use when the user wants to run or execute tests.
 argument-hint: "[test-name] [--flags]"
 disable-model-invocation: false
-allowed-tools: Task, Bash(./tests/clickhouse-test:*), Bash(pgrep:*), Bash(./build/*/programs/clickhouse:*), Bash(./build*/programs/clickhouse:*), Bash(python:*), Bash(python3:*), Bash(mktemp:*), Bash(export:*), Bash(ls:*), Bash(test:*)
+allowed-tools: Task, Bash(./tests/clickhouse-test:*), Bash(pgrep:*), Bash(./build/*/programs/clickhouse:*), Bash(./build*/programs/clickhouse:*), Bash(python:*), Bash(python3:*), Bash(export:*), Bash(ls:*), Bash(test:*)
 ---

 # ClickHouse Test Runner Skill
@@ -125,23 +125,20 @@ The **build directory** is the path up to and including the parent of `programs/

 3. **Create log file and run the stateless test:**

-   **Step 3a: Create temporary log file first:**
-   ```bash
-   mktemp /tmp/test_clickhouse_XXXXXX.log
-   ```
-   - This will print the log file path
+   **Step 3a: Determine log file path:**
+   - Use `[build_directory]/test_output.log` as the log file path
    - **IMMEDIATELY report to the user:**
-     - "Test logs will be written to: [log file path]"
+     - "Test logs will be written to: `[build_directory]/test_output.log`"
    - Then display in a copyable code block:
    ```bash
-   tail -f [log file path]
+   tail -f [build_directory]/test_output.log
    ```
    - Example: "You can monitor the test progress in real-time with:" followed by the tail command in a code block

   **Step 3b: Start the stateless test:**
   ```bash
   # Add clickhouse binary to PATH using auto-detected build directory
-  export PATH="./[build_directory]/programs:$PATH" && ./tests/clickhouse-test <test_name> [flags] > [log file path] 2>&1
+  export PATH="./[build_directory]/programs:$PATH" && ./tests/clickhouse-test <test_name> [flags] > [build_directory]/test_output.log 2>&1
   ```
   Where `[build_directory]` is the path found during auto-detection.

@@ -173,22 +170,19 @@ The **build directory** is the path up to and including the parent of `programs/

 2. **Create log file and run the integration test:**

-   **Step 2a: Create temporary log file first:**
-   ```bash
-   mktemp /tmp/test_clickhouse_XXXXXX.log
-   ```
-   - This will print the log file path
+   **Step 2a: Determine log file path:**
+   - Use `[build_directory]/test_output.log` as the log file path
    - **IMMEDIATELY report to the user:**
-     - "Test logs will be written to: [log file path]"
+     - "Test logs will be written to: `[build_directory]/test_output.log`"
    - Then display in a copyable code block:
    ```bash
-   tail -f [log file path]
+   tail -f [build_directory]/test_output.log
    ```
    - Example: "You can monitor the test progress in real-time with:" followed by the tail command in a code block

   **Step 2b: Start the integration test with praktika:**
   ```bash
-  python -u -m ci.praktika run "integration" --test <test_name> [--path <absolute_binary_path>] > [log file path] 2>&1
+  python -u -m ci.praktika run "integration" --test <test_name> [--path <absolute_binary_path>] > [build_directory]/test_output.log 2>&1
   ```

 **Important:**
@@ -220,7 +214,7 @@ The **build directory** is the path up to and including the parent of `programs/
 **ALWAYS use Task tool to analyze results** (both pass and fail):
 - Use Task tool with `subagent_type=general-purpose` to analyze the test output
 - **Pass the log file path from step 3a** to the Task agent - let it read the file directly
-- Example Task prompt: "Read and analyze the test output from: /tmp/test_clickhouse_abc123.log"
+- Example Task prompt: "Read and analyze the test output from: [build_directory]/test_output.log"
 - The Task agent should read the file and provide:

 **If tests passed:**
@@ -273,7 +267,7 @@ The **build directory** is the path up to and including the parent of `programs/
 **ALWAYS use Task tool to analyze results** (both pass and fail):
 - Use Task tool with `subagent_type=general-purpose` to analyze the test output
 - **Pass the log file path from step 2a** to the Task agent - let it read the file directly
-- Example Task prompt: "Read and analyze the test output from: /tmp/test_clickhouse_abc123.log"
+- Example Task prompt: "Read and analyze the test output from: [build_directory]/test_output.log"
 - The Task agent should read the file and provide:

 **If tests passed:**
@@ -360,7 +354,7 @@ The test runner automatically detects and sets the necessary environment variabl
 - Test type is automatically detected based on name pattern or file location
 - **MANDATORY:** ALL test output (success or failure) MUST be analyzed by a Task agent with `subagent_type=general-purpose`
 - **MANDATORY:** For test failures, MUST prompt user if they want deeper analysis and use Task subagent if requested
-- **CRITICAL:** Test output is redirected to a unique log file created with `mktemp`. The log file path is reported to the user in a copyable format BEFORE starting the test, allowing real-time monitoring with `tail -f`. The log file path is saved and passed to the Task agent for analysis. This keeps large test logs out of the main context.
+- **CRITICAL:** Test output is redirected to `test_output.log` inside the build directory. The log file path is reported to the user in a copyable format BEFORE starting the test, allowing real-time monitoring with `tail -f`. The log file path is saved and passed to the Task agent for analysis. This keeps large test logs out of the main context.
 - **Subagents available:** Task tool is used to analyze all test output (by reading from log file) and provide concise summaries. Additional agents (Explore or general-purpose) are used for deeper investigation of test failures when user requests it

 ### Stateless Tests

.claude/tools/fetch_ci_report.js

Lines changed: 51 additions & 5 deletions
@@ -176,6 +176,13 @@ function constructJsonUrl(baseUrl, suffix, sha, taskName) {
   return `${baseUrl}/${suffix}/${encodeURIComponent(sha)}/result_${normalizedTask}.json`;
 }

+/**
+ * Check if a status represents a failure
+ */
+function isFailureStatus(status) {
+  return status === 'failed' || status === 'FAIL' || status === 'failure';
+}
+
 /**
  * Parse test results from the JSON data
  */
@@ -192,13 +199,23 @@ function parseTestResults(jsonData) {
       // Nested results
       extractTests(result.results, prefix ? `${prefix}/${result.name}` : result.name);
     } else {
-      // Leaf result - this is a test
+      // Leaf result - this is a test or build step
       const test = {
         name: prefix ? `${prefix}/${result.name}` : result.name,
         status: result.status || 'UNKNOWN',
         duration: result.duration || 0
       };

+      // Include info field (contains build log tail for build failures)
+      if (result.info) {
+        test.info = result.info;
+      }
+
+      // Include links from this result
+      if (result.links && result.links.length > 0) {
+        test.links = result.links;
+      }
+
       // Extract CIDB links from ext.hlabels
       if (result.ext && result.ext.hlabels) {
         const cidbLinks = [];
@@ -391,7 +408,7 @@ async function fetchReport(inputUrl, options = {}) {
   }

   const { testResults = [] } = result;
-  const failed = testResults.filter(t => t.status === 'failed' || t.status === 'FAIL');
+  const failed = testResults.filter(t => isFailureStatus(t.status));
   const passed = testResults.filter(t => t.status === 'success' || t.status === 'OK');
   const skipped = testResults.filter(t => t.status === 'skipped' || t.status === 'SKIPPED');

@@ -418,6 +435,20 @@ async function fetchReport(inputUrl, options = {}) {
           console.log(`  📊 CIDB: ${cidbLink}`);
         }
       }
+      if (test.links && test.links.length > 0) {
+        for (const link of test.links) {
+          console.log(`  🔗 ${link}`);
+        }
+      }
+      if (test.info) {
+        const lines = test.info.split('\n').filter(l => l.trim());
+        const tail = lines.slice(-30);
+        console.log('  --- log tail ---');
+        for (const line of tail) {
+          console.log(`  ${line}`);
+        }
+        console.log('  --- end ---');
+      }
     }
   }
   console.log();
@@ -516,7 +547,7 @@ async function fetchReport(inputUrl, options = {}) {
   // For multi-report mode, don't filter by failed here - we'll show all in summary
   if (options.failedOnly && !options.isSingleReport) {
     filteredResults = filteredResults.filter(t =>
-      t.status === 'failed' || t.status === 'FAIL'
+      isFailureStatus(t.status)
     );
   }

@@ -528,21 +559,36 @@ async function fetchReport(inputUrl, options = {}) {
   // Print results for standalone report
   console.log('=== Test Results ===\n');

-  const failed = filteredResults.filter(t => t.status === 'failed' || t.status === 'FAIL');
+  const failed = filteredResults.filter(t => isFailureStatus(t.status));
   const passed = filteredResults.filter(t => t.status === 'success' || t.status === 'OK');
   const skipped = filteredResults.filter(t => t.status === 'skipped' || t.status === 'SKIPPED');

   console.log(`Total: ${filteredResults.length} | ✅ Passed: ${passed.length} | ❌ Failed: ${failed.length} | ⏭️ Skipped: ${skipped.length}\n`);

   if (failed.length > 0) {
-    console.log('--- Failed Tests ---');
+    console.log('--- Failures ---');
     for (const test of failed) {
       console.log(`❌ FAIL ${test.name} (${test.duration}s)`);
       if (options.showCidb && test.cidbLinks && test.cidbLinks.length > 0) {
         for (const cidbLink of test.cidbLinks) {
           console.log(`  📊 CIDB: ${cidbLink}`);
         }
       }
+      if (test.links && test.links.length > 0) {
+        for (const link of test.links) {
+          console.log(`  🔗 ${link}`);
+        }
+      }
+      if (test.info) {
+        // Show last 30 non-empty lines of info (build log tail with actual errors)
+        const lines = test.info.split('\n').filter(l => l.trim());
+        const tail = lines.slice(-30);
+        console.log('  --- log tail ---');
+        for (const line of tail) {
+          console.log(`  ${line}`);
+        }
+        console.log('  --- end ---');
+      }
     }
     console.log('');
   }
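For reference, the two small pieces of logic this file change introduces (the broadened failure-status check and the "last 30 non-empty lines" log tail) can be sketched in Python, purely for illustration; the function names below mirror the JavaScript but are not part of the tool itself:

```python
def is_failure_status(status):
    # Mirrors isFailureStatus(): praktika reports spell failures several ways.
    return status in ("failed", "FAIL", "failure")

def log_tail(info, limit=30):
    # Mirrors the test.info printing: drop blank lines, keep the last `limit`.
    lines = [l for l in info.split("\n") if l.strip()]
    return lines[-limit:]
```

The point of both helpers is deduplication: the status comparison previously appeared in three separate `filter` calls, and the tail logic is shared between the per-report and standalone printing paths.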

.github/PULL_REQUEST_TEMPLATE.md

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@
 - Backward Incompatible Change
 - Build/Testing/Packaging Improvement
 - Documentation (changelog entry is not required)
-- Critical Bug Fix (crash, data loss, RBAC) or LOGICAL_ERROR
+- Critical Bug Fix (crash, data loss, RBAC)
 - Bug Fix (user-visible misbehavior in an official stable release)
 - CI Fix or Improvement (changelog entry is not required)
 - Not for changelog (changelog entry is not required)

.github/workflows/retry_infra_failures.yml

Lines changed: 56 additions & 19 deletions
@@ -49,6 +49,11 @@ jobs:
 run_url="https://github.com/$GH_REPO/actions/runs/$run_id"
 echo "Checking run $run_url ..."

+should_rerun=false
+
+# Fetch all job data once (reused for multiple checks below)
+jobs_raw=$(gh api "repos/$GH_REPO/actions/runs/$run_id/jobs?per_page=100" --paginate)
+
 # Collect per-job verdicts across all pages (runs can have >100 jobs).
 # Each failed job emits "true" (infrastructure) or "false" (real failure).
 # A job is considered an infrastructure failure if:
@@ -60,33 +65,65 @@ jobs:
 # - the "Run" step failed almost immediately (under 2 minutes), indicating
 #   a setup/download issue (e.g. missing S3 credentials) rather than a real
 #   test failure
-verdicts=$(gh api "repos/$GH_REPO/actions/runs/$run_id/jobs?per_page=100" \
-  --paginate --jq '
-  .jobs[] | select(.conclusion == "failure") |
-  if .name == "Config Workflow" or .name == "Finish Workflow" then empty
-  else
-    [.steps[] | select(.name == "Run")] |
-    if length == 0 then true
-    elif .[0].conclusion == "skipped" then true
-    elif .[0].conclusion == "failure" then
-      ((.[0].completed_at | fromdateiso8601) -
-       (.[0].started_at | fromdateiso8601)) < 120
-    else false
-    end
+verdicts=$(echo "$jobs_raw" | jq -r '
+  .jobs[] | select(.conclusion == "failure") |
+  if .name == "Config Workflow" or .name == "Finish Workflow" then empty
+  else
+    [.steps[] | select(.name == "Run")] |
+    if length == 0 then true
+    elif .[0].conclusion == "skipped" then true
+    elif .[0].conclusion == "failure" then
+      ((.[0].completed_at | fromdateiso8601) -
+       (.[0].started_at | fromdateiso8601)) < 120
+    else false
   end
-')
+  end
+')

 # Infrastructure failure = at least one failed job, and all of them are infra
 if [ -z "$verdicts" ]; then
-  is_infra=false
+  :
 elif echo "$verdicts" | grep -q "false"; then
-  is_infra=false
+  :
 else
-  is_infra=true
+  should_rerun=true
+  echo "  Infrastructure failure detected."
+fi
+
+# Check if "Config Workflow" failed in its "Run" step (e.g. due to
+# pr_labels_and_category.py rejecting the changelog category). If the PR
+# description was edited after the failure (same HEAD commit, so no new
+# workflow run was triggered), re-run to pick up the fix.
+if [ "$should_rerun" = "false" ]; then
+  config_failed_at=$(echo "$jobs_raw" | jq -r '
+    [.jobs[]
+     | select(.name == "Config Workflow" and .conclusion == "failure")
+     | .steps[] | select(.name == "Run" and .conclusion == "failure")
+     | .completed_at
+    ] | first // empty
+  ')
+
+  if [ -n "$config_failed_at" ]; then
+    run_data=$(gh api "repos/$GH_REPO/actions/runs/$run_id" \
+      --jq '{pr: .pull_requests[0].number, sha: .head_sha}')
+    pr_number=$(echo "$run_data" | jq -r '.pr // empty')
+    run_sha=$(echo "$run_data" | jq -r '.sha')
+
+    if [ -n "$pr_number" ]; then
+      pr_data=$(gh api "repos/$GH_REPO/pulls/$pr_number" \
+        --jq '{sha: .head.sha, updated: .updated_at}')
+      pr_sha=$(echo "$pr_data" | jq -r '.sha')
+      pr_updated=$(echo "$pr_data" | jq -r '.updated')
+
+      if [ "$run_sha" = "$pr_sha" ] && [[ "$pr_updated" > "$config_failed_at" ]]; then
+        should_rerun=true
+        echo "  Config Workflow failed but PR #$pr_number was updated after — rerunning."
+      fi
+    fi
+  fi
 fi

-if [ "$is_infra" = "true" ]; then
-  echo "  Infrastructure failure detected, rerunning..."
+if [ "$should_rerun" = "true" ]; then
   if gh run rerun "$run_id" --repo "$GH_REPO"; then
     rerun_count=$((rerun_count + 1))
     echo "  Triggered rerun: $run_url/attempts/2"
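The per-job verdict and aggregation rules in this workflow can be sketched in Python, as an illustration only (the real logic lives in the jq filter and shell above; `run_step` here stands for the job's "Run" step object from the GitHub jobs API):

```python
from datetime import datetime

def is_infra_failure(run_step):
    # None = the failed job had no "Run" step at all: infra failure.
    if run_step is None:
        return True
    # "Run" skipped = the job failed during setup: infra failure.
    if run_step["conclusion"] == "skipped":
        return True
    # "Run" failed in under 2 minutes = likely a setup/download issue.
    if run_step["conclusion"] == "failure":
        started = datetime.fromisoformat(run_step["started_at"])
        completed = datetime.fromisoformat(run_step["completed_at"])
        return (completed - started).total_seconds() < 120
    return False

def should_rerun(verdicts):
    # Rerun only if at least one job failed AND every failed job is infra.
    return len(verdicts) > 0 and all(verdicts)
```

Note the deliberate asymmetry: an empty verdict list (no failed jobs, or only Config/Finish Workflow failures) does not trigger a rerun, and a single "real" failure vetoes the whole run.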

ci/jobs/integration_test_job.py

Lines changed: 8 additions & 0 deletions
@@ -652,6 +652,14 @@ def main():

     failed_test_cases = []

+    # Clear dmesg to avoid false OOM detection from previous CI jobs on the same host.
+    # Do this only in CI (non-local runs) and via a non-interactive privileged helper.
+    if not info.is_local_run:
+        try:
+            Utils.clear_dmesg()
+        except Exception as ex:
+            print(f"Failed to clear dmesg before integration tests: {ex}")
+
     if parallel_test_modules:
         for attempt in range(module_repeat_cnt):
             log_file = f"{temp_path}/pytest_parallel.log"
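To illustrate the false positive this change avoids: CI inspects the kernel log for OOM-killer messages after a run, so kills left over from a previous job on the same host would otherwise be blamed on the current tests. The sketch below is hypothetical — `looks_like_oom` and the marker strings are illustrative, not the actual CI helper:

```python
# Hypothetical sketch of dmesg-based OOM detection. Without clearing dmesg
# first, any of these markers left by an earlier job on the shared host
# would make the current run look like it was OOM-killed.
OOM_MARKERS = ("Out of memory:", "oom-kill", "Killed process")

def looks_like_oom(dmesg_lines):
    return any(any(marker in line for marker in OOM_MARKERS)
               for line in dmesg_lines)
```

Clearing the buffer before the tests start means any marker found afterwards was produced by this job.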
