Skip to content

Fix infinite loop in skipPrefixBeforeHeader when file has fewer lines than skip_first_lines#101111

Open
alexey-milovidov wants to merge 11 commits intomasterfrom
fix/csv-tsv-skip-first-lines-infinite-loop
Open

Fix infinite loop in skipPrefixBeforeHeader when file has fewer lines than skip_first_lines#101111
alexey-milovidov wants to merge 11 commits intomasterfrom
fix/csv-tsv-skip-first-lines-infinite-loop

Conversation

@alexey-milovidov
Copy link
Copy Markdown
Member

When input_format_csv_skip_first_lines or input_format_tsv_skip_first_lines is set to a value larger than the number of lines in the file, skipPrefixBeforeHeader enters an infinite loop doing pread at EOF. Each call to readRow succeeds with an empty result even at EOF, so the loop never terminates.

Add an EOF check before each readRow call to break out of the loop when the file is exhausted.

Found via AST fuzzer in Stress test (amd_msan) on PR #100947.

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

…es than `skip_first_lines`

When `input_format_csv_skip_first_lines` or `input_format_tsv_skip_first_lines`
is set to a value larger than the number of lines in the file,
`skipPrefixBeforeHeader` enters an infinite loop doing `pread` at EOF.
Each call to `readRow` succeeds with an empty result even at EOF, so the
loop never terminates.

Add an EOF check before each `readRow` call to break out of the loop
when the file is exhausted.

Found via AST fuzzer in Stress test (amd_msan) on PR #100947.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Mar 29, 2026

Workflow [PR], commit [53dd622]

Summary:

job_name test_name status info comment
Stateless tests (amd_debug, parallel) failure
02845_join_on_cond_sparse FAIL cidb
Integration tests (amd_llvm_coverage, 2/5) failure
test_overcommit_tracker/test.py::test_user_overcommit FAIL cidb
Stress test (amd_msan) failure
Logical error: Shard number is greater than shard count: shard_num=A shard_count=B cluster=C (STID: 5066-457d) FAIL cidb
Stress test (arm_msan) failure
MemorySanitizer: use-of-uninitialized-value (STID: 2410-543a) FAIL cidb
Finish Workflow failure
python3 ./ci/jobs/scripts/workflow_hooks/pr_body_check.py failure

AI Review

Summary

This PR fixes a real infinite-loop risk in skipPrefixBeforeHeader for CSV/TSV when skip_first_lines exceeds the number of available lines. The added buf->eof guard before readRow in both readers is a direct, low-risk fix, and the new stateless test covers the reported EOF scenario for both formats. I did not find additional blocker/major issues in the changed code.

ClickHouse Rules
Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsChangesHistory.cpp
PR metadata quality
Safe rollout
Compilation time
Final Verdict
  • Status: ✅ Approve

@clickhouse-gh clickhouse-gh bot added the pr-bugfix Pull request with bugfix, not backported by default label Mar 29, 2026
@@ -0,0 +1,7 @@
-- Verify that skip_first_lines doesn't cause an infinite loop when the file has fewer lines than requested.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does not reproduce.

alexey-milovidov and others added 6 commits March 29, 2026 20:17
… loop

The `format()` function with inline data uses `ReadBufferFromMemory`,
which doesn't trigger the pread-based infinite loop. Use `file()` table
function instead, which reads via `ReadBufferFromFileDescriptorPRead`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
With `skip_first_lines = 100`, the loop over `readRow` at EOF completes
quickly even on unfixed builds.  Bumping to 1 billion makes the loop take
long enough for the test-runner timeout to detect the hang.

This addresses the bugfix-validation failure ("does not reproduce").
The .sql test does not reproduce the bug in bugfix validation because the
loop in `skipPrefixBeforeHeader` (1B iterations of a fast no-op at EOF)
completes within the test timeout, producing the same empty output on both
old and new code.

Convert to a .sh test that wraps the query in `timeout 10`. On unfixed
builds, the 1B-iteration loop takes much longer than 10 seconds, so
`timeout` kills the process with a non-zero exit code. On fixed builds,
the EOF check breaks the loop instantly and the query succeeds with exit
code 0.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Test number 04068 was already taken by `04068_constant_fold_union_intersect`
on master.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,20 @@
#!/usr/bin/env bash
# Tags: no-parallel
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# Tags: no-parallel looks unnecessary here and will reduce stateless test parallelism in CI. This test already uses ${CLICKHOUSE_DATABASE}-scoped file names, so it should be isolated across parallel runs.

Please remove the no-parallel tag unless there is a concrete cross-test interference that requires serialization.

alexey-milovidov and others added 4 commits March 30, 2026 15:45
The test uses `${CLICKHOUSE_DATABASE}`-scoped file names, so it is
already isolated across parallel runs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Mar 31, 2026

LLVM Coverage Report

Metric Baseline Current Δ
Lines 84.10% 84.00% -0.10%
Functions 90.90% 90.90% +0.00%
Branches 76.70% 76.60% -0.10%

Changed lines: 100.00% (16/16) · Uncovered code

Full report · Diff report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-bugfix Pull request with bugfix, not backported by default

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant