Make ClickHouse behaviour on multiple read from pipe more understandable#25960
Make ClickHouse behaviour on multiple read from pipe more understandable#25960alexey-milovidov merged 23 commits intoClickHouse:masterfrom
Conversation
azat
left a comment
There was a problem hiding this comment.
Consider adding a test, smth like:
clickhouse local --structure 'key String' -q 'select * from table; select * from table;' <<<foo # fails with pipeline error
echo foo > /tmp/foo
./clickhouse local --structure 'key String' -q 'select * from table; select * from table;' --file /tmp/foo # ok
mkfifo /tmp/pipe
echo foo > /tmp/pipe &
./clickhouse local --structure 'key String' -q 'select * from table; select * from table;' --file /tmp/pipe # hangs because it opens via path, and it will hang for the second open of pipe, since there is no write endpoint.
Also right now the following does not works, while it should:
./clickhouse local --structure 'key String' -q 'select * from table; select * from table;' --file <(echo foo)And the following does not work because of additional check in ITableFunctionFileLike::parseArguments
./clickhouse local --stacktrace -q "select * from file(0, 'TSV', 'key String');" <<<foo|
Tried to reproduce the problem with
As it turns out |
Right.
Actually thinking about this a little bit more, and it should not accept custom fd, since this a security vulnerability (since any user may read from any file that the process keeps open) |
Co-authored-by: Azat Khuzhin <a3at.mail@gmail.com>
46787fa to
a34c23b
Compare
| # shellcheck source=../shell_config.sh | ||
| . "$CURDIR"/../shell_config.sh | ||
|
|
||
| SAMPLE_FILE="$CURDIR/01947_sample_data.csv" |
There was a problem hiding this comment.
Because it uses one file, it is not possible to run flaky check (runs tests in parallel multiple times) - https://clickhouse-test-reports.s3.yandex.net/25960/bab7681bc18fbaf7df49737ca72cfe70756a5f62/functional_stateless_tests_flaky_check_(address)/test_run.txt.out.log
To fix this, you can simply use mktemp, i.e. mktemp --tmpdir "$CURDIR" 01947_multiple_pipe_read_saple_data_XXXXXX.sh (completely untested, but you can check it using clickhouse-test --test-runs 100 01947_multiple_pipe_read. --jobs 8)
| . "$CURDIR"/../shell_config.sh | ||
|
|
||
| SAMPLE_FILE="$CURDIR/01947_sample_data.csv" | ||
| STD_ERROR_CAPTURED="$CURDIR/01947_std_error_captured.log" |
src/Storages/StorageFile.cpp
Outdated
| else | ||
| { | ||
| // else if fd is not a regular file, then throw an Exception | ||
| throw Exception("Cannot read from a pipeline twice", ErrorCodes::CANNOT_READ_FROM_FILE_DESCRIPTOR); |
There was a problem hiding this comment.
BTW why do you call this pipeline not pipe?
There was a problem hiding this comment.
Changed error message
|
|
||
| echo '******************' | ||
| echo 'Attempt to read twice from a pipeline' | ||
| ${CLICKHOUSE_LOCAL} --structure 'key String' -q 'select * from table; select * from table;' <<<foo |
There was a problem hiding this comment.
Also it is better not to use this in tests, since the behavior is different in different versions on bash:
$ echo -n "$BASH_VERSION: "; test -p /dev/fd/0 <<<foo && echo pipe || echo regular
5.1.8(1)-release: pipe
$ docker run --rm -it ubuntu:20.04 bash -c 'echo -n "$BASH_VERSION: "; test -p /dev/fd/0 <<<foo && echo pipe || echo regular'
5.0.17(1)-release: regular
You already have test with pipe (echo 1 | clickhouse-local), this is enough.
There was a problem hiding this comment.
Removed that test
Merging #25960 (Bolonini/read_from_file)
|
Merged in #26777. |
I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
If file descriptor in
Filetable is regular file - allow to read multiple times from it. It allowsclickhouse-localto read multiple times from stdin (with multiple SELECT queries or subqueries) if stdin is a regular file likeclickhouse-local --query "SELECT * FROM table UNION ALL SELECT * FROM table" ... < file. This closes #11124. Co-authored with @alexey-milovidov.