Add server UUID for disks access checks (read/read-by-offset/write/delete) to avoid possible races #43143
Conversation
It should be added to all disk directories, not only s3.
You mean the Azure disk, all object storages, or all disks? What do you have in mind? P.S. for now there are only checks for S3 and Azure.
All disks.
force-pushed from 28b6ec7 to b678097
force-pushed from 46fc609 to a0deb94
force-pushed from 3926691 to 1c751fb
Failures unrelated:
And something with MergeTree
Otherwise, if you are lucky enough, a race condition is possible, and you can get errors because one server already removed the file while another was trying to read it. But it was possible only for:
- batch delete checks for the "s3" disk
- all checks for the "s3_plain" disk, since this disk does not use random names

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Previously we had such access checks (read/read-by-offset/write/delete) only for the S3 and Azure disks; this patch adds the check for all disks.

Also I've added the disk name to the IDisk interface, since it is required for the error message in IDisk::checkAccess(). But I had to add DiskEncrypted::encrypted_name, because DiskEncrypted inherits from DiskDecorator, not from IDisk, and so has no way to set the disk name (DiskEncrypted could pass the disk name to DiskDecorator, but it is not used anywhere, and besides, this would require duplicating the name for every user of DiskDecorator).

Also, from now on skip_access_check makes sense for every disk, plus startup() is now called for each disk (before, it was missed for some of them). And I've added skip_access_check as a member of DiskRestartProxy, since it calls startup() on restart().

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
This will fix clickhouse-disks.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
stacktrace:
contrib/libcxx/src/condition_variable.cpp:47::std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&)
contrib/libcxx/src/shared_mutex.cpp:65::std::__1::shared_timed_mutex::lock_shared()
src/Disks/DiskRestartProxy.cpp:229::DB::DiskRestartProxy::writeFile(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, unsigned long, DB::WriteMode, DB::WriteSettings const&)
src/Disks/IDisk.cpp:0::DB::IDisk::checkAccessImpl(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&)
contrib/libcxx/include/string:1499::DB::IDisk::checkAccess()
src/Disks/IDisk.cpp:0::DB::IDisk::startup(std::__1::shared_ptr<DB::Context const>, bool)
src/Disks/DiskRestartProxy.cpp:375::DB::DiskRestartProxy::restart(std::__1::shared_ptr<DB::Context const>)
contrib/libcxx/include/__memory/shared_ptr.h:701::DB::InterpreterSystemQuery::restartDisk(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&)
src/Interpreters/InterpreterSystemQuery.cpp:508::DB::InterpreterSystemQuery::execute()
src/Interpreters/executeQuery.cpp:0::DB::executeQueryImpl(char const*, char const*, std::__1::shared_ptr<DB::Context>, bool, DB::QueryProcessingStage::Enum, DB::ReadBuffer*)
src/Interpreters/executeQuery.cpp:1083::DB::executeQuery(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::shared_ptr<DB::Context>, bool, DB::QueryProcessingStage::Enum)
src/Server/TCPHandler.cpp:0::DB::TCPHandler::runImpl()
src/Server/TCPHandler.cpp:1904::DB::TCPHandler::run()
contrib/poco/Net/src/TCPServerConnection.cpp:57::Poco::Net::TCPServerConnection::start()
contrib/libcxx/include/__memory/unique_ptr.h:48::Poco::Net::TCPServerDispatcher::run()
contrib/poco/Foundation/src/ThreadPool.cpp:213::Poco::PooledThread::run()
contrib/poco/Foundation/include/Poco/SharedPtr.h:156::Poco::ThreadImpl::runnableEntry(void*)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
…r.py) Right now cluster.py first creates HDFS and then creates clickhouse in one API call, so you cannot interact and add missing mkdirs for clickhouse; fix this by using the root dir where possible.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
But it wasn't a problem before, since it was not possible to have readonly==true.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
force-pushed from 1c751fb to c393af2
Test failures look unrelated:
Looks like slow parts removal, see also #43277 (comment)
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Add a server UUID to disk access checks (read/read-by-offset/write/delete) to avoid possible races. Also, access checks are now done for all disks, not only S3/Azure.
Otherwise, if you are lucky enough, a race condition is possible, and you can get errors because one server already removed the file while another was trying to read it.
But it was possible only for:
Fixes: #37791