Conversation
|
This is an automated comment for commit c6aa12f with description of existing statuses. It's updated for the latest CI running ✅ Click here to open a full report in a separate page Successful checks
|
…se into s3-streams-scheduler
There was a problem hiding this comment.
It is just spaces because I ended up not changing this file. We can revert the changes here if necessary.
| }; | ||
|
|
||
| /// Scoped attach/detach of IO resource links | ||
| struct IOScope : private boost::noncopyable |
There was a problem hiding this comment.
I think we will use this scope eventually for all IO operations including local, hence the name.
There was a problem hiding this comment.
Do we haves cases with only one type of resources attached? Or they are used together always?
There was a problem hiding this comment.
For local IO it could be only read or write.
| } | ||
|
|
||
| ResourceLink link; | ||
| ResourceGuard::Request request; |
There was a problem hiding this comment.
Technically we could have used TLS request as ResourceGuard does and avoid any allocations, but it feels just safer and more straightforward to me.
|
|
||
| // Adjust budget to account for extra consumption of `cost` resource units | ||
| void consumeBudget(ResourceCost cost) | ||
| { | ||
| adjustBudget(0, cost); | ||
| } | ||
|
|
||
| // Adjust budget to account for requested, but not consumed `cost` resource units | ||
| void accumulateBudget(ResourceCost cost) | ||
| { | ||
| adjustBudget(cost, 0); | ||
| } | ||
|
|
||
| /// Enqueue new request to be executed using underlying resource. |
There was a problem hiding this comment.
It is not needed anymore, now logic integrated into ResourceGuard
| chassert(state == Dequeued); | ||
| state = Finished; | ||
| if (estimated_cost != real_cost_) | ||
| link_.queue->adjustBudget(estimated_cost, real_cost_); |
There was a problem hiding this comment.
There was a bug. All ResourceGuard were used in the wrong way passing request->cost into adjustBudget() but enqueueRequestUsingBudget changes request->cost before enqueing the request, so the correct initial estimated cost was not stored anywhere, but it is required for correct adjustment of the budget. Now we store it in ResourceGuard and it's usage is much simpler and less error-prone.
|
|
||
| ResourceCost cost = request.GetContentLength(); | ||
| ResourceGuard rlock(write_settings.resource_link, cost); | ||
| CurrentThread::IOScope io_scope(write_settings.io_scheduling); |
There was a problem hiding this comment.
Here you are controlling only particular current tread with the exact call. In other words you enable scheduling only for PUT (single object + multi parts) and GET (get object).
Also you manually set resources guards for Azure and HDFS.
But what about the rest? I expected that you cover here HTTP storages as well with other types of requests.
There was a problem hiding this comment.
BTW, do you intend to control communication with clickhouse disks or with (S3|Azure|Web)Functions as well?
There was a problem hiding this comment.
Yes, it should cover all remote and local IO eventually. In this PR I'm just trying to improve what we already have. Let's also do the following in a separate PR:
- move
CurrentThread::IOScopeinto ObjectStorage - make IOScope inheritable when spawning new threads
- get rid of
ReadSettingsandWriteSettingsfields for passingio_schedulingfield
It should make the process of integration other buffers and scheduler more straightforward
|
@maxknv Can we help build a check to finish? How can I find out what when wrong? |
package_aarch64 failed with OOM. - I suggest restarting it manually |
|
Looks like we have data race here now https://pastila.nl/?000777c6/1c25eebbf78952d5ed8d7c42cbcca009#or9ialwCHfjZi17MNTG5Hg== |
| } | ||
| response_stream = nullptr; | ||
| Session::setSendDataHooks(); | ||
| Session::setReceiveDataHooks(); |
There was a problem hiding this comment.
This reset lead to a data race: https://pastila.nl/?000777c6/1c25eebbf78952d5ed8d7c42cbcca009#or9ialwCHfjZi17MNTG5Hg==
So we cannot reset it here, because session could be still in use for some reason. I think I should return to the logic I used a few revisions earlier: reset hooks when the connection is reassigned to another request in sendRequest().
There was a problem hiding this comment.
It turned out to be correct. Data race was inside ResourceGuard
|
|
data race is not fixed. it just moved few lines below to a different location https://pastila.nl/?000d4920/623b6fe76e20a416a7c021a625ec3d4b#BDtMc1k+7LXEcMezAHBJhA== |
|
Now data race should be fixed. It was in void execute() override
{
{
std::unique_lock lock(mutex);
chassert(state == Enqueued);
state = Dequeued;
}
dequeued_cv.notify_one(); // This optimization is TOTALLY WRONG. it must be under lock
} |
Fixes #63213
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
IO scheduling for remote S3 disks is now done on the level of HTTP socket streams (instead of the whole S3 requests) to resolve
bandwidth_limitthrottling issues.Documentation entry for user-facing changes
CI Settings (Only check the boxes if you know what you are doing):