
rgw/notifications: add http request timeout and max inflight #63404

Closed

yuvalif wants to merge 1 commit into ceph:main from yuvalif:wip-yuval-71390

Conversation

@yuvalif
Contributor

@yuvalif yuvalif commented May 21, 2025

also make connection timeout configurable

Fixes: https://tracker.ceph.com/issues/71402

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)

@yuvalif yuvalif requested a review from a team as a code owner May 21, 2025 15:05
Comment thread src/rgw/driver/rados/rgw_pubsub_push.cc Outdated

static std::unique_ptr<RGWHTTPManager> s_http_manager;
static std::shared_mutex s_http_manager_mutex;
static std::atomic<unsigned> s_http_manager_inflight(0);
Contributor

it would probably make more sense for this throttle to be per-endpoint rather than global, wouldn't it?

Contributor Author

the problem that was reported here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/MGFY6Q5S7KC2BPHCKQCBM42DJ2O4WOH7/

was a coroutine stack overflow caused by a large number of open HTTP requests.
that problem is not per-endpoint, so, IMO, we should keep the counter global.
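
For reference, a minimal sketch of the kind of process-wide in-flight throttle being discussed; s_http_manager_inflight and rgw_http_notif_max_inflight come from the diff above, while try_acquire_inflight_slot/release_inflight_slot are hypothetical names, not the PR's actual code:

#include <atomic>

// process-wide counter; the check and the increment are not one atomic step,
// which is acceptable for a soft limit like this one
static std::atomic<unsigned> s_http_manager_inflight(0);

bool try_acquire_inflight_slot(unsigned max_inflight) {
  // max_inflight == 0 means "no limit"
  if (max_inflight != 0 && s_http_manager_inflight >= max_inflight) {
    return false;  // caller reports EBUSY instead of queueing more work
  }
  ++s_http_manager_inflight;
  return true;
}

void release_inflight_slot() {
  --s_http_manager_inflight;
}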

Contributor

ok, so we want a global throttle to limit how much work we'll schedule at once. is there a reason this only applies to http endpoints though, and not others?

Contributor Author

it has existed in the kafka and amqp clients from the start :-)
https://github.com/ceph/ceph/blob/main/src/rgw/rgw_kafka.cc#L712

Comment thread src/rgw/driver/rados/rgw_pubsub_push.cc Outdated
Comment on lines +103 to +105
const auto max_inflight = cct->_conf->rgw_http_notif_max_inflight;
if (max_inflight != 0 &&
s_http_manager_inflight >= cct->_conf->rgw_http_notif_max_inflight) {
Contributor

s_http_manager_inflight >= max_inflight since you made a local variable for it
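
i.e. the quoted lines would become something like (sketch of the suggested change, keeping the surrounding code as quoted above):

const auto max_inflight = cct->_conf->rgw_http_notif_max_inflight;
if (max_inflight != 0 &&
    s_http_manager_inflight >= max_inflight) {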

Comment thread src/common/options/rgw.yaml.in Outdated
services:
- rgw
with_legacy: true
- name: rgw_http_notif_message_timeout
Contributor

can we document these options somewhere? either in a 'config' section in https://docs.ceph.com/en/latest/radosgw/notifications/ or a new 'bucket notification' section similar to https://docs.ceph.com/en/latest/radosgw/config-ref/#topic-persistency-settings

and note that when rendered that way, it only shows the long_desc
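
For illustration, a rough sketch of how such an entry could carry both a short desc and the long_desc that the docs render; the type, level, and wording below are placeholders rather than the PR's actual values (the 10-second default is taken from the doc text later in this PR):

- name: rgw_http_notif_message_timeout
  type: uint
  level: advanced
  desc: Maximum time in seconds to deliver an HTTP bucket notification
  long_desc: Maximum time in seconds to deliver an HTTP bucket notification.
    This includes connection time, so it should be larger than the connection
    timeout. A value of zero means the HTTP client waits indefinitely.
  default: 10
  services:
  - rgw
  with_legacy: true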

@yuvalif yuvalif force-pushed the wip-yuval-71390 branch from 52529fc to 6786098 Compare May 22, 2025 10:55
@yuvalif yuvalif requested a review from a team as a code owner May 22, 2025 10:55
Comment thread doc/radosgw/notifications.rst Outdated
Comment on lines +147 to +156
- ``rgw_http_notif_message_timeout``: This is the maximum time in seconds to deliver a notification.
Delivery error occurs when the message timeout is exceeded.
This value includes the connection time, and hence must be larger than the connection timeout.
If set to zero the http client will wait indefinitely. If not set the default will be 10 seconds.
- ``rgw_http_notif_connection_timeout``: This is the maximum time in seconds to connect to the endpoint.
Delivery error occurs when the connection timeout is exceeded.
If set to zero the default value of 300 seconds will be used. If not set the default will be 5 seconds.
- ``rgw_http_notif_max_inflight``: This is the maximum number of messages in-flight (across all http endpoints).
Delivery error (BUSY) occurs when this limit is exceeded.
If set to zero there is no limit on the number of messages in-flight. If not set the default will be 8192.
Contributor

we have a confval directive that makes this much easier. for example, doc/radosgw/config-ref.rst just has:

.. confval:: rgw_frontends
.. confval:: rgw_data
.. confval:: rgw_enable_apis

which renders to https://docs.ceph.com/en/latest/radosgw/config-ref/#ceph-object-gateway-config-reference

this avoids having to duplicate the text between docs and config options
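
Applied to the options added in this PR, the doc section could reduce to something like the following, with the descriptive text moved into the desc/long_desc fields in rgw.yaml.in:

.. confval:: rgw_http_notif_message_timeout
.. confval:: rgw_http_notif_connection_timeout
.. confval:: rgw_http_notif_max_inflight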

Comment thread src/rgw/driver/rados/rgw_pubsub_push.cc Outdated
s_http_manager_inflight >= max_inflight) {
ldout(cct, 1) << "ERROR: send failed. http endpoint manager busy. in-flight requests: " <<
s_http_manager_inflight << " >= " << max_inflight << dendl;
return -EBUSY;
Contributor

how do the upper levels handle EBUSY? process_entry() returns Failure back to process_queue(). will that just retry immediately?

Contributor

if we keep returning EBUSY back to process_queue(), it will resend the cls_2pc_queue_list_entries() op every time and hammer the osd with reads. once we hit this limit, several process_queue() coroutines may end up doing this at once

i have a feeling that we'd be better off suspending the coroutine here until we're able to schedule this request. that's more complicated though

Contributor Author

in case of persistent notifications we retry according to the retry configuration we have there.
we use rgw_topic_persistency_sleep_duration to space out retries of a specific entry, but we are not looking at the result code:
we treat all errors the same way.

would be a nice enhancement to treat EBUSY differently. maybe use a larger value for rgw_topic_persistency_sleep_duration?

in case of non-persistent notifications we reply EBUSY to the frontend, which probably converts that to a 503
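
A rough sketch of that enhancement; retry_delay_for and kBusyBackoffFactor are made-up names, and the current code applies rgw_topic_persistency_sleep_duration uniformly regardless of the error:

#include <cerrno>

constexpr unsigned kBusyBackoffFactor = 4;  // illustrative value only

// base_sleep would come from rgw_topic_persistency_sleep_duration, in whatever
// units that option uses; back off harder while the global HTTP throttle is full
unsigned retry_delay_for(unsigned base_sleep, int result) {
  if (result == -EBUSY) {
    return base_sleep * kBusyBackoffFactor;
  }
  return base_sleep;
}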

Contributor

looking at the use of rgw_topic_persistency_sleep_duration in https://github.com/ceph/ceph/blob/458d8862/src/rgw/driver/rados/rgw_notify.cc#L231-L246, that causes process_entry() to return Sleeping but doesn't cause process_queue() itself to sleep

the only sleep i see in process_queue() is for is_idle (https://github.com/ceph/ceph/blob/458d8862/src/rgw/driver/rados/rgw_notify.cc#L383-L389) which is false when the queue has entries to send/retry

so when process_entry() returns Sleeping, process_queue() will immediately loop back and resend cls_2pc_queue_list_entries(), then call process_entry() again with the same result. we don't want to keep sending those listing ops while we're throttled or waiting for retry

Contributor Author

the sleep indication is per entry, and the queue can have many entries in different states.
i guess one optimization would be: if all entries in the queue are in "sleep" state, we can put the queue processing itself into a sleep state.

note, however, that this is a different issue. the RGW will be busy going through the queue, but will not send the notifications to the HTTP server (which is what was causing the original issue).
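
A rough sketch of that check; EntryState and the container are hypothetical, not the actual rgw_notify.cc types:

#include <algorithm>
#include <vector>

enum class EntryState { Pending, Sleeping, Done };

// only put the whole queue to sleep when every remaining entry is waiting
// out its per-entry retry delay
bool queue_should_sleep(const std::vector<EntryState>& entries) {
  return !entries.empty() &&
         std::all_of(entries.begin(), entries.end(),
                     [](EntryState s) { return s == EntryState::Sleeping; });
}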

@yuvalif yuvalif force-pushed the wip-yuval-71390 branch 2 times, most recently from dd9b95a to 4d8f27a Compare May 22, 2025 16:42
@yuvalif yuvalif requested a review from cbodley May 22, 2025 16:46
@yuvalif
Contributor Author

yuvalif commented May 26, 2025

@cbodley can you please re-review?

@yuvalif
Contributor Author

yuvalif commented May 27, 2025

this should allow for proper shutdown of the queue handling
code of persistent notifications.

Fixes: https://tracker.ceph.com/issues/71390

Signed-off-by: Yuval Lifshitz <ylifshit@ibm.com>
@yuvalif
Contributor Author

yuvalif commented Jun 17, 2025

overwrote the branch by mistake.
will open 2 separate PRs for the http timeout fix and the tokens waiter refactoring

@yuvalif yuvalif closed this Jun 17, 2025
@yuvalif
Contributor Author

yuvalif commented Jun 23, 2025

replaced by: #64110

