rgw/notifications: add http request timeout and max inflight #63404
Conversation
```cpp
static std::unique_ptr<RGWHTTPManager> s_http_manager;
static std::shared_mutex s_http_manager_mutex;
static std::atomic<unsigned> s_http_manager_inflight(0);
```
it would probably make more sense for this throttle to be per-endpoint rather than global, wouldn't it?
the problem reported here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/MGFY6Q5S7KC2BPHCKQCBM42DJ2O4WOH7/
was a coroutine stack overflow due to a large number of open http requests.
that problem is not per-endpoint, so, IMO, we should keep the counter global.
ok, so we want a global throttle to limit how much work we'll schedule at once. is there a reason this only applies to http endpoints though, and not others?
such a limit has existed in the kafka and amqp clients from the start :-)
https://github.com/ceph/ceph/blob/main/src/rgw/rgw_kafka.cc#L712
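(for reference, a minimal sketch of the global-throttle pattern under discussion, based on the counter from the diff above; `try_acquire_slot`/`release_slot` are hypothetical helper names, not the PR's actual code:)

```cpp
#include <atomic>

static std::atomic<unsigned> s_http_manager_inflight(0);

// hypothetical helper: claim a slot in the global in-flight budget.
// returns false when the limit is reached, so the caller can fail the
// send with -EBUSY instead of scheduling unbounded work on coroutines.
bool try_acquire_slot(unsigned max_inflight) {
  if (max_inflight == 0) {  // zero means "no limit"
    ++s_http_manager_inflight;
    return true;
  }
  auto cur = s_http_manager_inflight.load();
  do {
    if (cur >= max_inflight) {
      return false;
    }
    // CAS loop so the check and the increment happen as one atomic step
  } while (!s_http_manager_inflight.compare_exchange_weak(cur, cur + 1));
  return true;
}

// hypothetical helper: release the slot once the request completes
void release_slot() {
  --s_http_manager_inflight;
}
```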
```cpp
const auto max_inflight = cct->_conf->rgw_http_notif_max_inflight;
if (max_inflight != 0 &&
    s_http_manager_inflight >= cct->_conf->rgw_http_notif_max_inflight) {
```
`s_http_manager_inflight >= max_inflight` since you made a local variable for it
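(a sketch of the suggested fix, against the hunk above:)

```cpp
const auto max_inflight = cct->_conf->rgw_http_notif_max_inflight;
if (max_inflight != 0 &&
    s_http_manager_inflight >= max_inflight) {  // reuse the local variable
  return -EBUSY;
}
```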
```yaml
  services:
  - rgw
  with_legacy: true
- name: rgw_http_notif_message_timeout
```
can we document these options somewhere? either in a 'config' section in https://docs.ceph.com/en/latest/radosgw/notifications/ or a new 'bucket notification' section similar to https://docs.ceph.com/en/latest/radosgw/config-ref/#topic-persistency-settings
and note that when rendered that way, it only shows the long_desc
```rst
- ``rgw_http_notif_message_timeout``: This is the maximum time in seconds to deliver a notification.
  A delivery error occurs when the message timeout is exceeded.
  This value includes the connection time, and hence must be larger than the connection timeout.
  If set to zero, the http client will wait indefinitely. If not set, the default will be 10 seconds.
- ``rgw_http_notif_connection_timeout``: This is the maximum time in seconds to connect to the endpoint.
  A delivery error occurs when the connection timeout is exceeded.
  If set to zero, the default value of 300 seconds will be used. If not set, the default will be 5 seconds.
- ``rgw_http_notif_max_inflight``: This is the maximum number of messages in-flight (across all http endpoints).
  A delivery error (BUSY) occurs when this number is exceeded.
  If set to zero, there is no limit on the number of in-flight messages. If not set, the default will be 8192.
```
we have a confval directive that makes this much easier. for example, doc/radosgw/config-ref.rst just has:
```rst
.. confval:: rgw_frontends
.. confval:: rgw_data
.. confval:: rgw_enable_apis
```
which renders to https://docs.ceph.com/en/latest/radosgw/config-ref/#ceph-object-gateway-config-reference
this avoids having to duplicate the text between docs and config options
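(for completeness, a hedged example of tuning these options at runtime with the standard `ceph config set` command; the values shown are just the documented defaults:)

```
ceph config set client.rgw rgw_http_notif_message_timeout 10
ceph config set client.rgw rgw_http_notif_connection_timeout 5
ceph config set client.rgw rgw_http_notif_max_inflight 8192
```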
```cpp
    s_http_manager_inflight >= max_inflight) {
  ldout(cct, 1) << "ERROR: send failed. http endpoint manager busy. in-flight requests: " <<
      s_http_manager_inflight << " >= " << max_inflight << dendl;
  return -EBUSY;
```
how do the upper levels handle EBUSY? process_entry() returns Failure back to process_queue(). will that just retry immediately?
if we keep returning EBUSY back to process_queue(), it will resend the cls_2pc_queue_list_entries() op every time and hammer the osd with reads. once we hit this limit, several process_queue() coroutines may end up doing this at once
i have a feeling that we'd be better off suspending the coroutine here until we're able to schedule this request. that's more complicated though
in the case of persistent notifications we retry according to the retry configuration we have there.
we use rgw_topic_persistency_sleep_duration to space out retries of a specific entry, but we are not looking at the result code;
we treat all errors the same way.
it would be a nice enhancement to treat EBUSY differently. maybe use a larger value for rgw_topic_persistency_sleep_duration?
in the case of non-persistent notifications we reply EBUSY to the frontend, which probably converts that to a 503.
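(that enhancement might look roughly like this; `busy_backoff_factor` is a hypothetical knob, not something in the PR:)

```cpp
// hedged sketch: stretch the per-entry retry sleep when the failure was
// the global throttle (-EBUSY) rather than a real endpoint error
auto sleep_duration = conf->rgw_topic_persistency_sleep_duration;
if (ret == -EBUSY) {
  sleep_duration *= busy_backoff_factor;  // hypothetical multiplier
}
```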
looking at the use of rgw_topic_persistency_sleep_duration in https://github.com/ceph/ceph/blob/458d8862/src/rgw/driver/rados/rgw_notify.cc#L231-L246, that causes process_entry() to return Sleeping but doesn't cause process_queue() itself to sleep
the only sleep i see in process_queue() is for is_idle (https://github.com/ceph/ceph/blob/458d8862/src/rgw/driver/rados/rgw_notify.cc#L383-L389) which is false when the queue has entries to send/retry
so when process_entry() returns Sleeping, process_queue() will immediately loop back and resend cls_2pc_queue_list_entries(), then call process_entry() again with the same result. we don't want to keep sending those listing ops while we're throttled or waiting for retry
the sleep indication is per entry, and the queue can have many entries in different states.
i guess one optimization would be: if all entries in the queue are in "sleep" state, we can put the queue processing itself into a sleep state.
note, however, that this is a different issue. the RGW will be busy going through the queue, but will not send the notifications to the HTTP server (which was what caused the original issue).
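(a rough sketch of that optimization inside process_queue(); process_entry(), the Sleeping result, and the yield context exist in the current code, while `io_context` and `retry_sleep_duration` are assumed names:)

```cpp
// after listing a batch of entries in process_queue():
bool all_sleeping = !entries.empty();
for (const auto& entry : entries) {
  const auto result = process_entry(entry, yield);  // existing per-entry logic
  if (result != EntryProcessingResult::Sleeping) {
    all_sleeping = false;
  }
}
if (all_sleeping) {
  // every entry is waiting out its retry sleep: pause the whole queue
  // coroutine instead of immediately re-issuing
  // cls_2pc_queue_list_entries() against the OSD
  boost::asio::steady_timer timer(io_context, retry_sleep_duration);
  boost::system::error_code ec;
  timer.async_wait(yield[ec]);
}
```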
force-pushed from dd9b95a to 4d8f27a
@cbodley can you please re-review?
notification tests are passing: https://pulpito.ceph.com/yuvalif-2025-05-27_03:24:05-rgw:notifications-wip-yuval-71390-distro-default-smithi/
this should allow for proper shutdown of the queue handling code of persistent notifications.

Fixes: https://tracker.ceph.com/issues/71390
Signed-off-by: Yuval Lifshitz <ylifshit@ibm.com>
overwrote the branch by mistake.
replaced by: #64110
also make connection timeout configurable
Fixes: https://tracker.ceph.com/issues/71402