rgw/kafka: set message timeout to 5 seconds #55952
Conversation
src/common/options/rgw.yaml.in
Outdated
level: advanced
desc: This is the maximum time in milliseconds to deliver a message (including retries)
long_desc: Delivery error occurs when either the retry count or the message timeout are exceeded.
  If set to zero, message will time out only based on retries.
but we are not setting retries here, so if someone sets this to zero, what will happen?
since we are allowing the timeout to be set, can we also set max_retry? (message.send.max.retries & retry.backoff.ms)
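To make the interplay between the two limits concrete, here is a toy model (an illustration only, not librdkafka or rgw code; the function name and the fixed-backoff assumption are hypothetical) of the rule quoted above: delivery fails when either the retry count or message.timeout.ms is exceeded, whichever trips first.

```cpp
#include <cstdint>
#include <string>

// Toy model of the delivery rules under discussion: a message errors out
// when either the retry count or message.timeout.ms is exceeded, whichever
// comes first. Assumes a fixed retry backoff for simplicity.
std::string limiting_factor(std::uint64_t timeout_ms,
                            std::uint64_t retries,
                            std::uint64_t backoff_ms) {
  if (timeout_ms == 0) {
    return "retries";  // timeout disabled: only the retry count bounds delivery
  }
  // number of send attempts that fit inside the timeout window
  const std::uint64_t attempts_in_window = 1 + timeout_ms / backoff_ms;
  return attempts_in_window > retries + 1 ? "retries" : "timeout";
}
```

With the defaults discussed in this thread (retries=2, backoff=100ms), a 5s timeout means the retry count is exhausted long before the timeout window closes.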
you are right, the retry default is too high.
will add it as a conf parameter with a lower value.
the retry backoff is 100ms, up to a max of 1s. these values make sense. do you think we should expose these?
100ms seems reasonable; we could expose it, but keep the default value the same as kafka's?
looking at the conf of the librdkafka version that we use: https://github.com/confluentinc/librdkafka/blob/v0.11.6/CONFIGURATION.md
the default retries are 2.
the backoff is 100ms.
there is no max backoff (not sure there is exponential backoff).
anyway, until we upgrade our library version, we don't need to add these
sorry, but why not add it? i get that its max_retry value is 2, but since we are doing the PR for the TTL, we can add the max_retry config as well; it's only a single line change to add this config? (purely for being a good citizen: since we add the ttl, we also add the max_retry and let the default be the same 2 as kafka)
yes, will add. they changed it to MAX_INT in v1.6.1 (the version in rhel/centos9)
the only thing i'm not going to add is the max backoff, since it does not exist in the librdkafka versions we use
could not get the max retry conf value to work.
at least for librdkafka v1.6.1, when the message timeout was set to zero, the message never expired, even if the max retry and backoff numbers were set to low values.
so, to prevent this case, the timeout cannot be set to zero, and will be set to 1ms if the user sets it to zero.
Note that the connection will not be considered idle, even if it is down,
as long as there are attempts to send messages to it.
-default: 30
+default: 300
why are we changing this value?
this should be used for garbage collection of the connections, and not for error handling (as was done before).
30sec is too low for that purpose.
also increase the idle timeout to 30 seconds.
test instructions: https://gist.github.com/yuvalif/33487bff19883e3409caa8a843a0b353
Fixes: https://tracker.ceph.com/issues/64710
Signed-off-by: Yuval Lifshitz <ylifshit@redhat.com>
// however, testing with librdkafka v1.6.1 did not expire the message in that case. hence, a value of zero is changed to 1ms
constexpr std::uint64_t min_message_timeout = 1;
const auto message_timeout = std::max(min_message_timeout, conn->cct->_conf->rgw_kafka_message_timeout);
if (rd_kafka_conf_set(conn->temp_conf, "message.timeout.ms",
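The clamp in the snippet above can be isolated into a standalone sketch (the helper name is hypothetical; the behavior it guards against, librdkafka v1.6.1 never expiring messages when message.timeout.ms is 0, is what this thread established):

```cpp
#include <algorithm>
#include <cstdint>
#include <string>

// Standalone version of the clamping logic: librdkafka (at least v1.6.1)
// never expires a message when message.timeout.ms is 0, so a configured 0
// is raised to 1ms before the value is handed to the kafka conf.
std::string effective_message_timeout(std::uint64_t configured_ms) {
  constexpr std::uint64_t min_message_timeout = 1;
  return std::to_string(std::max(min_message_timeout, configured_ms));
}
```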
just thinking more on this: i was wondering whether we want to do this only for sync notifications, because for non-sync notifications, a larger timeout means that if there is a transient kafka downtime which is more than 5 seconds but less than 5 minutes (the kafka default), the persistent notification will just wait and eventually succeed, vs. retrying to send it?
for sync notifications the latency is paid by the client request, so having 5 seconds there makes sense
for persistent notifications a long timeout creates a different issue, as the coroutines of the notification manager will just pile up and consume a significant amount of memory.
i think we did not see that as a huge problem so far, because of the 30 seconds idle timeout.
but if set to 5 minutes, it might create an issue.
Ohh I see, so then you want to keep the behaviour for persistent the same (30 seconds), since we have not seen any issues and it seems to be working as expected?
Keeping 5 seconds will now exhaust the retries faster, and we would be writing entries back to the queue more often than before.
will have to test that to see the impact. but i agree that we don't want to see a RADOS spike when the broker is down.
an option would be to keep 2 connections per broker: one for persistent notifications (with a 30sec message timeout), and one for sync notifications (with a 5sec message timeout)?
Why complicate with 2 connections? Rather have a default of 30 seconds for persistent, and then override that value for sync notifications based on a conf value.
persistency is per topic; a connection is per broker.
we don't create a connection per topic, so we would have to keep 2 connections per broker, each with a different conf
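A minimal sketch of the "2 connections per broker" idea (all names here are hypothetical illustrations, not the actual rgw connection-manager code): key the connection map on (broker, persistent) so each flavor of notification gets its own message timeout.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical per-connection conf; only the field relevant to the thread.
struct ConnConf {
  int message_timeout_ms;
};

// Key the map on (broker endpoint, persistent?) so the same broker can
// carry two connections with different timeouts.
using ConnKey = std::pair<std::string, bool>;

ConnConf& get_connection(std::map<ConnKey, ConnConf>& conns,
                         const std::string& broker, bool persistent) {
  // 30s for persistent notifications, 5s for sync ones, per the discussion
  auto it = conns.try_emplace(ConnKey{broker, persistent},
                              ConnConf{persistent ? 30000 : 5000}).first;
  return it->second;
}
```

Looking up the same broker with both flags yields two distinct entries, each created once with its own timeout.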
ohh i see, so how do we want to handle this?
test first, or just separate out the 2 connections?
i tested it, and could not see any perf hit in cpu or disk.
looked at the rgw_notify.cc code, and it actually makes sense that the kafka timeout will have no impact:
- when a persistent notification gets an error, it does not do anything to the queue (no RADOS operation). the notification is deleted from the queue only when it gets an ack
- the "max retry" mechanism is memory based only. meaning, if we define max retries and get an error, we update the count only in memory and not in the queue, so there is no RADOS operation here either
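The memory-only retry mechanism described above can be sketched as follows (a hypothetical illustration of the described behavior, not the rgw_notify.cc implementation; all names are made up): errors only bump an in-memory counter, no queue write happens, and the entry is cleared on ack.

```cpp
#include <map>
#include <string>

// Sketch of a memory-only retry counter: on error we bump an in-memory
// count (no RADOS/queue operation); the entry is only forgotten on ack.
class RetryTracker {
  std::map<std::string, unsigned> counts_;  // notification id -> error count
  unsigned max_retries_;

 public:
  explicit RetryTracker(unsigned max_retries) : max_retries_(max_retries) {}

  // returns true if the notification should be retried, false if given up
  bool on_error(const std::string& id) {
    return ++counts_[id] <= max_retries_;
  }

  // a successful delivery clears the in-memory state for that notification
  void on_ack(const std::string& id) { counts_.erase(id); }
};
```

Since the counter lives only in memory, a faster kafka timeout makes retries happen sooner but does not by itself add RADOS traffic, which matches the test result reported above.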
thanks for the verification, so the only side effect now is that retries will happen faster compared to earlier, when we waited 30 seconds for a response before retrying.
jenkins test make check
jenkins test windows
jenkins test api
jenkins test docs
jenkins test make check
one failure in teuthology: https://pulpito.ceph.com/yuvalif-2024-03-13_06:01:17-rgw:notifications-wip-yuval-64710-distro-default-smithi/