
rgw/kafka: set message timeout to 5 seconds#55952

Merged
yuvalif merged 1 commit intoceph:mainfrom
yuvalif:wip-yuval-64710
Mar 13, 2024

Conversation


@yuvalif yuvalif commented Mar 5, 2024

also increase the idle timeout to 30 seconds.
test instructions:
https://gist.github.com/yuvalif/33487bff19883e3409caa8a843a0b353

Fixes: https://tracker.ceph.com/issues/64710

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Available Jenkins commands:
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows
  • jenkins test rook e2e

@yuvalif yuvalif requested a review from kchheda3 March 5, 2024 10:21
@yuvalif yuvalif requested a review from a team as a code owner March 5, 2024 10:21
@yuvalif yuvalif requested a review from cbodley March 5, 2024 18:59
level: advanced
desc: This is the maximum time in milliseconds to deliver a message (including retries)
long_desc: Delivery error occurs when either the retry count or the message timeout are exceeded.
If set to zero, message will time out only based on retries.
Contributor

@kchheda3 kchheda3 Mar 6, 2024

But we are not setting retries here, so if someone sets the timeout to zero, what will happen?
Since we are allowing the timeout to be set, can we also set the max retries, i.e. message.send.max.retries & retry.backoff.ms?

Contributor Author

You are right, the retry default is too high.
Will add it as a conf parameter with a lower value.

The retry backoff is 100ms, up to a max of 1s. These values make sense. Do you think we should expose these?

Contributor

@kchheda3 kchheda3 Mar 6, 2024

100ms seems reasonable. We could expose it, but keep the default value the same as Kafka's?

Contributor Author

Looking at the conf of the librdkafka version that we use: https://github.com/confluentinc/librdkafka/blob/v0.11.6/CONFIGURATION.md

the default retries are 2.
the backoff is 100ms.
there is no max backoff (not sure there is exponential backoff).
Anyway, until we upgrade our library version, we don't need to add these.

Contributor

Looking at the conf of the librdkafka version that we use: https://github.com/confluentinc/librdkafka/blob/v0.11.6/CONFIGURATION.md

the default retries are 2. the backoff is 100ms. there is no max backoff (not sure there is exponential backoff). Anyway, until we upgrade our library version, we don't need to add these.

Sorry, but why not add it? I get that its value is 2 for max retries, but since we are doing the PR for the TTL, we can add the max retries config as well; it's only a single-line change to add this config (purely for being a good citizen: since we add the TTL, we also add the max retries and let the default be the same 2 as Kafka)?

Contributor Author

Yes, will add. They changed it to MAX_INT in v1.6.1 (the version in RHEL/CentOS 9).
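For illustration, a max-retries option in the same options.yaml style as the timeout snippet above might look like the fragment below. The option name `rgw_kafka_message_retries` and its field values are assumptions for this sketch, not necessarily what the PR ends up adding:

```yaml
- name: rgw_kafka_message_retries
  type: uint
  level: advanced
  desc: Maximum number of times to retry sending a message before it fails
  long_desc: Maps to librdkafka's message.send.max.retries. Note that newer
    librdkafka versions (v1.6.1 and later) changed the library default to
    MAX_INT, so an explicit low value avoids effectively unbounded retries.
  default: 2
```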

Contributor Author

The only thing I'm not going to add is the max backoff, since it does not exist in the librdkafka versions we use.

Contributor Author

Could not get the max retry conf value to work.
At least for librdkafka v1.6.1, when the message timeout was set to zero, the message never expired, even if the max retry and backoff numbers were set to low values.
So, to prevent this case, the timeout cannot be set to zero, and will be set to 1ms if the user sets it to zero.

Note that the connection will not be considered idle, even if it is down,
as long as there are attempts to send messages to it.
default: 30
default: 300
Contributor

Why are we changing this value?

Contributor Author

This should be used for garbage collection of the connections, and not for error handling (as it was done before).
30sec is too low for that purpose.

also increase the idle timeout to 30 seconds.
test instructions:
https://gist.github.com/yuvalif/33487bff19883e3409caa8a843a0b353

Fixes: https://tracker.ceph.com/issues/64710

Signed-off-by: Yuval Lifshitz <ylifshit@redhat.com>
// however, testing with librdkafka v1.6.1 did not expire the message in that case. hence, a value of zero is changed to 1ms
constexpr std::uint64_t min_message_timeout = 1;
const auto message_timeout = std::max(min_message_timeout, conn->cct->_conf->rgw_kafka_message_timeout);
if (rd_kafka_conf_set(conn->temp_conf, "message.timeout.ms",
Contributor

Just thinking more on this: do we want to do this only for sync notifications? Because for non-sync notifications, having a larger timeout means that if there is a transient Kafka downtime longer than 5 seconds but shorter than 5 minutes (the Kafka default), the persistent notification will just wait and eventually succeed, vs. retrying to send it.
For sync notifications latency is a concern, so having 5 seconds there makes sense.

Contributor Author

For persistent notifications a long timeout creates a different issue, as the coroutines of the notification manager will just pile up and consume a significant amount of memory.
I think we did not see that as a huge problem so far, because of the 30 seconds idle timeout.
But if set to 5 minutes, it might create an issue.

Contributor

Ohh I see, so then you want to keep the behaviour for persistent the same (30 seconds), since we have not seen any issues and it seems to be working as expected?
Keeping 5 seconds will now exhaust the retries faster, and we would be writing entries back to the queue more often than before.

Contributor Author

Ohh I see, so then you want to keep the behaviour for persistent the same (30 seconds), since we have not seen any issues and it seems to be working as expected? Keeping 5 seconds will now exhaust the retries faster, and we would be writing entries back to the queue more often than before.

Will have to test that to see the impact, but I agree that we don't want to see a RADOS spike when the broker is down.
An option would be to keep 2 connections per broker: one for persistent notifications (with a 30sec message timeout), and one for sync notifications (with a 5sec message timeout)?

Contributor

Why complicate with 2 connections? Rather, have a default of 30 seconds for persistent, and then override that value for sync notifications based on a conf value.

Contributor Author

Persistency is per topic; a connection is per broker.
We don't create a connection per topic, so we would keep 2 connections per broker, each with a different conf.

Contributor

Ohh I see. So how do we want to handle this?
Test first, or just separate out the 2 connections?

Contributor Author

I tested it, and could not see any perf hit in CPU or disk.
Looked at the rgw_notify.cc code, and it actually makes sense that the kafka timeout will have no impact:

  • when a persistent notification gets an error, it does not do anything to the queue (no RADOS operation). The notification is deleted from the queue only when it gets an ack
  • the "max retry" mechanism is memory based only. Meaning, if we defined max retries and get an error, we update the count only in memory and not in the queue, so there is no RADOS operation here either

Contributor

Thanks for the verification. So the only side effect now is that retries will happen faster, compared to earlier, where we were waiting for the response for 30 seconds before retrying.


yuvalif commented Mar 11, 2024

jenkins test make check


yuvalif commented Mar 11, 2024

jenkins test windows


yuvalif commented Mar 11, 2024

jenkins test api


yuvalif commented Mar 11, 2024

jenkins test docs


yuvalif commented Mar 12, 2024

jenkins test make check


yuvalif commented Mar 13, 2024
