crimson/net: set TCP_NODELAY according to ms_tcp_nodelay#52592
crimson/net: set TCP_NODELAY according to ms_tcp_nodelay#52592
Conversation
|
@xxhdx1985126 Hi Xuehan, thank you for rootcausing the latency issue. But, IMHO, disabling the Nagel’s algorithm is not a pure win. We have shorter latency in expense of sending more smaller segments. Have you tried to reduce tcp_delack_min? Also, the commit message could use some more explanation on the reasoning. Performance issue is a complicated thing. |
Will try:-) |
@tchaikov Hi, kefu, it seems that the kernel doesn't have the "TCP_DELACK_MIN" feature enabled by default when compiling.
I've added explanation in the commit message, please take a look again:-) |
There are cases in which we need low latency of an individual op, like updating RBD image's object map which would block other ops for a object not created yet. We need to make sure the rbd-object-map-update like ops have as low latency as possible Fixes: https://tracker.ceph.com/issues/62098 Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
cyx1231st
left a comment
There was a problem hiding this comment.
Looks another possible option is to set TCP_QUICKACK without setting TCP_NODELAY. This will keep small packets buffering.
LGTM to integrate the option.
Just did a test, it seems that TCP_QUICKACK didn't help, trying to see why. |
thank you. seems it is only available on some linux branches, and the change which makes TCP_DELACK_MIN a knob has not been included by any of linux upstream branches. i spotted it on https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux_for_real_time/7/html/tuning_guide/reducing_the_tcp_delayed_ack_timeout . guess RHEL for RT applied this patch.
thanks. Line 43 in 8c4d93c EDIT, we just respect the setting in this change. we are not changing the default setting at all -- it is true by default already. |
|
@athanatos @Matan-B This looks to be a minor change, can we merge it directly, or batch with other ready crimson PRs for convenience (is there a label for this case) ? |
I already scheduled a test run with this PR. Generally it looks good to me. I spotted the one failure which to me appears unrelated (AFAIK, we already experienced this issue). See https://tracker.ceph.com/issues/62162 btw, (not a reason to block this PR) but can the change be re-written to |
|
Agree, looks unrelated. |
Fixes: https://tracker.ceph.com/issues/62098
Signed-off-by: Xuehan Xu xuxuehan@qianxin.com
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "pacific"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
Checklist
Show available Jenkins commands
jenkins retest this pleasejenkins test classic perfjenkins test crimson perfjenkins test signedjenkins test make checkjenkins test make check arm64jenkins test submodulesjenkins test dashboardjenkins test dashboard cephadmjenkins test apijenkins test docsjenkins render docsjenkins test ceph-volume alljenkins test ceph-volume toxjenkins test windows