rgw: don't map to EIO in rgw_http_error_to_errno()#56704

Merged
cbodley merged 1 commit into ceph:main from cbodley:wip-rgw-default-http-error
Apr 10, 2024

Contributor

cbodley commented Apr 4, 2024

the http client uses EIO to detect connection errors specifically. if we map normal http errors to EIO, we incorrectly mark their endpoint as failed and route requests to other endpoints (if any exist)

default to ERR_INTERNAL_ERROR (500 InternalError) instead
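
A minimal sketch of the change described above, not the actual ceph code. The function names, the set of explicitly mapped statuses, and the numeric value of the internal-error constant are all assumptions made for illustration. The only part taken from this PR is that the default case moves from EIO to ERR_INTERNAL_ERROR.

```cpp
#include <cerrno>

// Placeholder value for illustration only; the real ERR_INTERNAL_ERROR
// constant lives in the rgw sources and may differ.
constexpr int SKETCH_ERR_INTERNAL_ERROR = 2200;

// Hypothetical stand-in for rgw_http_error_to_errno(). The statuses handled
// explicitly here are assumed, not copied from ceph.
inline int sketch_http_error_to_errno(int http_status)
{
  if (http_status >= 200 && http_status <= 299) {
    return 0;  // success
  }
  switch (http_status) {
    case 403: return -EACCES;
    case 404: return -ENOENT;
    default:
      // Before the change this would have been -EIO, which the http client
      // reserves for connection errors.
      return -SKETCH_ERR_INTERNAL_ERROR;
  }
}

// Hypothetical caller-side check: only -EIO marks an endpoint as failed.
inline bool sketch_is_connection_error(int ret)
{
  return ret == -EIO;
}
```

With this shape, an unmapped status such as 500 no longer looks like a connection failure to the caller, so the endpoint would not be marked as failed on its account.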

Signed-off-by: Casey Bodley <cbodley@redhat.com>
Contributor Author

cbodley commented Apr 4, 2024

it's hard to tell whether anything in multisite really depends on that mapping to EIO. i see some retry-on-EIO loops (e.g. https://github.com/ceph/ceph/blob/becfb26/src/rgw/driver/rados/rgw_data_sync.cc#L285-L286 and https://github.com/ceph/ceph/blob/becfb26/src/rgw/driver/rados/rgw_sync.cc#L513-L514), but it looks like those are meant to deal with connection errors
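
For readers unfamiliar with the pattern, a retry-on-EIO loop can be sketched roughly as follows. This is a generic illustration with hypothetical names, not the code at the links above. It retries only while the operation reports -EIO and returns any other result immediately.

```cpp
#include <cerrno>
#include <functional>

// Hypothetical helper: call op() until it returns something other than -EIO
// or max_attempts calls have been made. Returns the last result, or -EIO if
// max_attempts is not positive and op() is never called.
inline int sketch_retry_on_eio(const std::function<int()>& op, int max_attempts)
{
  int ret = -EIO;
  for (int attempt = 0; attempt < max_attempts && ret == -EIO; ++attempt) {
    ret = op();
  }
  return ret;
}
```

If such a loop exists to ride out connection errors, then an ordinary http error that no longer maps to EIO is simply returned on the first attempt instead of being retried.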

Contributor Author

cbodley commented Apr 9, 2024

@smanjara this was part of the same batch that passed qa with the fix from #56765

i scanned the three-zone teuthology.log and the failures all look like the normal checkpoint failures

cbodley merged commit 17c90ba into ceph:main on Apr 10, 2024
Contributor Author

cbodley commented Apr 10, 2024

this doesn't have a tracker, so i'll include its squid backport with #56765

cbodley deleted the wip-rgw-default-http-error branch on April 10, 2024 14:26