rgw: cumulatively fix 6 AWS SigV4 request failure cases by mattbenjamin · Pull Request #54856 · ceph/ceph

mattbenjamin · 2023-12-11T01:57:23Z

These changes address checksum header identification and signing
algorithm selection, including checksum trailer verification
for signed- and unsigned-payload cases.

These changes address all the actual S3 request failures I have
so far been able to reproduce, with and without content checksums
and/or new trailing checksum headers, and with and without
SSL.

Fixes: https://tracker.ceph.com/issues/63153

Specifically, it fixes the request failures that motivated the
initial tracker filing. It extracts but does not validate new client content
checksums if present. Validation and management of new
S3 content-checksum headers will follow in a subsequent change.

squashed commits:

wip chunk meta parsing--seem to have first AWSv4ComplMulti::ChunkMeta::create_next sort of parsing
use constexpr sarlen(...) for static array lengths throughout rgw_auth_s3.cc
link AWSv4ComplMulti::ChunkMeta to its enclosing completer
capture original content-length header before AWSv4ComplMulti overwrites it
mostly extract the trailer
fix misordered content-length, experiment w/exbuf
save leftover bytes between calls to AWSv4ComplMulti::recv_chunk()
propagate data_offset_in_stream from AWSv4ComplMulti::recv_chunk()
clean up trailer section extract
trailer section cleanup and introduce extract_helper
unrolled checksum extract--fixup
integrate matching with types from rgw_cksum.h
fix sv_trailer end pos, and cleanup
add proplist interface to rgw::auth::Completer and AWSv4ComplMulti
spliterate trailers
check completer props
redefine prop_map to point into already-allocated trailer_vec
hax: thread a counter onto AWSv4ComplMulti recv_body() and recv_chunk path
fix apparent bug where reads shorter than chunk_size induce a final, zero-length read that was skipped before it could force recognition of the last chunk in the stream
check only for a trailing checksum named in x-amz-trailer
don't try to match signatures when no signature provided (because streaming unsigned)
oops, fix content_length decl
fix recognition of next chunk envelope in unsigned aws-chunk case
clean up AWSv4ComplMulti flags and correctly detect aws unsigned chunked
rework checksum-trailer extraction and introduce AWSv4ComplMulti::calc_v4_trailing_signature
thread const struct req_state* into AWSv4ComplMulti
large cleanup of trailer parsing, no regression
fix trailer signature calculation--checks
correctly generate final chunk hmac
typo in comment
verify trailing signature when expected (using expected final chunk signature)
move trailer_vec back onto recv_body()'s stack
remove strange completer comment
remove last_frag (now points into parsing_buf)
remove implied dependency on content_length
move trailer recognition to AWSv4ComplMulti::complete()
remove now-unused is_last_chunk() predicate
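
Several of the squashed commits above concern the aws-chunked envelope (recognizing a chunk header with or without a chunk-signature) and splitting the trailer section into name/value pairs. A minimal C++ sketch of those two parsing steps follows. It assumes CRLF line endings, and the names ChunkHeader, parse_chunk_header and split_trailers are hypothetical; this is not the AWSv4ComplMulti code from this PR.

```cpp
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>
#include <utility>
#include <vector>

// Hypothetical type for illustration; not the RGW ChunkMeta class.
struct ChunkHeader {
  std::size_t size = 0;        // payload bytes that follow the header line
  std::string_view signature;  // empty for unsigned aws-chunked streams
};

// Parse one chunk header line (without its trailing CRLF), e.g.
// "400;chunk-signature=abc" (signed) or "400" (unsigned).
inline std::optional<ChunkHeader> parse_chunk_header(std::string_view line) {
  constexpr std::string_view sig_key = ";chunk-signature=";
  ChunkHeader hdr;
  const auto semi = line.find(';');
  const std::string_view size_part = line.substr(0, semi);
  if (size_part.empty()) return std::nullopt;
  const char* first = size_part.data();
  const char* last = first + size_part.size();
  // The chunk size is hexadecimal, with no "0x" prefix.
  auto [ptr, ec] = std::from_chars(first, last, hdr.size, 16);
  if (ec != std::errc{} || ptr != last) return std::nullopt;
  if (semi != std::string_view::npos) {
    const std::string_view ext = line.substr(semi);
    if (ext.substr(0, sig_key.size()) != sig_key) return std::nullopt;
    hdr.signature = ext.substr(sig_key.size());
    if (hdr.signature.empty()) return std::nullopt;
  }
  return hdr;
}

// Split a trailer section such as
// "x-amz-checksum-crc32:AAAAAA==\r\nx-amz-trailer-signature:abcd\r\n\r\n"
// into (name, value) views that point into the caller's buffer.
inline std::vector<std::pair<std::string_view, std::string_view>>
split_trailers(std::string_view section) {
  std::vector<std::pair<std::string_view, std::string_view>> out;
  while (!section.empty()) {
    const auto eol = section.find("\r\n");
    const std::string_view line = section.substr(0, eol);
    section = (eol == std::string_view::npos) ? std::string_view{}
                                              : section.substr(eol + 2);
    if (line.empty()) break;  // a blank line ends the trailer section
    const auto colon = line.find(':');
    if (colon == std::string_view::npos) continue;  // skip malformed line
    out.emplace_back(line.substr(0, colon), line.substr(colon + 1));
  }
  return out;
}
```

An empty signature field is how this sketch tells the unsigned streaming case apart, which matches the commit note about not matching signatures when none is provided.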

cbodley

looking good, the parsing changes make sense

mattbenjamin · 2023-12-30T23:11:51Z

@cbodley this should be ready for re-review
junit test suite is here: git@github.com:linuxbox2/jcksum.git (52 tests)

mattbenjamin · 2023-12-31T16:31:43Z

jenkins test windows

mattbenjamin · 2023-12-31T16:39:37Z

something broke the windows build mid-morning. it's complaining about an invalid boost submodule hash. this PR doesn't touch boost.

cbodley · 2023-12-31T01:01:49Z

src/rgw/rgw_auth_s3.h

+  inline void put_prop(const std::string_view k, const std::string_view v) {
+    /* assume the caller will mangle the key name, if required */
+    auto& map = const_cast<env_map_t&>(s->info.env->get_map());


i tend to view const_cast as a code smell, since it's usually just a shortcut to avoid propagating const correctly. it's necessary here because a) we only get access to const req_state* in the constructor, and b) RGWEnv only exposes a const-qualified get_map()

since put_prop() is only called during AWSv4ComplMulti::complete() now, could we extend the Completer interface to take a non-const RGWEnv& env there so we can call env.set(k, v) instead?

that way we preserve the original intent for the Completer to have const-only access to the input req_state, while the complete(RGWEnv& env) signature hopefully makes it clear that it may add trailing headers

mattbenjamin · 2023-12-31

I think the current dance around hiding things from Completer is kind of daft, to be honest. I'd prefer not to make further backflips here. Notice that the code I submitted does not widen Completer at all, only AWSv4ComplMulti, which has a reason to modify the req_state.

(at the same time, there's clearly no good justification for unwinding the haskellish style of this entire subsystem--that said, what I'm trying to say is, yes, there's a clear inconsistency--but having a special call to modify the req_state is the code smell, imo.)
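
For readers following the exchange above, here is a rough sketch of the alternative cbodley describes: the completer collects trailing headers and writes them through a mutable environment passed to complete(), so no const_cast is needed. Env and TrailerCompleter are simplified stand-ins invented for illustration; the real RGWEnv, req_state and rgw::auth::Completer interfaces differ, and this is not what the PR merged.

```cpp
#include <functional>
#include <map>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Simplified stand-in for RGWEnv (assumption: only get_map() and set()).
class Env {
 public:
  using env_map_t = std::map<std::string, std::string, std::less<>>;

  // Read-only access, mirroring the const-qualified get_map() in the review.
  const env_map_t& get_map() const { return env_map; }

  // Mutating access; the key is assumed to be mangled by the caller.
  void set(std::string_view k, std::string_view v) {
    env_map.insert_or_assign(std::string(k), std::string(v));
  }

 private:
  env_map_t env_map;
};

// Hypothetical completer that records trailers while the body streams in
// and only publishes them when complete() is called.
class TrailerCompleter {
 public:
  void add_trailer(std::string k, std::string v) {
    trailers.emplace_back(std::move(k), std::move(v));
  }

  // The non-const Env& in the signature signals that completion may add
  // trailing headers to the request environment.
  bool complete(Env& env) {
    for (const auto& [k, v] : trailers) {
      env.set(k, v);
    }
    return true;
  }

 private:
  std::vector<std::pair<std::string, std::string>> trailers;
};
```

The trade-off discussed in the thread is visible here: the mutation is explicit in the interface, at the cost of widening complete() for every implementation rather than only the one that needs it.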

mattbenjamin · 2023-12-31T22:57:48Z

I added logic to throw an exception if we would exceed trailer_buf_size-1 bytes of trailer, using ERR_LIMIT_EXCEEDED as the error code
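
A minimal sketch of the kind of bound described above: appends fail once the trailer would exceed trailer_buf_size-1 bytes. The buffer size, the TrailerError type and the kErrLimitExceeded placeholder are assumptions made for illustration; the real constant and the ERR_LIMIT_EXCEEDED value are defined in RGW and are not reproduced here.

```cpp
#include <array>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string_view>

// Hypothetical values; not the ones used in rgw_auth_s3.
constexpr std::size_t trailer_buf_size = 256;
constexpr int kErrLimitExceeded = -1;  // placeholder for ERR_LIMIT_EXCEEDED

// Hypothetical exception type carrying an error code.
struct TrailerError : std::runtime_error {
  int code;
  explicit TrailerError(int c)
      : std::runtime_error("trailer section too large"), code(c) {}
};

class TrailerBuf {
 public:
  // Append bytes, always keeping one byte free for a terminating NUL.
  // Throws before copying anything if the limit would be exceeded.
  void append(std::string_view bytes) {
    const std::size_t room = (trailer_buf_size - 1) - len;
    if (bytes.size() > room) {
      throw TrailerError(kErrLimitExceeded);
    }
    if (!bytes.empty()) {
      std::memcpy(buf.data() + len, bytes.data(), bytes.size());
      len += bytes.size();
    }
    buf[len] = '\0';
  }

  std::string_view view() const { return {buf.data(), len}; }

 private:
  std::array<char, trailer_buf_size> buf{};
  std::size_t len = 0;
};
```

Checking the remaining room before the copy means a rejected append leaves the buffer unchanged, so the caller can map the exception to an error response without partial state.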

mattbenjamin · 2024-01-04T13:01:17Z

jenkins test windows

github-actions · 2024-02-11T17:53:40Z

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

cbodley · 2024-02-11T18:49:36Z

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

apologies, i had forgotten about these conflicts when i merged #55250

These changes address checksum header identification and signing algorithm selection, including checksum trailer verification for signed- and unsigned-payload cases.

These changes address all the actual S3 request failures I have so far been able to reproduce, with and without content checksums and/or new trailing checksum headers, and with and without SSL.

Fixes: https://tracker.ceph.com/issues/63153

Specifically, it fixes the request failures that motivated the initial tracker filing. It extracts but does not validate new client content checksums if present. Validation and management of new S3 content-checksum headers will follow in a subsequent change.

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>

squashed commits:

* wip chunk meta parsing--seem to have first AWSv4ComplMulti::ChunkMeta::create_next sort of parsing
* use constexpr sarlen(...) for static array lengths throughout rgw_auth_s3.cc
* link AWSv4ComplMulti::ChunkMeta to its enclosing completer
* capture original content-length header before AWSv4ComplMulti overwrites it
* mostly extract the trailer
* fix misordered content-length, experiment w/exbuf
* save leftover bytes between calls to AWSv4ComplMulti::recv_chunk()
* propagate data_offset_in_stream from AWSv4ComplMulti::recv_chunk()
* clean up trailer section extract
* trailer section cleanup and introduce extract_helper
* unrolled checksum extract--fixup
* fix sv_trailer end pos, and cleanup
* add proplist interface to rgw::auth::Completer and AWSv4ComplMulti
* spliterate trailers
* check completer props
* redefine prop_map to point into already-allocated trailer_vec
* hax: thread a counter onto AWSv4ComplMulti recv_body() and recv_chunk path
* fix apparent bug where reads shorter than chunk_size induce a final, zero-length read that was skipped before it could force recognition of the last chunk in the stream
* check only for a trailing checksum named in x-amz-trailer
* don't try to match signatures when no signature provided (because streaming unsigned)
* oops, fix content_length decl
* fix recognition of next chunk envelope in unsigned aws-chunk case
* clean up AWSv4ComplMulti flags and correctly detect aws unsigned chunked
* rework checksum-trailer extraction and introduce AWSv4ComplMulti::calc_v4_trailing_signature
* thread const struct req_state* into AWSv4ComplMulti
* large cleanup of trailer parsing, no regression
* fix trailer signature calculation--checks
* correctly generate final chunk hmac
* typo in comment
* verify trailing signature when expected (using expected final chunk signature)
* move trailer_vec back onto recv_body()'s stack
* remove strange completer comment
* remove last_frag (now points into parsing_buf)
* remove implied dependency on content_length
* move trailer recognition to AWSv4ComplMulti::complete()
* remove now-unused is_last_chunk() predicate
* remove unused ChunkMeta::completer
* responses to review comments
* when trailer sig is expected, fail (only) if none present or if it does not match calculated
* remove stale parse_content_length(...) decl
* remove now-unused AWSv4ComplMulti::content_length
* fix extract_helper end search position as in mut_extract_helper
* change "\n" reserve term in get_canon_amz_hdrs() part of the sum (review) and initialize length to 0
* remove debugging code

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>

This can be run by hand with:

./mvnw clean package
./mvnw test -Dtest=PutObjects

The following properties are sourced from the environment:

AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
RGW_HTTP_ENDPOINT_URL
RGW_HTTPS_ENDPOINT_URL

Then adds:

qa/workunits/rgw: add test driver script for maven suite

Launch it from cls.yaml, as with test_librgw_file.h.

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>

The junit5 suite in fact selects transport security (SSL) strictly from the endpoint URL. The test_awssdkv4_sig.sh (or its caller?) only needs to export RGW_HTTP_ENDPOINT_URL appropriately to get one or the other.

Fix several mistakes in refactoring caught by Ali Maredia.

Print AccessKey, SecretKey and EndpointURL on startup.

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>

cbodley

looks great if it's passing tests

cbodley · 2024-02-15T13:51:39Z

The following tests FAILED:
235 - unittest_posix_bucket_cache (Failed)

cbodley · 2024-02-15T13:51:43Z

jenkins test make check

cbodley · 2024-02-15T15:46:16Z

The following tests FAILED:
34 - run-rbd-unit-tests-127.sh (Timeout)
198 - unittest_osdmap (Subprocess aborted)

cbodley · 2024-02-15T15:46:21Z

jenkins test make check

cbodley · 2024-02-16T13:52:13Z

passed qa in https://pulpito.ceph.com/cbodley-2024-02-15_03:24:02-rgw-wip-accept-new-awssigv4-distro-default-smithi/

verified that the java tests are running in teuthology.log:

Tests run: 33, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.619 s -- in io.ceph.jcksum.PutObjects

WGH- · 2024-07-09T12:05:11Z

For cross-reference, this likely fixes an issue in some Go S3 client: minio/minio-go#1235. That one is about 0-byte chunk-signature for empty object.
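
As background for the 0-byte case mentioned above: in AWS's published SigV4 streaming scheme, each chunk signature is an HMAC over a string-to-sign that includes the previous signature and the SHA-256 of the chunk data, so even an empty chunk has a well-defined signature. The sketch below builds only that string-to-sign for an empty chunk; the HMAC step and key derivation are omitted, the function name is hypothetical, and none of this is taken from the RGW or minio-go sources.

```cpp
#include <string>
#include <string_view>

// SHA-256 of the empty string, hex-encoded (a well-known constant).
constexpr std::string_view kEmptySha256 =
    "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855";

// Build the per-chunk string-to-sign as laid out in the AWS SigV4
// streaming documentation:
//   "AWS4-HMAC-SHA256-PAYLOAD" \n timestamp \n scope \n
//   previous-signature \n hash("") \n hash(chunk-data)
// chunk_sha256_hex is the hex SHA-256 of the chunk payload; for a
// zero-length chunk that is kEmptySha256 again.
inline std::string chunk_string_to_sign(std::string_view timestamp,
                                        std::string_view scope,
                                        std::string_view prev_signature,
                                        std::string_view chunk_sha256_hex) {
  std::string sts = "AWS4-HMAC-SHA256-PAYLOAD\n";
  sts += timestamp;
  sts += '\n';
  sts += scope;
  sts += '\n';
  sts += prev_signature;
  sts += '\n';
  sts += kEmptySha256;
  sts += '\n';
  sts += chunk_sha256_hex;
  return sts;
}
```

For an empty object the only chunk is the terminating zero-length one, so prev_signature would be the seed signature from the request's Authorization header and both hash fields carry the empty-string digest.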

mattbenjamin added bug-fix rgw labels Dec 11, 2023

mattbenjamin self-assigned this Dec 11, 2023

mattbenjamin requested a review from a team as a code owner December 11, 2023 01:57

mattbenjamin requested a review from mmgaggle December 11, 2023 01:57

mattbenjamin added the DNM label Dec 11, 2023

cbodley reviewed Dec 12, 2023

mattbenjamin force-pushed the wip-accept-new-awssigv4 branch from a3b1509 to 987a64c December 21, 2023 20:51

mattbenjamin force-pushed the wip-accept-new-awssigv4 branch 2 times, most recently from f42fe45 to fd143ea December 30, 2023 23:10

mattbenjamin removed the DNM label Dec 30, 2023

mattbenjamin force-pushed the wip-accept-new-awssigv4 branch from fd143ea to f838970 December 30, 2023 23:36

mattbenjamin requested review from TejasC88 and cbodley December 30, 2023 23:36

mattbenjamin force-pushed the wip-accept-new-awssigv4 branch 2 times, most recently from 7433507 to ad18207 December 31, 2023 15:31

mattbenjamin force-pushed the wip-accept-new-awssigv4 branch from ad18207 to b35b925 December 31, 2023 16:50

cbodley reviewed Dec 31, 2023

mattbenjamin force-pushed the wip-accept-new-awssigv4 branch from b35b925 to 2c0539c December 31, 2023 20:16

mattbenjamin mentioned this pull request Jan 1, 2024

rgw: add s3 checksum crc32 and sha1 #49986

Closed

14 tasks

cbodley reviewed Jan 2, 2024

mattbenjamin force-pushed the wip-accept-new-awssigv4 branch from 2c0539c to abc0f2c January 4, 2024 13:34

github-actions bot added documentation tests labels Jan 5, 2024

github-actions bot added the needs-rebase label Feb 11, 2024

mattbenjamin added 5 commits February 13, 2024 13:28

awssigv4: fix signing of 0-byte object

dc30637

1. correctly match signature of 0-length chunk
2. initialize lf_bytes

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>

awssigv4: fix recognition of trailer boundary when no trailing signature

d374a21

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>

cbodley force-pushed the wip-accept-new-awssigv4 branch from f5b053c to 6f1fde8 February 13, 2024 18:49

github-actions bot removed the needs-rebase label Feb 13, 2024

cbodley mentioned this pull request Feb 14, 2024

rgw: implement S3 additional checksum support #55076

Merged

14 tasks

mattbenjamin and others added 3 commits February 14, 2024 12:02

rgw_sigv4: catch exceptions on client builder

88ab55e

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>

qa: enable test_awssdkv4_sig workunit to run in teuthology

639ea51

Signed-off-by: Ali Maredia <amaredia@redhat.com>

qa: move cert import into rgw task

ccbb1f5

Signed-off-by: Ali Maredia <amaredia@redhat.com>

mattbenjamin force-pushed the wip-accept-new-awssigv4 branch from 6f1fde8 to ccbb1f5 February 14, 2024 21:07

cbodley added needs-qa wip-cbodley-testing labels Feb 14, 2024

cbodley approved these changes Feb 14, 2024

yuriw added the wip-yuri2-testing label Feb 15, 2024

cbodley merged commit bf3a294 into ceph:main Feb 16, 2024

frittentheke mentioned this pull request Mar 7, 2024

S3 backend on nonAWS implementation (OCI) not working in 1.6.3 hashicorp/terraform#34053

Closed

cbodley mentioned this pull request Jul 4, 2024

reef: rgw: cumulatively fix 6 AWS SigV4 request failure cases #58435

Merged

This was referenced Sep 3, 2024

rgw: improve buffer list utilization in the chunkupload scenario #53266

Merged

rgw: fix AWS SigV4 request failure when uploading empty file #59574

Closed
