rgw/s3select : fix for error flow. add an option to disable s3select-request.#56834
rgw/s3select : fix for error flow. add an option to disable s3select-request.#56834
Conversation
e4ffbf0 to
ce23c37
Compare
|
https://pulpito.ceph.com/gsalomon-2024-04-19_04:12:34-rgw:verify-fix_error_message_flow-distro-default-smithi/ |
5bfac0a to
ce55750
Compare
| s3tests: | ||
| force-branch: ceph-master | ||
| force-branch: fix_non_handled_error_resonse | ||
| git_remote: https://github.com/galsalomon66/ |
There was a problem hiding this comment.
these lines already removed (back to ceph-master)
| services: | ||
| - rgw | ||
| with_legacy: true | ||
| - name: rgw_s3select_disable |
There was a problem hiding this comment.
suggest rgw_disable_s3select
| - name: rgw_s3select_disable | ||
| type: bool | ||
| level: advanced | ||
| desc: disable the s3select operation |
There was a problem hiding this comment.
describe what will happen when disabled, e.g., will we return an error code such as invalid request?
| void aws_response_handler::send_error_response_rgw_formatter(const char* error_code, | ||
| const char* error_message, | ||
| const char* resource_id) | ||
| std::string empty_error="--"; |
There was a problem hiding this comment.
the diff is maybe confusing me, but is this a global variable? I think you can use a static constexpr std::string for this
There was a problem hiding this comment.
I missed that.
will fix that to static/constexpr
|
bbc9a49 to
91a60c7
Compare
adding @mattbenjamin the "kill switch" could be "too dramatic" |
in some cases the error message does not return to client, connection got broken (invalid chunk length) fix another broken connection all data-source to use same API for sending error-response add the option rgw_s3select_disable(boolean). upon turning-on this option, it rejects s3select-requests with an error-message editorial. rollback to ceph-master. the ceph/s3-tests#561 must be merged with ceph-PR Signed-off-by: Gal Salomon <gal.salomon@gmail.com>
91a60c7 to
53ad57c
Compare
|
i very much agree with @yuvalif's sentiments above admins need a way to control rgw's total memory usage so it doesn't keep crashing with OOM. rgw will process up to rgw_max_concurrent_requests (1024) at a time. each Get/PutObject request will buffer up to rgw_get_obj_window_size/rgw_put_obj_min_window_size (16M) of data. so the admin can tune these knobs to guarantee that a server saturated with requests will still fit in available memory. we need a way to fit SELECT requests into this model too so how is it that a single SELECT request can kill the server with OOM? what data is it that's not bounded in size? is it state from the parquet parser? is it buffered results that we haven't written back to the client yet? that said, users have requested a way to disable s3select (like in https://tracker.ceph.com/issues/61821) so i'm glad to see this. i just think there's more work to do here |
|
yes, row groups divide the object, enabling skipping this basic building block. this 12GB object contains 6 groups, it is definitely not efficient and also caused the OOM. the solution should be by fetching batches(pre-defined size) of that row group whatever its size, and not all in one batch, as currently done. thus, filtering out big-size Parquet objects is not a solution. on a side note, in the case of Parquet, the SQL engine(actually, the Parquet reader) issues range-requests according to meta-data, the combination of the SQL statement and data-profile could sometimes lead to behavior that was not anticipated in advance. |
rgw/s3select : fix for error flow. add an option to disable s3select-request. Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
-- add the option to reject s3select-requests
ceph config set client.rgw.8000 rgw_s3select_disable true-- in some cases the error message does not return to client, connection got broken (invalid chunk length)
https://bugzilla.redhat.com/show_bug.cgi?id=2252396
https://bugzilla.redhat.com/show_bug.cgi?id=2275323
ceph/s3-tests#561 NOTE: the s3-tests PR must merged with this PR
https://tracker.ceph.com/issues/65828
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an
xbetween the brackets:[x]. Spaces and capitalization matter when checking off items this way.Checklist
Show available Jenkins commands
jenkins retest this pleasejenkins test classic perfjenkins test crimson perfjenkins test signedjenkins test make checkjenkins test make check arm64jenkins test submodulesjenkins test dashboardjenkins test dashboard cephadmjenkins test apijenkins test docsjenkins render docsjenkins test ceph-volume alljenkins test ceph-volume toxjenkins test windowsjenkins test rook e2e