Bug #64832
valgrind UninitCondition in RGWSelectObj_ObjStore_S3::run_s3select_on_csv
Description
from rgw/verify job: http://qa-proxy.ceph.com/teuthology/cbodley-2024-03-10_14:49:11-rgw-wip-cbodley-testing-distro-default-smithi/7589454/teuthology.log
valgrind log: http://qa-proxy.ceph.com/teuthology/cbodley-2024-03-10_14:49:11-rgw-wip-cbodley-testing-distro-default-smithi/7589454/remote/smithi154/log/valgrind/ceph.client.0.log.gz
<error>
<unique>0xacb11</unique>
<tid>298</tid>
<threadname>io_context_pool</threadname>
<kind>UninitCondition</kind>
<what>Conditional jump or move depends on uninitialised value(s)</what>
<stack>
<frame>
<ip>0x8F4C10</ip>
<obj>/usr/bin/radosgw</obj>
<fn>RGWSelectObj_ObjStore_S3::run_s3select_on_csv(char const*, char const*, unsigned long)</fn>
</frame>
<frame>
<ip>0x8F76E1</ip>
<obj>/usr/bin/radosgw</obj>
<fn>RGWSelectObj_ObjStore_S3::csv_processing(ceph::buffer::v15_2_0::list&, long, long)</fn>
</frame>
<frame>
<ip>0x7B3BD6</ip>
<obj>/usr/bin/radosgw</obj>
<fn>RGWGetObj_Decompress::handle_data(ceph::buffer::v15_2_0::list&, long, long)</fn>
</frame>
<frame>
<ip>0x9D47E7</ip>
<obj>/usr/bin/radosgw</obj>
<fn>get_obj_data::flush(rgw::OwningList<rgw::AioResultEntry>&&)</fn>
</frame>
<frame>
<ip>0x9E552A</ip>
<obj>/usr/bin/radosgw</obj>
<fn>RGWRados::Object::Read::iterate(DoutPrefixProvider const*, long, long, RGWGetDataCB*, optional_yield)</fn>
</frame>
<frame>
<ip>0x7FD455</ip>
<obj>/usr/bin/radosgw</obj>
<fn>RGWGetObj::execute(optional_yield)</fn>
</frame>
<frame>
<ip>0x8F9E20</ip>
<obj>/usr/bin/radosgw</obj>
<fn>RGWSelectObj_ObjStore_S3::execute(optional_yield)</fn>
</frame>
<frame>
<ip>0x6BC4CB</ip>
<obj>/usr/bin/radosgw</obj>
<fn>rgw_process_authenticated(RGWHandler_REST*, RGWOp*&, RGWRequest*, req_state*, optional_yield, rgw::sal::Driver*, bool)</fn>
</frame>
<frame>
<ip>0x6C054D</ip>
<obj>/usr/bin/radosgw</obj>
<fn>process_request(RGWProcessEnv const&, RGWRequest*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, RGWRestfulIO*, optional_yield, rgw::dmclock::Scheduler*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*, int*)</fn>
</frame>
<frame>
<ip>0x5E60F0</ip>
<obj>/usr/bin/radosgw</obj>
</frame>
<frame>
<ip>0x62757F</ip>
<obj>/usr/bin/radosgw</obj>
</frame>
<frame>
<ip>0x12591A6</ip>
<obj>/usr/bin/radosgw</obj>
<fn>make_fcontext</fn>
</frame>
</stack>
</error>
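The record above can be reduced to its error kind plus the first few named frames, which is the short form quoted in later comments. Below is a minimal sketch of that reduction. The helper name `summarize_valgrind_error` and the three-frame cutoff are assumptions made for illustration, not teuthology's actual implementation. It also assumes the raw XML from the valgrind log, where `&` and `<` inside function signatures are still entity-escaped.

```python
import xml.etree.ElementTree as ET


def summarize_valgrind_error(error_xml: str, max_frames: int = 3) -> str:
    """Return the error kind followed by the first named frames, one per line.

    Hypothetical helper: expects a single well-formed <error> element.
    """
    error = ET.fromstring(error_xml)
    kind = error.findtext("kind", default="unknown")
    # Frames without an <fn> child (unsymbolized addresses) are skipped.
    functions = [
        frame.findtext("fn")
        for frame in error.iter("frame")
        if frame.find("fn") is not None
    ]
    return "\n".join([f"valgrind error: {kind}"] + functions[:max_frames])
```

Skipping frames that lack `<fn>` keeps the summary stable across builds in which only the addresses of unsymbolized frames differ.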
Updated by Casey Bodley almost 2 years ago
- Status changed from New to Can't reproduce
Updated by Casey Bodley about 1 year ago
- Status changed from Can't reproduce to New
valgrind error: UninitCondition
RGWSelectObj_ObjStore_S3::run_s3select_on_csv(char const*, char const*, unsigned long)
RGWSelectObj_ObjStore_S3::csv_processing(ceph::buffer::v15_2_0::list&, long, long)
RGWGetObj_Decompress::handle_data(ceph::buffer::v15_2_0::list&, long, long)
Updated by J. Eric Ivancich about 1 year ago
Just wanted to make sure you were aware of the status change of this one, Gal.
Updated by Gal Salomon about 1 year ago · Edited
J. Eric Ivancich wrote in #note-5:
ping @Gal Salomon
Hi @J. Eric Ivancich
I tried to reproduce that, but I'm not getting that stack.
Do you have an updated valgrind log?
Updated by Casey Bodley about 1 year ago
Gal Salomon wrote in #note-6:
J. Eric Ivancich wrote in #note-5:
ping @Gal Salomon
Hi @J. Eric Ivancich
I tried to reproduce that, but I'm not getting that stack.
Do you have an updated valgrind log?
Updated by Casey Bodley about 1 year ago
happened again in https://qa-proxy.ceph.com/teuthology/cbodley-2025-02-25_20:06:49-rgw-wip-rgw-createbucket-layout-distro-default-smithi/8152202/teuthology.log
valgrind log: https://qa-proxy.ceph.com/teuthology/cbodley-2025-02-25_20:06:49-rgw-wip-rgw-createbucket-layout-distro-default-smithi/8152202/remote/smithi072/log/valgrind/ceph.client.0.log.gz
Updated by J. Eric Ivancich about 1 year ago
ping @Gal Salomon Just want to make sure you saw Casey's new comment above.
Updated by Gal Salomon about 1 year ago
@J. Eric Ivancich
Yeah, I saw that (it's gibberish).
From teuthology.log I can see it's the same stack as before.
Thanks.
Updated by Casey Bodley about 1 year ago
just saw this in 2 jobs of https://pulpito.ceph.com/cbodley-2025-02-28_00:57:03-rgw-wip-70013-distro-default-smithi/
Gal Salomon wrote in #note-10:
Yeah, I saw that (it's gibberish).
the web server is misconfigured somehow, so most logs get double-gzipped
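A double-gzipped log can be unwrapped by decompressing repeatedly until the payload no longer starts with the gzip magic bytes. This is a minimal sketch. The function name `gunzip_all` and the `max_layers` cap are assumptions for illustration, not part of any Ceph tooling.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def gunzip_all(data: bytes, max_layers: int = 5) -> bytes:
    """Strip gzip layers until the data no longer looks like gzip."""
    for _ in range(max_layers):
        if not data.startswith(GZIP_MAGIC):
            break
        data = gzip.decompress(data)
    return data
```

For example, reading a downloaded `ceph.client.0.log.gz` in binary mode and passing the bytes through `gunzip_all` should yield the plain log whether it was compressed once or twice.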
Updated by Gal Salomon 12 months ago
I did the following:
-- the conditional jump happens in the decompress flow
-- I booted RGW with compression enabled (4 different options: zstd, snappy, lz4, zlib)
-- I uploaded big objects, small objects, and a zero-size object
-- I ran s3select expressions on these objects
-- so far, no reproduction of anything close to that
It may be related to the runtime environment or the build environment(?).
Investigating further.
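The upload-and-query steps above could be scripted roughly as follows. This is only a sketch under stated assumptions: `client` is assumed to be a boto3 S3 client pointed at the RGW endpoint, and the object names, row counts, and query are hypothetical. RGW compression is configured on the server side, so the request declares the input as uncompressed.

```python
import csv
import io

# Hypothetical row counts standing in for zero-size, small, and big objects.
OBJECT_ROWS = {"empty.csv": 0, "small.csv": 10, "big.csv": 100_000}


def make_csv(rows: int) -> bytes:
    """Build a three-column CSV payload with the given number of rows."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for i in range(rows):
        writer.writerow([i, f"name{i}", i * 2])
    return buf.getvalue().encode()


def run_select(client, bucket: str, key: str, expression: str) -> bytes:
    """Run an s3select query and concatenate the returned record chunks."""
    response = client.select_object_content(
        Bucket=bucket,
        Key=key,
        ExpressionType="SQL",
        Expression=expression,
        # Server-side compression is transparent to the client.
        InputSerialization={"CSV": {"FileHeaderInfo": "NONE"}, "CompressionType": "NONE"},
        OutputSerialization={"CSV": {}},
    )
    chunks = []
    for event in response["Payload"]:
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    return b"".join(chunks)


def exercise(client, bucket: str, expression: str = "select count(*) from s3object;"):
    """Upload each test object, query it, and return the results by key."""
    results = {}
    for key, rows in OBJECT_ROWS.items():
        client.put_object(Bucket=bucket, Key=key, Body=make_csv(rows))
        results[key] = run_select(client, bucket, key, expression)
    return results
```

Running `exercise` once per compression setting, with radosgw started under valgrind, would cover the matrix described above.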
Updated by Gal Salomon 11 months ago
To investigate the valgrind detection, code that simulates an uninitialized value was added to rgw/s3select.
It is not clear why it was not detected in the teuthology run (it is detected in a local run).
https://github.com/ceph/ceph/pull/62689/files
Checking the issue.
Updated by J. Eric Ivancich 10 months ago
- Backport changed from squid to squid tentacle
Updated by Casey Bodley 10 months ago
valgrind error: UninitCondition
RGWSelectObj_ObjStore_S3::run_s3select_on_csv(char const*, char const*, unsigned long)
RGWSelectObj_ObjStore_S3::csv_processing(ceph::buffer::v15_2_0::list&, long, long)
RGWGetObj_Decompress::handle_data(ceph::buffer::v15_2_0::list&, long, long)
Updated by J. Eric Ivancich 8 months ago
This was just reproduced on tentacle here:
Updated by Casey Bodley 8 months ago
also on tentacle: https://qa-proxy.ceph.com/teuthology/benhanokh-2025-07-01_16:00:03-rgw-dedup_backport-distro-default-smithi/8363950/teuthology.log
valgrind exception message: valgrind error: UninitCondition RGWSelectObj_ObjStore_S3::run_s3select_on_csv(char const*, char const*, unsigned long) RGWSelectObj_ObjStore_S3::csv_processing(ceph::buffer::v15_2_0::list&, long, long) RGWGetObj_Decompress::handle_data(ceph::buffer::v15_2_0::list&, long, long)
Updated by J. Eric Ivancich 8 months ago
Just hit this again:
failure_reason: 'valgrind error: UninitCondition RGWSelectObj_ObjStore_S3::run_s3select_on_csv(char const*, char const*, unsigned long) RGWSelectObj_ObjStore_S3::csv_processing(ceph::buffer::v15_2_0::list&, long, long) RGWGetObj_Decompress::handle_data(ceph::buffer::v15_2_0::list&, long, long)'
Updated by J. Eric Ivancich 8 months ago
In an email exchange with @Gal Salomon he wrote:
This issue is "spinning": it does not appear consistently, which makes it difficult to fix
(it appeared in some of my other PRs). Even when I tried to make it appear (I created an UninitCondition),
it was not detected. It may be related to compiler flag options, so I will try to reproduce that.
Updated by J. Eric Ivancich 5 months ago
Hey @Gal Salomon , are there any updates on this?
Updated by Gal Salomon 5 months ago
J. Eric Ivancich wrote in #note-20:
Hey @Gal Salomon , are there any updates on this?
Hi Eric,
Yes, I have updates. I created CentOS and Ubuntu versions, and ran s3tests on both distros.
Valgrind did not detect the same report.
I'm not sure why this issue does not reproduce. Maybe a runtime factor is missing, such as additional tests (which exist in teuthology and not in my tests).
I will change / add tests.
Updated by Gal Salomon 5 months ago
The attached valgrind log was produced by running valgrind on RGW while it ran the S3 tests (CentOS Stream). The log contains only "Conditional jump ..." reports; it catches errors in other stacks, not the s3select stack.
Updated by Gal Salomon 5 months ago
Gal Salomon wrote in #note-22:
The attached valgrind log was produced by running valgrind on RGW while it ran the S3 tests (CentOS Stream). The log contains only "Conditional jump ..." reports; it catches errors in other stacks, not the s3select stack.
Attaching an additional valgrind report, produced on Ubuntu.
Updated by J. Eric Ivancich 5 months ago
Hi @Gal Salomon : This run from October 13 shows valgrind errors in which s3select calls are on the call stack.
Take a look at the valgrind errors here:
https://qa-proxy.ceph.com/teuthology/teuthology-2025-10-10_20:40:24-rgw-main-distro-default-smithi/8545777/remote/smithi149/log/valgrind/
Here's the log file:
https://qa-proxy.ceph.com/teuthology/teuthology-2025-10-10_20:40:24-rgw-main-distro-default-smithi/8545777/teuthology.log
Here's the full run:
https://pulpito.ceph.com/teuthology-2025-10-10_20:40:24-rgw-main-distro-default-smithi/
Updated by J. Eric Ivancich 5 months ago
Here's another one:
valgrind errors:
https://qa-proxy.ceph.com/teuthology/teuthology-2025-10-10_20:40:24-rgw-main-distro-default-smithi/8545757/remote/smithi016/log/valgrind/
same run as the previous comment
Updated by Gal Salomon 5 months ago
J. Eric Ivancich wrote in #note-25:
Here's another one:
valgrind errors:
https://qa-proxy.ceph.com/teuthology/teuthology-2025-10-10_20:40:24-rgw-main-distro-default-smithi/8545757/remote/smithi016/log/valgrind/
same run as the previous comment
It contains new reports on the s3select stack.
Updated by Gal Salomon 4 months ago
@J. Eric Ivancich
The recent valgrind report (https://qa-proxy.ceph.com/teuthology/teuthology-2025-10-10_20:40:24-rgw-main-distro-default-smithi/8545757/remote/smithi016/log/valgrind/)
contains a lot of new reports, such as InvalidRead, and different s3select call stacks.
One example is the push_*::builder functions, which are called while parsing the SQL statement, not while processing the input.
The engine code has not changed for some time, and the same goes for the test code.
What is triggering these new reports? Is there a change in the runtime environment?
Updated by Casey Bodley 4 months ago
- Status changed from New to Fix Under Review
Updated by Upkeep Bot 4 months ago
- Status changed from Fix Under Review to Pending Backport
- Merge Commit set to a46507ac4b9fcfc5cee038da0129470fee0bd545
- Fixed In set to v20.3.0-4084-ga46507ac4b
- Upkeep Timestamp set to 2025-11-11T08:58:37+00:00
Updated by Upkeep Bot 4 months ago
- Copied to Backport #73786: squid: valgrind UninitCondition in RGWSelectObj_ObjStore_S3::run_s3select_on_csv added
Updated by Upkeep Bot 4 months ago
- Copied to Backport #73787: tentacle: valgrind UninitCondition in RGWSelectObj_ObjStore_S3::run_s3select_on_csv added