Skip to content

rgw/rgw_lua_utils: fix memory leak in luaL_error() formatting#63819

Merged
tchaikov merged 1 commit intoceph:mainfrom
tchaikov:wip-lua-fix-leak
Jun 10, 2025
Merged

rgw/rgw_lua_utils: fix memory leak in luaL_error() formatting#63819
tchaikov merged 1 commit intoceph:mainfrom
tchaikov:wip-lua-fix-leak

Conversation

@tchaikov
Copy link
Contributor

@tchaikov tchaikov commented Jun 9, 2025

Previously, error messages passed to luaL_error() were formatted using std::string concatenation. Since luaL_error() never returns (it throws a Lua exception via longjmp), the allocated std::string memory was leaked, as detected by AddressSanitizer:

Direct leak of 105 byte(s) in 1 object(s) allocated from:
    #0 0x7fc5f1921a2d in operator new(unsigned long) /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_new_delete.cpp:86
    #1 0x563bd89144c7 in std::__new_allocator<char>::allocate(unsigned long, void const*) /usr/include/c++/15.1.1/bits/new_allocator.h:151
    #2 0x563bd89144c7 in std::allocator<char>::allocate(unsigned long) /usr/include/c++/15.1.1/bits/allocator.h:203
    #3 0x563bd89144c7 in std::allocator_traits<std::allocator<char> >::allocate(std::allocator<char>&, unsigned long) /usr/include/c++/15.1.1/bits/alloc_traits.h:614
    #4 0x563bd89144c7 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_allocate(std::allocator<char>&, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.h:142
    #5 0x563bd89144c7 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_create(unsigned long&, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.tcc:164
    #6 0x563bd896ae1b in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned long, unsigned long, char const*, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.tcc:363
    #7 0x563bd896b256 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_append(char const*, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.tcc:455
    #8 0x563bd896b2bb in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::append(char const*) /usr/include/c++/15.1.1/bits/basic_string.h:1585
    #9 0x563bd943c2f2 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, char const*) /usr/include/c++/15.1.1/bits/basic_string.h:3977
    #10 0x563bd943c2f2 in rgw::lua::lua_state_guard::runtime_hook(lua_State*, lua_Debug*) /home/kefu/dev/ceph/src/rgw/rgw_lua_utils.cc:245
    #11 0x7fc5f139f8ef  (/usr/lib/liblua.so.5.4+0xe8ef) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
    #12 0x7fc5f139fbfe  (/usr/lib/liblua.so.5.4+0xebfe) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
    #13 0x7fc5f13b26fe  (/usr/lib/liblua.so.5.4+0x216fe) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
    #14 0x7fc5f139f581  (/usr/lib/liblua.so.5.4+0xe581) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
    #15 0x7fc5f139b735  (/usr/lib/liblua.so.5.4+0xa735) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
    #16 0x7fc5f139ba8f  (/usr/lib/liblua.so.5.4+0xaa8f) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
    #17 0x7fc5f139f696 in lua_pcallk (/usr/lib/liblua.so.5.4+0xe696) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
    #18 0x563bd8a925ef in rgw::lua::request::execute(rgw::sal::Driver*, RGWREST*, OpsLogSink*, req_state*, RGWOp*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /home/kefu/dev/ceph/src/rgw/rgw_lua_request.cc:824
    #19 0x563bd8952e3d in TestRGWLua_LuaRuntimeLimit_Test::TestBody() /home/kefu/dev/ceph/src/test/rgw/test_rgw_lua.cc:1628
    #20 0x563bd8a37817 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
    #21 0x563bd8a509b5 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (/home/kefu/dev/ceph/build/bin/unittest_rgw_lua+0x11199b5) (BuildId: b2628caba5290d882d25f7bea166f058b682bc85)`

This change replaces std::string formatting with stack-allocated buffer and std::to_chars() to eliminate the memory leak.

Note: We cannot format int64_t directly through luaL_error() because lua_pushfstring() does not support long long or int64_t format specifiers, even in Lua 5.4 (see https://www.lua.org/manual/5.4/manual.html#lua_pushfstring). Since libstdc++ uses int64_t for std::chrono::milliseconds::rep, we use std::to_chars() for safe, efficient conversion without heap allocation.

The maximum runtime limit was a configuration introduced 3e3cb15.
Fixes: https://tracker.ceph.com/issues/71595

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands

@tchaikov tchaikov requested a review from a team as a code owner June 9, 2025 12:02
@github-actions github-actions bot added the rgw label Jun 9, 2025
Previously, error messages passed to luaL_error() were formatted using
std::string concatenation. Since luaL_error() never returns (it throws
a Lua exception via longjmp), the allocated std::string memory was
leaked, as detected by AddressSanitizer:

```
Direct leak of 105 byte(s) in 1 object(s) allocated from:
    #0 0x7fc5f1921a2d in operator new(unsigned long) /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_new_delete.cpp:86
    #1 0x563bd89144c7 in std::__new_allocator<char>::allocate(unsigned long, void const*) /usr/include/c++/15.1.1/bits/new_allocator.h:151
    #2 0x563bd89144c7 in std::allocator<char>::allocate(unsigned long) /usr/include/c++/15.1.1/bits/allocator.h:203
    #3 0x563bd89144c7 in std::allocator_traits<std::allocator<char> >::allocate(std::allocator<char>&, unsigned long) /usr/include/c++/15.1.1/bits/alloc_traits.h:614
    #4 0x563bd89144c7 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_allocate(std::allocator<char>&, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.h:142
    #5 0x563bd89144c7 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_create(unsigned long&, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.tcc:164
    #6 0x563bd896ae1b in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned long, unsigned long, char const*, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.tcc:363
    #7 0x563bd896b256 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_append(char const*, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.tcc:455
    #8 0x563bd896b2bb in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::append(char const*) /usr/include/c++/15.1.1/bits/basic_string.h:1585
    #9 0x563bd943c2f2 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, char const*) /usr/include/c++/15.1.1/bits/basic_string.h:3977
    #10 0x563bd943c2f2 in rgw::lua::lua_state_guard::runtime_hook(lua_State*, lua_Debug*) /home/kefu/dev/ceph/src/rgw/rgw_lua_utils.cc:245
    #11 0x7fc5f139f8ef  (/usr/lib/liblua.so.5.4+0xe8ef) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
    #12 0x7fc5f139fbfe  (/usr/lib/liblua.so.5.4+0xebfe) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
    ceph#13 0x7fc5f13b26fe  (/usr/lib/liblua.so.5.4+0x216fe) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
    ceph#14 0x7fc5f139f581  (/usr/lib/liblua.so.5.4+0xe581) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
    ceph#15 0x7fc5f139b735  (/usr/lib/liblua.so.5.4+0xa735) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
    ceph#16 0x7fc5f139ba8f  (/usr/lib/liblua.so.5.4+0xaa8f) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
    ceph#17 0x7fc5f139f696 in lua_pcallk (/usr/lib/liblua.so.5.4+0xe696) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
    ceph#18 0x563bd8a925ef in rgw::lua::request::execute(rgw::sal::Driver*, RGWREST*, OpsLogSink*, req_state*, RGWOp*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /home/kefu/dev/ceph/src/rgw/rgw_lua_request.cc:824
    ceph#19 0x563bd8952e3d in TestRGWLua_LuaRuntimeLimit_Test::TestBody() /home/kefu/dev/ceph/src/test/rgw/test_rgw_lua.cc:1628
    ceph#20 0x563bd8a37817 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
    ceph#21 0x563bd8a509b5 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (/home/kefu/dev/ceph/build/bin/unittest_rgw_lua+0x11199b5) (BuildId: b2628caba5290d882d25f7bea166f058b682bc85)`
```

This change replaces std::string formatting with stack-allocated buffer
and std::to_chars() to eliminate the memory leak.

Note: We cannot format int64_t directly through luaL_error() because
lua_pushfstring() does not support long long or int64_t format specifiers,
even in Lua 5.4 (see https://www.lua.org/manual/5.4/manual.html#lua_pushfstring).
Since libstdc++ uses int64_t for std::chrono::milliseconds::rep, we use
std::to_chars() for safe, efficient conversion without heap allocation.

The maximum runtime limit was a configuration introduced by 3e3cb15.

Fixes: https://tracker.ceph.com/issues/71595
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
@tchaikov tchaikov force-pushed the wip-lua-fix-leak branch from cf3ed6d to d8adc49 Compare June 9, 2025 12:06
@tchaikov tchaikov requested a review from yuvalif June 9, 2025 12:06
@tchaikov
Copy link
Contributor Author

tchaikov commented Jun 9, 2025

cc @oshrey16

hi @yuvalif , hi Yuval, could you please help review this change?

@tchaikov
Copy link
Contributor Author

tchaikov commented Jun 9, 2025

jenkins test api

std::string err = "Lua runtime limit exceeded: total elapsed time is " +
std::to_string(elapsed.count()) + " ms";
luaL_error(L, "%s", err.c_str());
// +1 for max digit, +1 for null terminator
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't really understand why +1 for max digit ?

Copy link
Contributor Author

@tchaikov tchaikov Jun 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i am quoting https://en.cppreference.com/w/cpp/types/numeric_limits/digits10:

The value of std::numeric_limits::digits10 is the number of base-10 digits that can be represented by the type T without change, that is, any number with this many significant decimal digits can be converted to a value of type T and back to decimal form, without change due to rounding or overflow. std::numeric_limits::digits * std::log10(2)

and

An 8-bit binary type can represent any two-digit decimal number exactly, but 3-digit decimal numbers 256..999 cannot be represented. The value of digits10 for an 8-bit type is 2 (8 * std::log10(2) is 2.41)

so, in order to represent the maximum number which can be represented with rep, for instance, int64_t, we have to add 1 to be on the safe side, as the value of digits10 is rounded down.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!

@tchaikov
Copy link
Contributor Author

[ RUN      ] RocksDBOption.interpret
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/objectstore/TestRocksdbOptionParse.cc:73: Failure
Expected equality of these values:
  15u
    Which is: 15
  thread_list.size()
    Which is: 14

known issue, tracked by https://tracker.ceph.com/issues/66575

@tchaikov
Copy link
Contributor Author

jenkins test make check

@tchaikov
Copy link
Contributor Author

jenkins test windows

1 similar comment
@tchaikov
Copy link
Contributor Author

jenkins test windows

@tchaikov
Copy link
Contributor Author

jenkins test api

@tchaikov
Copy link
Contributor Author

jenkins test make check arm64

1 similar comment
@tchaikov
Copy link
Contributor Author

jenkins test make check arm64

@tchaikov tchaikov merged commit d63583f into ceph:main Jun 10, 2025
24 checks passed
@tchaikov tchaikov deleted the wip-lua-fix-leak branch June 10, 2025 10:22
@tchaikov tchaikov mentioned this pull request Jun 10, 2025
14 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants