Fix use-after-free issue in spt_copyenv.#8088
Conversation
Seems to have gone unnoticed for a long time, because at least with glibc it will only be triggered if setenv() was called before spt_init, which Redis doesn't. Fixes redis#8064.
|
while reviewing, it occurred to me that we may want to publish this fix to the other users of this code, but while looking for the "upstream" i found this: |
|
There is an issue that is being missed here. User programs do not have complete control how they run. If a user or site specifies the environment variable LD_PRELOAD to point to a shared library then that shared library becomes part of the execution and can run before main() is called. There are several libraries that do this trick for ELF binaries. XALT (https://github.com/xalt/xalt), DARSHAN (https://www.mcs.anl.gov/research/projects/darshan/) and the commercial tools Mistral (https://www.ellexus.com/products/mistral/) all use LD_PRELOAD to hook into an executable. You can see this trick in action with this small example. Let's assume that the following is called xalt.c: And here is a small example of your spt_copyenv routine (called tryMe.c): One can build the shared library and the tryMe exectable by: Then to run it: You can see that the first time there is no printout. But the second time the myinit() is called and it does call setenv() before main() is called. So assuming that spt_copyenv() is never called before setenv() is not safe. My tests were done using an Ubuntu system which is indeed using glibc. As far as I can tell, the problem is that you first call clearenv() then walk already freed memory with oldenv[]. I don't believe that is safe. What this routine is doing is put back all env. vars. with an equal sign. I am pretty sure that all environment variables have an equal sign in the Unix world. But why not just loop over the environment and remove all the ones that do not have an equal sign. Say something like this: |
|
Hi @rtmclay, thanks for the elaborate explanation (I'm actually quite familiar with The original As for your suggestion, IIUC the main goal of Not sure what is the purpose of skipping elements without a I'm trying to keep changes to this code minimal, to avoid breaking compatibility with the long tail of systems Redis runs on (Linux with different libcs, various BSD variants, Solaris and other SysV variants if they're still out there, etc). |
|
The man page for clearenv states under Linux: As I see it the implementation is allowed to free all the memory associated with environ. the routine spt_copyenv gets a copy of the pointer environ. Then the routine spt_copyenv walks that memory. That memory could have been freed. It clearly is when setenv() is called before spt_copyenv is called. So this means that if the LD_PRELOAD is used and the init_array is used. Then setenv() can be called before main() is called and therefore can cause an abort. This means that if a site is using both XALT and REDIS, there will be an abort every time because XALT does use the init_array to have routines called before main().which call setenv(). Sites that use XALT set LD_PRELOAD for every execution. XALT's mission is to track what programs get run on a cluster. A simple fix would be to ignore the spt_copyenv() routine under Linux. This would make REDIS safe under Linux for all tools that use LD_PRELOAD. The obvious safe way to handle this for all O.S's is to allocate space for environ, call clearenv() then use the copy of environ to reset the environment then free the copy of environ. I hope that this is clear. I am also happy to discuss this further. |
|
@rtmclay Thank you for the patience discussing this, greatly appreciate! I still think we may not be on the same page here. XALT (or anything else for that matter) uses We then reach Redis's One could wonder if we need a deep copy (i.e. duplicate each of the strings pointed to by Next we iterate our |
|
Yes, that is what I'm suggesting. Obviously what I'm looking for is a fix that allows for XALT and Redis to work together. To me, this it means that spt_copyenv() must work safely even when setenv() is called before spt_copyenv() is called. The only way to know is to test it. Some version of my test xalt.c -> libxalt.so -> LD_PRELOAD=./libxalt.so trick may be helpful. Please let me and #8064 issue know when a version of Redis is available for testing with XALT. |
|
This PR is already merged to the unstable branch, you may try that. I can confirm that it does solve the problem when testing with the example you provided. |
Seems to have gone unnoticed for a long time, because at least with glibc it will only be triggered if setenv() was called before spt_init, which Redis doesn't. Fixes redis#8064.
Seems to have gone unnoticed for a long time, because at least with glibc it will only be triggered if setenv() was called before spt_init, which Redis doesn't. Fixes redis#8064. (cherry picked from commit 7e5a631)
Seems to have gone unnoticed for a long time, because at least with
glibc it will only be triggered if setenv() was called before spt_init,
which Redis doesn't.
Fixes #8064.