Skip to content

Fix use-after-free issue in spt_copyenv.#8088

Merged
yossigo merged 3 commits intoredis:unstablefrom
yossigo:fix-spt_copyenv
Nov 24, 2020
Merged

Fix use-after-free issue in spt_copyenv.#8088
yossigo merged 3 commits intoredis:unstablefrom
yossigo:fix-spt_copyenv

Conversation

@yossigo
Copy link
Copy Markdown
Collaborator

@yossigo yossigo commented Nov 24, 2020

Seems to have gone unnoticed for a long time, because at least with
glibc it will only be triggered if setenv() was called before spt_init,
which Redis doesn't.

Fixes #8064.

Seems to have gone unnoticed for a long time, because at least with
glibc it will only be triggered if setenv() was called before spt_init,
which Redis doesn't.

Fixes redis#8064.
@yossigo yossigo requested a review from oranagra November 24, 2020 09:20
@oranagra
Copy link
Copy Markdown
Member

while reviewing, it occurred to me that we may want to publish this fix to the other users of this code, but while looking for the "upstream" i found this:
https://lists.freedesktop.org/archives/libbsd/2013-June/000080.html
i.e. a fix that at least in some ways is similar to this one.
maybe we should align with it, so that future fixes can more easily be merged?

@yossigo yossigo merged commit 7e5a631 into redis:unstable Nov 24, 2020
@yossigo yossigo deleted the fix-spt_copyenv branch November 24, 2020 15:58
@rtmclay
Copy link
Copy Markdown

rtmclay commented Nov 24, 2020

There is an issue that is being missed here. User programs do not have complete control how they run. If a user or site specifies the environment variable LD_PRELOAD to point to a shared library then that shared library becomes part of the execution and can run before main() is called. There are several libraries that do this trick for ELF binaries. XALT (https://github.com/xalt/xalt), DARSHAN (https://www.mcs.anl.gov/research/projects/darshan/) and the commercial tools Mistral (https://www.ellexus.com/products/mistral/) all use LD_PRELOAD to hook into an executable. You can see this trick in action with this small example. Let's assume that the following is called xalt.c:

#include <stdio.h>
#include <stdlib.h>
void myinit(int argc, char **argv)
{ 
  printf("This is run before main()\n");
  setenv("ABC","DEF",1);
  printf("Calling setenv()\n");
}
__attribute__((section(".init_array"))) __typeof__(myinit) *__init = myinit;

And here is a small example of your spt_copyenv routine (called tryMe.c):

#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif

#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>	

extern char **environ;

int my_copyenv(char *oldenv[])
{
   char* eq;
   int   i;

   // The clearenv() function clears the environment of all
   // name-value pairs and sets the value of the external variable
   // environ to NULL. 

   clearenv();
   for (i = 0; oldenv[i]; i++) {
    if (!(eq = strchr(oldenv[i], '=')))
      continue;

    *eq = '\0';
    setenv(oldenv[i], eq + 1, 1);
   }
}

int main(int argc, char *argv[])
{
  my_copyenv(environ);
}

One can build the shared library and the tryMe exectable by:

$ gcc -c -fPIC xalt.c ; gcc -o libxalt.so -fPIC -shared xalt.o
$ gcc -g -O0 -o tryMe  tryMe.c

Then to run it:

$ ./tryMe
$ LD_PRELOAD=./libXalt.so ./tryMe
This is run before main()
Calling setenv()

You can see that the first time there is no printout. But the second time the myinit() is called and it does call setenv() before main() is called. So assuming that spt_copyenv() is never called before setenv() is not safe. My tests were done using an Ubuntu system which is indeed using glibc.

As far as I can tell, the problem is that you first call clearenv() then walk already freed memory with oldenv[]. I don't believe that is safe. What this routine is doing is put back all env. vars. with an equal sign. I am pretty sure that all environment variables have an equal sign in the Unix world. But why not just loop over the environment and remove all the ones that do not have an equal sign. Say something like this:

int removeBadEnvVars(char *oldenv[])
{
   char* eq;
   int   i;

   for (i = 0; oldenv[i]; i++)
     {
       eq = strchr(oldenv[i], '=');
       if ( ! eq )
	 {
	   *eq = '\0';
	   unsetenv(oldenv[i]);
	 }
     }
}

@yossigo
Copy link
Copy Markdown
Collaborator Author

yossigo commented Nov 25, 2020

Hi @rtmclay, thanks for the elaborate explanation (I'm actually quite familiar with LD_PRELOAD, been creating instrumentation tools with it in the past).

The original spt_copyenv() was indeed buggy and would break if setenv() is called before it is initialized, but I don't understand what is your concern with the fix as it makes a copy of environ and no longer references freed memory. Can you please explain?

As for your suggestion, IIUC the main goal of spt_copyenv() is to relocate the environment array elsewhere as a way to make as much room as possible to overwrite argv[0], so simply calling unsetenv() may not do this trick.

Not sure what is the purpose of skipping elements without a =, but I suspect it may have some historic significance with pre-POSIX use of envp as a third argument to main.

I'm trying to keep changes to this code minimal, to avoid breaking compatibility with the long tail of systems Redis runs on (Linux with different libcs, various BSD variants, Solaris and other SysV variants if they're still out there, etc).

@rtmclay
Copy link
Copy Markdown

rtmclay commented Nov 25, 2020

The man page for clearenv states under Linux:

The  clearenv()  function clears the environment of all name-value pairs and sets the value of the external variable environ to NULL.  After this call, new variables can be added to the environment using putenv(3) and setenv(3).

As I see it the implementation is allowed to free all the memory associated with environ. the routine spt_copyenv gets a copy of the pointer environ. Then the routine spt_copyenv walks that memory. That memory could have been freed. It clearly is when setenv() is called before spt_copyenv is called.

So this means that if the LD_PRELOAD is used and the init_array is used. Then setenv() can be called before main() is called and therefore can cause an abort. This means that if a site is using both XALT and REDIS, there will be an abort every time because XALT does use the init_array to have routines called before main().which call setenv(). Sites that use XALT set LD_PRELOAD for every execution. XALT's mission is to track what programs get run on a cluster.

A simple fix would be to ignore the spt_copyenv() routine under Linux. This would make REDIS safe under Linux for all tools that use LD_PRELOAD. The obvious safe way to handle this for all O.S's is to allocate space for environ, call clearenv() then use the copy of environ to reset the environment then free the copy of environ.

I hope that this is clear. I am also happy to discuss this further.

@yossigo
Copy link
Copy Markdown
Collaborator Author

yossigo commented Nov 25, 2020

@rtmclay Thank you for the patience discussing this, greatly appreciate! I still think we may not be on the same page here.

XALT (or anything else for that matter) uses LD_PRELOAD to execute setenv() before main() is called. I assume that's valid and environ is already allocated, otherwise that would fail and Redis has nothing to do about it anyway because it did not start to execute.

We then reach Redis's main() and end up in spt_init() that calls spt_copyenv(). The new version of spt_copyenv() which is part of this PR first allocates a new envcopy array of equal size to environ and copies all char * pointers from environ to it - basically making a shallow copy of environ.

One could wonder if we need a deep copy (i.e. duplicate each of the strings pointed to by environ), however the clearenv() clearly states:

Note that the main effect of clearenv() is to adjust the value of the
pointer environ(7); this function does not erase the contents of the
buffers containing the environment definitions.

Next we iterate our envcopy and selectively call setenv() for those values that seem like legit environment variables, and free that temporary copy. Isn't that exactly what you were suggesting?

@rtmclay
Copy link
Copy Markdown

rtmclay commented Nov 25, 2020

Yes, that is what I'm suggesting. Obviously what I'm looking for is a fix that allows for XALT and Redis to work together. To me, this it means that spt_copyenv() must work safely even when setenv() is called before spt_copyenv() is called. The only way to know is to test it. Some version of my test xalt.c -> libxalt.so -> LD_PRELOAD=./libxalt.so trick may be helpful. Please let me and #8064 issue know when a version of Redis is available for testing with XALT.

@yossigo
Copy link
Copy Markdown
Collaborator Author

yossigo commented Nov 26, 2020

This PR is already merged to the unstable branch, you may try that. I can confirm that it does solve the problem when testing with the example you provided.

@yossigo yossigo mentioned this pull request Dec 8, 2020
@oranagra oranagra mentioned this pull request Jan 11, 2021
oranagra pushed a commit that referenced this pull request Jan 12, 2021
Seems to have gone unnoticed for a long time, because at least with
glibc it will only be triggered if setenv() was called before spt_init,
which Redis doesn't.

Fixes #8064.

(cherry picked from commit 7e5a631)
@oranagra oranagra mentioned this pull request Jan 13, 2021
@oranagra oranagra mentioned this pull request Feb 22, 2021
oranagra pushed a commit that referenced this pull request Feb 22, 2021
Seems to have gone unnoticed for a long time, because at least with
glibc it will only be triggered if setenv() was called before spt_init,
which Redis doesn't.

Fixes #8064.

(cherry picked from commit 7e5a631)
JackieXie168 pushed a commit to JackieXie168/redis that referenced this pull request Mar 2, 2021
Seems to have gone unnoticed for a long time, because at least with
glibc it will only be triggered if setenv() was called before spt_init,
which Redis doesn't.

Fixes redis#8064.
@oranagra oranagra mentioned this pull request Nov 5, 2021
pulllock pushed a commit to pulllock/redis that referenced this pull request Jun 28, 2023
Seems to have gone unnoticed for a long time, because at least with
glibc it will only be triggered if setenv() was called before spt_init,
which Redis doesn't.

Fixes redis#8064.

(cherry picked from commit 7e5a631)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] Segfault with 6.0.9 when running under XALT

3 participants