Skip to content

[BUG] Segfault with 6.0.9 when running under XALT #8064

@paulmelis

Description

@paulmelis

Describe the bug

We're seeing a segfault in Redis 6.0.9 when running under XALT. XALT is a framework, mostly used on HPC systems, to gather statistics on software usage. It is activated by using LD_PRELOAD of a small XALT library that saves statistics on processes being run. What happens is that redis-server segfaults when the LD_PRELOAD is active:

(gdb) bt
#0  0x00002aaaab996d04 in __strchr_sse42 () from /lib64/libc.so.6
#1  0x00000000004b8c28 in spt_copyenv (oldenv=0x819ca0) at setproctitle.c:115
#2  0x00000000004b8fb0 in spt_init (argc=1, argv=0x7fffffff5a88) at setproctitle.c:198
#3  0x0000000000443404 in main (argc=1, argv=0x7fffffff5a88) at server.c:5168

We're pretty sure this is a bug (or dependence on undefined behaviour) in Redis. Looking at the environment related code in setproctitle.c, especially spt_copyenv(), looks fishy. The environment is cleared using clearenv() (through spt_clearenv() in this case) but a pointer to the original environ is saved as oldenv. The old entries in oldenv are then processed and copied to the new environment with setenv(). But the latter calls will update entries in environ. What is unclear to me is what guarantees clearenv() gives about the content of environ after it is called. It might simply set environ to NULL, but still use the same memory for setenv() calls that follow, this is not clear. As oldenv still points to the old environ content it might depend on undefined behaviour normally, but that behaviour changes when running under XALT.

To reproduce

See xalt/xalt#41 for more details. Unfortunately, building and configuring XALT is a bit too extensive to add here.

paulm@tcn180 17:19 ~/c$ tar xf ~/dl/redis-6.0.9.tar.gz 
paulm@tcn180 17:19 ~/c$ cd redis-6.0.9/
paulm@tcn180 17:19 ~/c/redis-6.0.9$ make -j32 noopt
cd src && make noopt
make[1]: Entering directory `/nfs/home4/paulm/c/redis-6.0.9/src'
    CC Makefile.dep
make[1]: Leaving directory `/nfs/home4/paulm/c/redis-6.0.9/src'
...
make[2]: Leaving directory `/nfs/home4/paulm/c/redis-6.0.9/src'
make[1]: Leaving directory `/nfs/home4/paulm/c/redis-6.0.9/src'

# Not running under XALT, all OK
paulm@tcn180 17:20 ~/c/redis-6.0.9$ src/redis-server
29208:C 16 Nov 2020 17:20:50.532 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
29208:C 16 Nov 2020 17:20:50.532 # Redis version=6.0.9, bits=64, commit=00000000, modified=0, pid=29208, just started
29208:C 16 Nov 2020 17:20:50.532 # Warning: no config file specified, using the default config. In order to specify a config file use src/redis-server /path/to/redis.conf
                _._                                                  
           _.-``__ ''-._                                             
      _.-``    `.  `_.  ''-._           Redis 6.0.9 (00000000/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._                                   
 (    '      ,       .-`  | `,    )     Running in standalone mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 6379
 |    `-._   `._    /     _.-'    |     PID: 29208
  `-._    `-._  `-./  _.-'    _.-'                                   
 |`-._`-._    `-.__.-'    _.-'_.-'|                                  
 |    `-._`-._        _.-'_.-'    |           http://redis.io        
  `-._    `-._`-.__.-'_.-'    _.-'                                   
 |`-._`-._    `-.__.-'    _.-'_.-'|                                  
 |    `-._`-._        _.-'_.-'    |                                  
  `-._    `-._`-.__.-'_.-'    _.-'                                   
      `-._    `-.__.-'    _.-'                                       
          `-._        _.-'                                           
              `-.__.-'                                               

29208:M 16 Nov 2020 17:20:50.533 # Server initialized
29208:M 16 Nov 2020 17:20:50.533 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
29208:M 16 Nov 2020 17:20:50.533 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo madvise > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled (set to 'madvise' or 'never').
29208:M 16 Nov 2020 17:20:50.533 * Ready to accept connections
<ctrl-C>

# Get around the XALT filtering active on our system
paulm@tcn180 17:20 ~/c/redis-6.0.9$ cp src/redis-server R

# Running under XALT
paulm@tcn180 17:20 ~/c/redis-6.0.9$ gdb ./R
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-119.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /nfs/home4/paulm/c/redis-6.0.9/R...done.
(gdb) run
Starting program: /nfs/home4/paulm/c/redis-6.0.9/./R 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[Detaching after fork from child process 29174]

Program received signal SIGSEGV, Segmentation fault.
0x00002aaaab996d04 in __strchr_sse42 () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-307.el7.1.x86_64
(gdb) bt
#0  0x00002aaaab996d04 in __strchr_sse42 () from /lib64/libc.so.6
#1  0x00000000004b8c28 in spt_copyenv (oldenv=0x819ca0) at setproctitle.c:115
#2  0x00000000004b8fb0 in spt_init (argc=1, argv=0x7fffffff5a88) at setproctitle.c:198
#3  0x0000000000443404 in main (argc=1, argv=0x7fffffff5a88) at server.c:5168

Additional information

#5329 is about an unexpectedly empty environment for redis-server as reported by /proc/<pid>/environ. Looking at spt_copyenv() the environment might be empty due to the undefined behaviour being depended on, as setenv() is called for each entry in oldenv.

The trace above is on a RHEL 7.8 system, GCC 9.3.0.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions