Easily reproducible OOM memory leak with exclude flag. #2731

@co60ca

Describe the bug
Duplicate entries in the --exclude list cause nmap to allocate memory very quickly, until the OOM killer terminates it.

To Reproduce

./nmap --exclude 192.168.1.1,192.168.1.1 127.0.0.1

You may have to run it several times. If a run takes more than about 5 seconds, you have probably hit the bug.

./nmap --exclude 192.168.1.1,192.168.1.1 127.0.0.1
Starting Nmap 7.94 ( https://nmap.org/ ) at 2023-10-26 00:44 UTC
Killed

dmesg confirms that the OOM killer terminated the process.
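To reproduce this without letting the machine run out of memory, the command can be run under an address-space limit and a timeout. The following is a minimal sketch, not part of nmap: `run_capped` and `repro` are hypothetical helper names, the 1 GiB cap and 10 second timeout are arbitrary choices, and it relies on the Unix-only `resource` module.

```python
import resource
import subprocess


def run_capped(cmd, mem_bytes=1024 ** 3, timeout_s=10):
    """Run cmd with a capped address space and a wall-clock timeout.

    Returns ("exited", returncode) or ("timeout", None).
    """
    def limit_memory():
        # Runs in the child just before exec. Only the soft limit is lowered,
        # and never above an existing hard limit.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        cap = mem_bytes if hard == resource.RLIM_INFINITY else min(mem_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (cap, hard))

    try:
        proc = subprocess.run(
            cmd,
            preexec_fn=limit_memory,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising.
        return ("timeout", None)
    return ("exited", proc.returncode)


def repro(nmap="./nmap", attempts=5):
    """Run the failing command several times, since the bug is intermittent."""
    cmd = [nmap, "--exclude", "192.168.1.1,192.168.1.1", "127.0.0.1"]
    return [run_capped(cmd) for _ in range(attempts)]
```

Calling `repro()` from the build directory returns one result per attempt. A run that hits the bug should show up as a timeout or a non-zero exit rather than as an OOM kill, although the exact failure mode under the cap is an assumption.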

Expected behavior
Program finishes without using all the memory.

Version info (please complete the following information):

cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.6 LTS"
Nmap version 7.94SVN ( https://nmap.org )
Platform: x86_64-unknown-linux-gnu
Compiled with: nmap-liblua-5.4.4 openssl-1.1.1f nmap-libssh2-1.10.0 nmap-libz-1.2.13 nmap-libpcre-7.6 nmap-libpcap-1.10.4 nmap-libdnet-1.12 ipv6
Compiled without:
Available nsock engines: epoll poll select
Starting Nmap 7.94SVN ( https://nmap.org ) at 2023-10-26 01:14 UTC
************************INTERFACES************************
DEV  (SHORT) IP/MASK          TYPE     UP MTU   MAC
eth0 (eth0)  169.254.50.24/16 ethernet up 1500  00:50:56:9D:9A:84
eth0 (eth0)  10.2.154.84/16 ethernet up 1500  00:50:56:9D:9A:84
lo   (lo)    127.0.0.1/8      loopback up 65536
lo   (lo)    ::1/128          loopback up 65536

**************************ROUTES**************************
DST/MASK        DEV  METRIC GATEWAY
10.2.154.1/32 eth0 100
10.2.154.0/24 eth0 0
169.254.0.0/16  eth0 0
0.0.0.0/0       eth0 100    10.2.154.1
::1/128         lo   0
::1/128         lo   256

Additional context

This does not seem to happen on Nmap 7.80.

Nmap version 7.80 ( https://nmap.org )
Platform: x86_64-pc-linux-gnu
Compiled with: liblua-5.3.3 openssl-1.1.1d nmap-libssh2-1.8.2 libz-1.2.11 libpcre-8.39 libpcap-1.9.1 nmap-libdnet-1.12 ipv6
Compiled without:
Available nsock engines: epoll poll select

We are seeing this on multiple machines across multiple hypervisors. Duplicates in the exclude list sometimes, but not always, cause the program to use all the memory the system will give it. We are going to work around this, but would love to see a fix in the next release if possible.
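One possible shape for the workaround is to remove duplicates from the exclude list before it reaches nmap. This is a sketch only: `dedupe_excludes` is a hypothetical helper, and treating a bare address and its /32 form as the same entry is an assumption, since only literally identical entries have been confirmed to trigger the bug. It does not detect overlapping ranges.

```python
import ipaddress


def dedupe_excludes(spec):
    """Return a comma-separated exclude list without duplicate entries.

    The first occurrence of each entry is kept, in its original spelling.
    """
    seen = set()
    kept = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        try:
            # Normalise addresses and CIDR blocks so that "192.168.1.1"
            # and "192.168.1.1/32" compare equal.
            key = str(ipaddress.ip_network(item, strict=False))
        except ValueError:
            # Ranges such as "10.0.0.1-5" and hostnames are not parsed
            # by ipaddress, so they are compared as literal text.
            key = item
        if key not in seen:
            seen.add(key)
            kept.append(item)
    return ",".join(kept)
```

For example, `dedupe_excludes("192.168.1.1,192.168.1.1")` returns `"192.168.1.1"`, which can then be passed to `--exclude`.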

I built from the SVN head with ./configure && make debug, but the bug also occurs with the 7.94 release from your website (the rpm, converted to a deb using alien).

It seems like it may be a bug in the trie implementation.

Attaching gdb after the program has started (on a run that is not going to complete, as above, and is filling memory quickly) shows it doing a calloc to create a new trie_node.

Program received signal SIGTSTP, Stopped (user).
0x00007f3149420713 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0  0x00007f3149420713 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f3149423b95 in calloc () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x000055e9bfd5abb7 in safe_zalloc (size=48) at nbase_memalloc.c:111
#3  0x000055e9bfd5b86d in new_trie_node (addr=0x55e9e3494da0, mask=0x55e9e3494db0) at nbase_addrset.c:281
#4  0x000055e9bfd5badc in trie_split (this=0x55e9e3494da0, addr=0x7fff50ceb870, mask=0x7fff50ceb880) at nbase_addrset.c:327
#5  0x000055e9bfd5bc12 in _trie_insert (this=0x55e9e3494da0, addr=0x7fff50ceb870, mask=0x7fff50ceb880) at nbase_addrset.c:353
#6  0x000055e9bfd5c044 in trie_insert (this=0x55e9c217a020, sa=0x55e9c2179df0, bits=-1) at nbase_addrset.c:469
#7  0x000055e9bfd5c6e1 in addrset_add_spec (set=0x55e9c217a000, spec=0x7fff50ceba00 "192.168.1.1", af=2, dns=1) at nbase_addrset.c:645
#8  0x000055e9bfcdd706 in load_exclude_string (excludelist=0x55e9c217a000, s=0x55e9c21769c0 "192.168.1.1,192.168.1.1") at targets.cc:178
#9  0x000055e9bfbf60d5 in nmap_main (argc=4, argv=0x7fff50cecba8) at nmap.cc:2048
#10 0x000055e9bfd4d423 in main (argc=4, argv=0x7fff50cecba8) at main.cc:169

With a breakpoint on the trie_split call in _trie_insert (nbase_addrset.c:353):

(gdb) b nbase_addrset.c:353
Breakpoint 1 at 0x561e7d9c9bfb: file nbase_addrset.c, line 353.
(gdb) c
Continuing.

Breakpoint 1, _trie_insert (this=0x561e8e0abff0, addr=0x7fff0bb97c40, mask=0x7fff0bb97c50) at nbase_addrset.c:353
353         trie_split(this, addr, mask);
(gdb) c
Continuing.

Breakpoint 1, _trie_insert (this=0x561e8e0ac030, addr=0x7fff0bb97c40, mask=0x7fff0bb97c50) at nbase_addrset.c:353
353         trie_split(this, addr, mask);

(Continuing many more times keeps hitting this breakpoint.)

It appears to keep trying to split the trie many, many times, as shown by it still being in this function 5 seconds after program start. I'm not very familiar with the trie data structure, but I imagine the trie code is the cause of the bug.

If I had to guess, it's just creating more and more trie nodes unnecessarily.
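For comparison, here is the behaviour I would expect from a prefix trie when the same address is inserted twice: the second insert walks the existing path and allocates nothing. This is a minimal sketch of an uncompressed binary prefix trie for IPv4, written for illustration only. It is not nmap's implementation (nbase_addrset.c splits nodes, which this sketch does not do), and `PrefixTrie`, `Node`, and `node_count` are hypothetical names.

```python
import ipaddress


class Node:
    __slots__ = ("children", "terminal")

    def __init__(self):
        self.children = [None, None]  # child for bit 0 and for bit 1
        self.terminal = False         # True if a stored prefix ends here


class PrefixTrie:
    def __init__(self):
        self.root = Node()
        self.node_count = 1

    def insert(self, cidr):
        net = ipaddress.ip_network(cidr, strict=False)
        if net.version != 4:
            raise ValueError("this sketch handles IPv4 only")
        value = int(net.network_address)
        node = self.root
        # The loop is bounded by the prefix length, so insertion always ends.
        for i in range(net.prefixlen):
            if node.terminal:
                return  # a shorter stored prefix already covers this one
            bit = (value >> (31 - i)) & 1
            if node.children[bit] is None:
                node.children[bit] = Node()
                self.node_count += 1
            node = node.children[bit]
        node.terminal = True

    def contains(self, addr):
        value = int(ipaddress.IPv4Address(addr))
        node = self.root
        for i in range(32):
            if node.terminal:
                return True
            node = node.children[(value >> (31 - i)) & 1]
            if node is None:
                return False
        return node.terminal
```

Inserting "192.168.1.1" twice leaves `node_count` unchanged after the first insert, which is the property that appears to be violated in the backtrace above.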

The command below does not appear to exhibit the issue; note that it contains no duplicates.

./nmap --exclude 192.168.1.1/32,192.168.1.2/32,192.168.1.3/32 127.0.0.1
Starting Nmap 7.94SVN ( https://nmap.org ) at 2023-10-26 01:34 UTC
Nmap scan report for localhost (127.0.0.1)
Host is up (0.00037s latency).
Not shown: 996 closed tcp ports (reset)
PORT     STATE SERVICE
22/tcp   open  ssh
873/tcp  open  rsync
[... omitted intentionally.]

Nmap done: 1 IP address (1 host up) scanned in 1.82 seconds
