@enjoy-binbin
Contributor

@enjoy-binbin enjoy-binbin commented Mar 13, 2024

Background

  1. Currently, Lua memory allocation does not pass through Redis's zmalloc.c.
    Redis maxmemory cannot limit memory growth caused by users abusing Lua,
    since the Lua VM memory is not part of used_memory.

  2. Since jemalloc is much better (in fragmentation and speed), and we know it and trust it, we are
    going to use jemalloc instead of libc to allocate memory for the Lua VM and
    count it as used memory.

Process:

In this PR, we use jemalloc in Lua.

  1. Create an arena shared by all Lua VMs (script and function), in order to avoid blocking the defragger.
  2. Create a tcache bound to the Lua VM. By default the Lua VM and the main thread share the same tcache, and without an isolated tcache, Lua may be handed memory from the tcache that the main thread has just freed, and vice versa.
    On the other hand, since the Lua VM might be released in a bio thread, and a tcache is not thread-safe, we need to recreate
    the tcache every time we recreate the Lua VM.
  3. Remove Lua memory statistics from the memory fragmentation statistics, so that Lua memory fragmentation does not affect them.
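
A minimal sketch of the accounting side of steps 1 and 2, assuming an allocator callback with the same shape as Lua's `lua_Alloc`. The names `vm_alloc_ctx` and `vm_alloc` are hypothetical and not taken from this PR. libc `realloc`/`free` stand in for jemalloc's `rallocx`/`dallocx` so that the block compiles without jemalloc; in a jemalloc build the same calls would carry `MALLOCX_ARENA(arena) | MALLOCX_TCACHE(tcache)` flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-VM allocation context (not the PR's actual struct). */
typedef struct {
    size_t allocated; /* bytes currently handed out to the VM */
    unsigned arena;   /* assumption: index obtained via mallctl("arenas.create") */
    unsigned tcache;  /* assumption: id obtained via mallctl("tcache.create") */
} vm_alloc_ctx;

/* Same shape as a lua_Alloc callback:
 *   nsize == 0 frees ptr and returns NULL,
 *   otherwise ptr is grown or shrunk to nsize bytes.
 * Every change is mirrored in ctx->allocated, so the VM's memory can be
 * counted toward a global "used memory" figure. */
void *vm_alloc(void *ud, void *ptr, size_t osize, size_t nsize) {
    vm_alloc_ctx *ctx = ud;

    if (nsize == 0) {
        if (ptr != NULL) {
            ctx->allocated -= osize;
            free(ptr); /* jemalloc build: dallocx(ptr, flags) */
        }
        return NULL;
    }

    void *p = realloc(ptr, nsize); /* jemalloc build: mallocx/rallocx with flags */
    if (p == NULL) return NULL;    /* on failure the old block and counter are unchanged */

    if (ptr != NULL) ctx->allocated -= osize; /* osize is only trusted for live blocks */
    ctx->allocated += nsize;
    return p;
}
```

A callback of this shape could be handed to `lua_newstate` together with a pointer to the context. The counter here tracks requested sizes only; a real allocator would report the usable size of each block instead.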

Other

Add the following new fields to INFO DEBUG (we may promote them to INFO MEMORY some day):

  1. allocator_allocated_lua: total number of bytes allocated from the Lua arena
  2. allocator_active_lua: total number of bytes in active pages allocated by the Lua arena
  3. allocator_resident_lua: maximum number of bytes in physically resident data pages mapped by the Lua arena
  4. allocator_frag_bytes_lua: fragmentation bytes in the Lua arena
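
As a rough illustration of how these fields relate to each other, the sketch below assumes that fragmentation bytes are computed as active minus allocated (the way Redis documents `allocator_frag_bytes`) and that the Lua arena's numbers are subtracted from the process-wide totals before the main fragmentation figures are derived. The struct and function names are hypothetical, and the exact formula used by the PR is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical snapshot of allocator statistics for one scope. */
typedef struct {
    size_t allocated; /* bytes handed out to the application */
    size_t active;    /* bytes in active pages */
    size_t resident;  /* bytes in physically resident pages */
} arena_stats;

/* Subtract without wrapping below zero (stats are sampled, so they can race). */
static size_t sub_floor0(size_t a, size_t b) {
    return a > b ? a - b : 0;
}

/* Assumed formula: fragmentation = active - allocated. */
size_t frag_bytes(arena_stats s) {
    return sub_floor0(s.active, s.allocated);
}

/* Remove the Lua arena's share from the process-wide totals so that Lua
 * fragmentation does not influence the main fragmentation statistics. */
arena_stats without_lua(arena_stats total, arena_stats lua) {
    arena_stats r;
    r.allocated = sub_floor0(total.allocated, lua.allocated);
    r.active = sub_floor0(total.active, lua.active);
    r.resident = sub_floor0(total.resident, lua.resident);
    return r;
}
```

With made-up totals of 1000/1500/2000 bytes and a Lua arena of 100/400/500 bytes, the Lua arena alone accounts for 300 fragmentation bytes, while the remainder shows 200.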

This is oranagra's idea, and I got some help from sundb.

This solves the third point in #13102.

… memory

Currently, Lua memory allocation does not pass through Redis's zmalloc.c.
Redis maxmemory cannot limit memory growth caused by users abusing Lua,
since the Lua VM memory is not part of used_memory.

Since jemalloc is much better, and we know it and trust it, we are
going to use jemalloc instead of libc to allocate memory for the Lua VM and
count it as used memory.

In this PR, we use jemalloc in Lua. Use MALLOCX_ARENA to create an
arena for eval Lua to avoid blocking the defragger. Remove the Lua arena's
statistics from defrag.

This is oranagra's idea, and I got some help from sundb.

This solves the third point in redis#13102.
@enjoy-binbin enjoy-binbin requested review from oranagra and sundb March 13, 2024 08:41
@oranagra oranagra linked an issue Mar 13, 2024 that may be closed by this pull request
Co-authored-by: debing.sun <debing.sun@redis.com>
Member

@oranagra oranagra left a comment

Recent change LGTM. My only concern is that someone might forget to call zmalloc_update_epoch, and a refresh_stats argument might be better.
I don't feel strongly about it.

oranagra previously approved these changes Mar 18, 2024
Member

@oranagra oranagra left a comment

LGTM

@sundb
Collaborator

sundb commented Mar 19, 2024

@enjoy-binbin Active defrag eval scripts test failed in 32bit:
https://github.com/sundb/redis/actions/runs/8327242755/job/22784502510

oranagra previously approved these changes Apr 2, 2024
Member

@oranagra oranagra left a comment

@yossigo @LiorKogan what's the process of approving new info fields?

@oranagra
Member

oranagra commented Apr 4, 2024

We discussed the new INFO fields internally, and we prefer to avoid introducing so many metrics that the majority of users won't need or can't easily understand.
We considered moving them for now to MEMORY MALLOC-STATS, to serve just manual investigation, but maybe an easier middle ground is to put them in INFO DEBUG for now.

@sundb
Collaborator

sundb commented Apr 4, 2024

@oranagra already moved these fields to INFO DEBUG.

@oranagra oranagra merged commit 804110a into redis:unstable Apr 16, 2024
@oranagra
Member

Thank you all.
@sundb can you document these? (I see we documented debug info fields too.)

@sundb
Collaborator

sundb commented Apr 16, 2024

@oranagra OK.

@sundb
Collaborator

sundb commented Apr 16, 2024

doc PR: redis/redis-doc#2716

@enjoy-binbin enjoy-binbin deleted the lua_jemalloc branch April 17, 2024 08:17
@oranagra oranagra added the release-notes indication that this issue needs to be mentioned in the release notes label May 15, 2024
oranagra added a commit to oranagra/redis that referenced this pull request Nov 20, 2024
To complement the work done in redis#13133.
Contains some refactoring, and also a new field in MEMORY STATS.

Additionally, clear scripts and stats between tests in external mode.
oranagra added a commit that referenced this pull request Nov 21, 2024
…13660)

To complement the work done in #13133.
That PR made the script VMs' memory count as part of zmalloc, which
means it
should also be counted as part of the non-value overhead.

This commit contains some refactoring to make variable names and
function names less confusing.
It also adds a new field named `script.VMs` to the `MEMORY STATS`
command.

Additionally, clear scripts and stats between tests in external mode
(which is related to how this issue was discovered).
tezc added a commit that referenced this pull request Nov 21, 2024
Starting from #13133, we allocate a
jemalloc thread cache and use it for the Lua VM.
In certain cases, like the `script flush` or `function flush` command, we
free the existing thread cache and create a new one.

However, for `function flush`, we were not actually destroying the
existing thread cache itself. Each call created a new thread cache in
jemalloc and leaked the previous thread cache instance. Jemalloc
allows a maximum of 4096 thread cache instances. If this limit is reached,
Redis logs "Failed creating the lua jemalloc tcache" and aborts.

There are other cases that can cause this memory leak, including
replication scenarios when emptyData() is called.

The implication is that redis `used_memory` looks low, while
`allocator_allocated` and RSS remain high.
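
The failure mode described above can be modelled without jemalloc. The sketch below is a simulation under stated assumptions: `tcache_pool`, `tcache_create`, `tcache_destroy`, `vm_flush_leaky` and `vm_flush_fixed` are hypothetical names, and the pool only imitates a bounded set of thread cache ids (the 4096 limit is the one quoted in this commit message). In a real build the create and destroy steps would correspond to jemalloc's `mallctl("tcache.create", ...)` and `mallctl("tcache.destroy", ...)`.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_TCACHES 4096 /* limit quoted in the commit message */

/* Hypothetical model of a bounded pool of thread cache ids. */
typedef struct {
    bool used[MAX_TCACHES];
    int live; /* number of ids currently in use */
} tcache_pool;

/* Returns a free id, or -1 when every slot is taken. */
int tcache_create(tcache_pool *p) {
    for (int id = 0; id < MAX_TCACHES; id++) {
        if (!p->used[id]) {
            p->used[id] = true;
            p->live++;
            return id;
        }
    }
    return -1;
}

/* Releases an id so that it can be handed out again. */
void tcache_destroy(tcache_pool *p, int id) {
    if (id >= 0 && id < MAX_TCACHES && p->used[id]) {
        p->used[id] = false;
        p->live--;
    }
}

/* Buggy flush: creates a new cache and forgets the old one. */
int vm_flush_leaky(tcache_pool *p, int old_id) {
    (void)old_id; /* the old id is never destroyed, so its slot leaks */
    return tcache_create(p);
}

/* Fixed flush: destroys the old cache before creating its replacement. */
int vm_flush_fixed(tcache_pool *p, int old_id) {
    tcache_destroy(p, old_id);
    return tcache_create(p);
}
```

Calling the leaky flush repeatedly fills all 4096 slots and the next create fails, which is the point at which Redis would log the error and abort. The fixed flush keeps exactly one live id no matter how many times it runs.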

Co-authored-by: debing.sun <debing.sun@redis.com>
YaacovHazan pushed a commit to YaacovHazan/redis that referenced this pull request Apr 22, 2025
…3661)

funny-dog pushed a commit to funny-dog/redis that referenced this pull request Sep 17, 2025
… memory (redis#13133)

---------

Co-authored-by: debing.sun <debing.sun@redis.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
funny-dog pushed a commit to funny-dog/redis that referenced this pull request Sep 17, 2025
…edis#13660)

funny-dog pushed a commit to funny-dog/redis that referenced this pull request Sep 17, 2025
…3661)

Labels

release-notes indication that this issue needs to be mentioned in the release notes

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

improvements around Lua Scripts memory usage

4 participants