
Conversation

@enjoy-binbin (Contributor) commented Feb 22, 2024

Even if we have SCRIPT FLUSH ASYNC now, when there are a lot of
lua scripts, SCRIPT FLUSH ASYNC will still block the main thread.
This is because lua_close is executed in the main thread, and lua
heap needs to release a lot of memory.

In this PR, we take the current lua instance on lctx.lua and call
lua_close on it in a background thread, to close it in async way.
This is MeirShpilraien's idea.

This solves the first point in #13102.
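
A minimal sketch of the idea (the helper names mirror the existing lazy-free code paths in the Redis tree; the exact signatures and the LAZYFREE_THRESHOLD gate are assumptions rather than a copy of the final patch):

/* Sketch only (assumes the Redis internal headers server.h, bio.h and the
 * bundled lua.h): free the script dict and close the Lua state from the
 * BIO lazy-free thread instead of the main thread. */
void lazyFreeLuaScripts(void *args[]) {
    dict *lua_scripts = args[0];
    lua_State *lua = args[1];
    size_t len = dictSize(lua_scripts);
    dictRelease(lua_scripts);   /* drop the cached script bodies */
    lua_close(lua);             /* the expensive part: tear down the Lua heap off the main thread */
    atomicDecr(lazyfree_objects,len);
    atomicIncr(lazyfreed_objects,len);
}

void freeLuaScriptsAsync(dict *lua_scripts, lua_State *lua) {
    if (dictSize(lua_scripts) > LAZYFREE_THRESHOLD) {
        atomicIncr(lazyfree_objects,dictSize(lua_scripts));
        bioCreateLazyFreeJob(lazyFreeLuaScripts,2,lua_scripts,lua);
    } else {
        dictRelease(lua_scripts);
        lua_close(lua);
    }
}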

@enjoy-binbin (Contributor, Author):

@MeirShpilraien I meet a issue. When closing_lua in the bg thread, i can observe that the corresponding lua interpreter memory is declining (by looking at luaMemory(lua)). But the rss in the info memory still displays it:

# Memory
used_memory:34117688
used_memory_human:32.54M
used_memory_rss:11181293568
used_memory_rss_human:10.41G
used_memory_peak:3018153064
used_memory_peak_human:2.81G
used_memory_peak_perc:1.13%
used_memory_overhead:22944980
used_memory_startup:874952
used_memory_dataset:11172708
used_memory_dataset_perc:33.61%
allocator_allocated:34438672
allocator_active:41156608
allocator_resident:125444096
allocator_muzzy:0
total_system_memory:33190125568
total_system_memory_human:30.91G
used_memory_lua:31744
used_memory_vm_eval:31744
used_memory_lua_human:31.00K
used_memory_scripts_eval:0
number_of_cached_scripts:0
number_of_functions:0
number_of_libraries:0
used_memory_vm_functions:32768
used_memory_vm_total:64512
used_memory_vm_total_human:63.00K
used_memory_functions:184
used_memory_scripts:184
used_memory_scripts_human:184B
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
allocator_frag_ratio:1.20
allocator_frag_bytes:6717936
allocator_rss_ratio:3.05
allocator_rss_bytes:84287488
rss_overhead_ratio:89.13
rss_overhead_bytes:11055849472
mem_fragmentation_ratio:328.11
mem_fragmentation_bytes:11147215888
mem_not_counted_for_evict:13472
mem_replication_backlog:1066212
mem_total_replication_buffers:1066208
mem_clients_slaves:0
mem_clients_normal:0
mem_cluster_links:0
mem_aof_buffer:0
mem_allocator:jemalloc-5.3.0
active_defrag_running:0
lazyfree_pending_objects:0
lazyfreed_objects:12276361

@MeirShpilraien commented Feb 22, 2024

I ran into an issue. When closing the Lua state in the background thread, I can observe that the corresponding Lua interpreter memory is declining (by checking luaMemory(lua)), but the RSS in INFO MEMORY still shows it

And without the async you do see this memory declining?

@enjoy-binbin (Contributor, Author) commented Feb 22, 2024

And without the async you do see this memory declining?

OK, sync gives the same result. These Lua script bodies are about 110 bytes each; it looks like we need to use jemalloc for Lua.

[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:14.06G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli script flush sync  --> this takes about 45s
OK
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:10.41G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:10.41G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep lua
used_memory_lua:31744
used_memory_lua_human:31.00K

But when you do it one more time, it becomes OK.

[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli script flush sync  -- this takes a few seconds (I guess maybe 1-2 seconds)
OK
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep lua
used_memory_lua:31744
used_memory_lua_human:31.00K
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli script flush sync  -- this takes about 10s
OK
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:116.14M

@MeirShpilraien:

This is what I suspected: Lua uses libc malloc, which does not return memory the same way jemalloc does. I am not sure exactly what the logic is for returning memory with libc malloc, but I believe we can see that it's not related to the background release of the Lua interpreter and we can proceed with this fix. WDYT?

@enjoy-binbin (Contributor, Author) commented Feb 22, 2024

This is another case.
Edit: ignore me. The subsequent ASYNC actually takes the sync path because the lua_scripts dict is empty, so this case is just like the one above.

[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:14.06G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli script flush async
OK
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:14.06G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:14.06G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:13.93G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:12.16G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:11.52G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:10.66G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:10.41G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:10.41G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli script flush async  -- this takes quite long, I guess 5s?
OK
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:10.41G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:10.41G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli script flush async -- this takes 1 sec maybe
OK
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:117.60M

@MeirShpilraien:

Edit: ignore me. The subsequent ASYNC actually takes the sync path because the lua_scripts dict is empty, so this case is just like the one above.

But if it's empty it should not take so long, no?

@enjoy-binbin (Contributor, Author) commented Feb 22, 2024

Yes, it's weird. I guess the second lua_close actually reclaims the memory. I'll try making lua_close always async:

@@ -204,8 +204,8 @@ void freeLuaScriptsAsync(dict *lua_scripts, lua_State *lua) {
         atomicIncr(lazyfree_objects,dictSize(lua_scripts));
         bioCreateLazyFreeJob(lazyFreeLuaScripts,2,lua_scripts,lua);
     } else {
-        dictRelease(lua_scripts);
-        lua_close(lua);
+        atomicIncr(lazyfree_objects,dictSize(lua_scripts));
+        bioCreateLazyFreeJob(lazyFreeLuaScripts,2,lua_scripts,lua);
     }
 }

OK. With this code, flush is always async, but the RSS does not change.

[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli script flush async
OK
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:10.41G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli script flush async
OK
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:10.41G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli script flush async
OK
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:10.41G

Then I try sync; one of them takes about 1s, and then the RSS changes.

[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli script flush
OK
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:10.41G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli script flush
OK
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:10.41G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli script flush
OK
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:117.52M

@MeirShpilraien:

Interesting. I wonder whether it's not releasing the memory because of the libc allocator or because of Lua. Any idea how we can check it?

@enjoy-binbin (Contributor, Author):

Interesting. I wonder whether it's not releasing the memory because of the libc allocator or because of Lua. Any idea how we can check it?

@sundb do you have any thoughts?

@sundb (Collaborator) commented Feb 23, 2024

Using jemalloc as the Lua allocator, we can see that the Lua VM does get cleaned, but the memory isn't returned to the system; jemalloc and glibc malloc both show the same result, so it doesn't relate to the background thread.

btw: calling jemalloc_purge will eventually return the memory to the system.

/* Standalone test (built inside the Redis tree with USE_JEMALLOC so that
 * zmalloc and jemalloc_purge are available): populate a Lua state with
 * many small scripts, close it, and check whether memory returns to the OS. */
#include <stdio.h>
#include <unistd.h>
#include <lua.h>
#include <lauxlib.h>
#include <lualib.h>
#include "zmalloc.h"    /* zfree/zrealloc and jemalloc_purge from the Redis tree */

/* Lua allocator callback that routes allocations through zmalloc (jemalloc). */
static void *myAlloc(void *ud, void *ptr, size_t osize, size_t nsize) {
    (void)ud; (void)osize;
    if (nsize == 0) {
        zfree(ptr);
        return NULL;
    } else {
        return zrealloc(ptr, nsize);
    }
}

#define USE_LUA_JEMALLOC 1

int main(void) {
#if USE_LUA_JEMALLOC
    lua_State* L = lua_newstate(myAlloc, NULL); /* Use jemalloc as the Lua allocator */
#else
    lua_State* L = lua_open(); /* Use the default (glibc) allocator */
#endif
    if (!L) {
        fprintf(stderr, "Failed to create Lua state.\n");
        return 1;
    }

    /* Populate data into the Lua state. */
    for (int i = 1; i <= 1000000; ++i) {
        char script[100];
        snprintf(script, sizeof(script), "print('Hello from script %d')", i);
        luaL_loadstring(L, script);
        lua_pcall(L, 0, LUA_MULTRET, 0);
    }

    lua_close(L);
    je_malloc_stats_print(NULL, NULL, NULL);          /* <- 1 */
    jemalloc_purge();

    printf("wait for 10s before exit\n");
    sleep(10);

    je_malloc_stats_print(NULL, NULL, NULL);         /* <- 2 */
    return 0;
}

output of 1)

resident:                   242216960

output of 2)

resident:                    19767296

@enjoy-binbin (Contributor, Author) commented Feb 23, 2024

@sundb thanks, I also did relevant tests this morning using the same RDB, and using jemalloc in Lua seems to have a good effect (the RSS is declining):

[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:12.75G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep lua
used_memory_lua:10003225600
used_memory_lua_human:9.32G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli script flush async
OK
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:12.75G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:12.23G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:9.71G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:8.65G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:4.93G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:4.81G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:2.56G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:1.50G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:1.16G
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:352.84M
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep rss_human
used_memory_rss_human:273.14M

According to these simple results, it seems this feature needs to be introduced together with jemalloc for Lua. With glibc, async does not seem to work well (it may even mislead users).

And btw, with jemalloc, Lua looks faster: RDB loading used to take 180s, now it takes 150s; SCRIPT FLUSH SYNC used to take 45s, now it takes 25s.

@oranagra (Member):

I'm not sure we should tie these together (getting rid of the freeze by doing it async, and the change of allocator).
I'm quite certain that libc has some documented behavior which in this case takes a bit longer to return the memory to the OS, but that eventually it will, and maybe we can find an API to purge it earlier. Maybe try malloc_trim?

Regarding the change of allocator, indeed jemalloc is much better, and we also know it and trust it.
The problem with using it is that the allocations of Lua will get mixed with the allocations of Redis, and these can prevent Redis from being able to defrag its memory.
I think the right approach to solve that is to use a dedicated arena, i.e. in the allocator callback we pass to lua_newstate, use MALLOCX_ARENA.
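
A rough sketch of what such a dedicated-arena allocator could look like, using jemalloc's non-standard mallocx/rallocx/dallocx API and the "arenas.create" mallctl (in the Redis build these calls are je_-prefixed; the helper names below are illustrative assumptions, not an actual implementation):

#include <jemalloc/jemalloc.h>
#include <lua.h>

/* Sketch: route all Lua allocations into a dedicated jemalloc arena so they
 * don't mix with Redis's own allocations and hinder active defrag. */
static unsigned lua_arena;      /* arena index returned by "arenas.create" */
static int lua_arena_flags;     /* MALLOCX_ARENA(lua_arena) used on every call */

static int createLuaArena(void) {
    size_t sz = sizeof(lua_arena);
    if (mallctl("arenas.create", &lua_arena, &sz, NULL, 0) != 0) return -1;
    lua_arena_flags = MALLOCX_ARENA(lua_arena) | MALLOCX_TCACHE_NONE;
    return 0;
}

/* Allocator callback passed to lua_newstate(). */
static void *luaArenaAlloc(void *ud, void *ptr, size_t osize, size_t nsize) {
    (void)ud; (void)osize;
    if (nsize == 0) {
        if (ptr) dallocx(ptr, lua_arena_flags);
        return NULL;
    }
    if (ptr == NULL) return mallocx(nsize, lua_arena_flags);
    return rallocx(ptr, nsize, lua_arena_flags);
}

/* Usage sketch: createLuaArena(); lua_State *L = lua_newstate(luaArenaAlloc, NULL); */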

@enjoy-binbin (Contributor, Author):

OK, I see: when I added a malloc_trim(0) just after lua_close, both sync and async work. @oranagra thanks. Now, what's next? In what way do I need to add malloc_trim to our code base?

[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep lua && ./src/redis-cli info memory | grep rss_human && ./src/redis-cli info | grep lazyfree
used_memory_lua:31744
used_memory_lua_human:31.00K
used_memory_rss_human:10.39G
lazyfree_pending_objects:12276361
lazyfreed_objects:0
[binbinzhu@VM-13-151-tencentos ~/github/redis]$ ./src/redis-cli info memory | grep lua && ./src/redis-cli info memory | grep rss_human && ./src/redis-cli info | grep lazyfree
used_memory_lua:31744
used_memory_lua_human:31.00K
used_memory_rss_human:97.94M
lazyfree_pending_objects:0
lazyfreed_objects:12276361

@oranagra (Member):

I think we'd want to do that in both sync and async flush,
but only if Redis and Lua use different allocators (i.e. currently only if Redis is not using libc malloc).
I suppose we'd want to do it right after lua_close (and before lua_open, if it wasn't already done).
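
Something along these lines, perhaps (the luaCloseAndTrim helper and the USE_JEMALLOC gate are a simplified sketch of that condition, not the final patch; malloc_trim is a glibc extension):

#include <lua.h>
#if defined(__GLIBC__)
#include <malloc.h>   /* malloc_trim() is glibc-specific */
#endif

/* Sketch: after tearing down the Lua state (whose memory came from libc
 * malloc), ask glibc to return the freed pages to the kernel so RSS drops
 * right away. Only worth doing when Redis itself runs on a different
 * allocator; USE_JEMALLOC (from the Redis build) approximates that here. */
static void luaCloseAndTrim(lua_State *lua) {
    lua_close(lua);
#if defined(__GLIBC__) && defined(USE_JEMALLOC)
    malloc_trim(0);
#endif
}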

@oranagra (Member):

The next project (if one of you wants to try it) is to use jemalloc with MALLOCX_ARENA as I suggested above.

@enjoy-binbin (Contributor, Author):

I am not good at this OS stuff, so I am just pushing the code.
Full CI: https://github.com/redis/redis/actions/runs/8051173343

@oranagra (Member) left a comment:

Did you test it on all the platforms in our CI?

@enjoy-binbin (Contributor, Author) commented Feb 28, 2024

Did you test it on all the platforms in our CI?

I ran the full CI once before, but there was a problem with the Alpine build. #13087 (comment)

I have now re-pushed the new code; new full CI: https://github.com/redis/redis/actions/runs/8078422318

But I'm not very familiar with cross-platform OS details, so I need you to check it out.

@enjoy-binbin enjoy-binbin marked this pull request as ready for review February 28, 2024 09:54
@oranagra oranagra merged commit a7abc2f into redis:unstable Feb 28, 2024
@oranagra oranagra added the release-notes label Feb 28, 2024
@enjoy-binbin enjoy-binbin deleted the close_lua_async branch February 28, 2024 16:35