[World] Fix Rarer Reload Deadlock #4893

Merged
Akkadius merged 1 commit into master from akkadius/fix-world-reload-deadlock
May 16, 2025

Conversation

@Akkadius
Contributor

Description

This PR addresses a rare deadlock that can occur in World when issuing reloads. Currently, reload requests from Spire arrive on a separate thread (the telnet server). For some reload types, World performs its own reload on the same thread the request came in on, which can cause concurrency issues between that thread and the main thread.

We now queue the reload request from Spire and then process it on the main thread loop, locking whenever the queue is accessed.
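
A minimal sketch of the queue-and-drain pattern described above, not the code from this PR. The names `ReloadRequest`, `ReloadQueue`, `Queue`, `Process`, and `RunReloadQueueDemo` are hypothetical, and the "Rules" reload type is only a placeholder. The idea is that the thread receiving the request only records it under a mutex, and the main loop later takes the pending requests and does the actual work.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a reload request; the real type is not shown in this PR text.
struct ReloadRequest {
    std::string type; // e.g. "Logs"
};

// Hypothetical queue shared between the console (telnet) thread and the main loop.
class ReloadQueue {
public:
    // Called from the thread that receives the request. It only records the request.
    void Queue(ReloadRequest request) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_pending.push(std::move(request));
    }

    // Called from the main loop. Pending requests are moved out under the lock,
    // and the (potentially slow) handler runs after the lock has been released.
    template <typename Handler>
    std::size_t Process(Handler&& handler) {
        std::queue<ReloadRequest> batch;
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            std::swap(batch, m_pending);
        }
        std::size_t processed = 0;
        while (!batch.empty()) {
            handler(batch.front());
            batch.pop();
            ++processed;
        }
        return processed;
    }

private:
    std::mutex m_mutex;
    std::queue<ReloadRequest> m_pending;
};

// Demo: a second thread queues requests, then the calling thread drains them.
inline std::vector<std::string> RunReloadQueueDemo() {
    ReloadQueue queue;
    std::thread console([&queue] {
        queue.Queue({"Logs"});
        queue.Queue({"Rules"}); // placeholder reload type for illustration
    });
    console.join();

    std::vector<std::string> handled;
    const std::size_t count = queue.Process([&handled](const ReloadRequest& r) {
        handled.push_back(r.type);
    });
    assert(count == handled.size());
    return handled;
}
```

In this sketch the handler never runs on the console thread, so database access and allocation done by a reload would stay on the main loop. Swapping the queue out under the lock keeps the critical section short; whether the actual change does this or holds the lock while processing is not stated here.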

Observed crash

(gdb) where
#0  futex_wait (private=0, expected=2, futex_word=0x7f09bdda3c60 <main_arena>) at ../sysdeps/nptl/futex-internal.h:146
#1  __GI___lll_lock_wait_private (futex=futex@entry=0x7f09bdda3c60 <main_arena>) at ./nptl/lowlevellock.c:34
#2  0x00007f09bdc69bb8 in __GI___libc_malloc (bytes=bytes@entry=4096) at ./malloc/malloc.c:3321
#3  0x00007f09bdc468cc in __GI__IO_file_doallocate (fp=0x564a7a6a81e0) at ./libio/filedoalloc.c:101
#4  0x00007f09bdc540a0 in __GI__IO_doallocbuf (fp=0x564a7a6a81e0) at ./libio/libioP.h:947
#5  __GI__IO_doallocbuf (fp=fp@entry=0x564a7a6a81e0) at ./libio/genops.c:342
#6  0x00007f09bdc53204 in _IO_new_file_underflow (fp=0x564a7a6a81e0) at ./libio/fileops.c:485
#7  0x00007f09bdc54152 in __GI__IO_default_uflow (fp=0x564a7a6a81e0) at ./libio/libioP.h:947
#8  0x00007f09bdc47f8a in __GI__IO_getline_info (fp=fp@entry=0x564a7a6a81e0, buf=buf@entry=0x7ffe3fb60a70 "\220\v\266?\376\177", n=n@entry=127, delim=delim@entry=10,
    extract_delim=extract_delim@entry=1, eof=eof@entry=0x0) at ./libio/iogetline.c:60
#9  0x00007f09bdc48088 in __GI__IO_getline (fp=fp@entry=0x564a7a6a81e0, buf=buf@entry=0x7ffe3fb60a70 "\220\v\266?\376\177", n=n@entry=127, delim=delim@entry=10,
    extract_delim=extract_delim@entry=1) at ./libio/iogetline.c:34
#10 0x00007f09bdc470ce in _IO_fgets (buf=0x7ffe3fb60a70 "\220\v\266?\376\177", n=128, fp=0x564a7a6a81e0) at ./libio/iofgets.c:53
#11 0x0000564a6e15f230 in Process::execute (cmd=...) at /home/eqemu/code/common/process/process.cpp:13
#12 0x0000564a6e0054aa in print_trace () at /home/eqemu/code/common/crash.cpp:238
#13 <signal handler called>
#14 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#15 0x00007f09bdc5bf1f in __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#16 0x00007f09bdc0cfb2 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#17 0x00007f09bdbf7472 in __GI_abort () at ./stdlib/abort.c:79
#18 0x00007f09bdc50430 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f09bdd6a459 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#19 0x00007f09bdc6583a in malloc_printerr (str=str@entry=0x7f09bdd6d5f0 "malloc(): unsorted double linked list corrupted") at ./malloc/malloc.c:5660
#20 0x00007f09bdc68b1c in _int_malloc (av=av@entry=0x7f09bdda3c60 <main_arena>, bytes=bytes@entry=8176) at ./malloc/malloc.c:4006
#21 0x00007f09bdc69989 in __GI___libc_malloc (bytes=8176) at ./malloc/malloc.c:3323
#22 0x00007f09bea24179 in ?? () from /lib/x86_64-linux-gnu/libmariadb.so.3
#23 0x00007f09bea258c9 in ?? () from /lib/x86_64-linux-gnu/libmariadb.so.3
#24 0x00007f09bea2af07 in mysql_optionsv () from /lib/x86_64-linux-gnu/libmariadb.so.3
#25 0x00007f09bea2d0c9 in mysql_real_query () from /lib/x86_64-linux-gnu/libmariadb.so.3
#26 0x0000564a6e080eed in DBcore::Open (this=this@entry=0x564a6e4de978 <player_event_logs>, errnum=errnum@entry=0x0, errbuf=errbuf@entry=0x0)
    at /home/eqemu/code/common/dbcore.cpp:264
#27 0x0000564a6e08063d in DBcore::QueryDatabase (this=this@entry=0x564a6e4de978 <player_event_logs>,
    query=query@entry=0x564a7aacc7d0 "SELECT id, event_name, event_enabled, retention_days, discord_webhook_id, etl_enabled FROM player_event_log_settings",
    querylen=querylen@entry=116, retryOnFailureOnce=false) at /home/eqemu/code/common/dbcore.cpp:100
#28 0x0000564a6e080735 in DBcore::QueryDatabase (this=0x564a6e4de978 <player_event_logs>, query=0x6 <error: Cannot access memory at address 0x6>, querylen=116,
    retryOnFailureOnce=true) at /home/eqemu/code/common/dbcore.cpp:116
#29 0x0000564a6e0805ca in DBcore::QueryDatabase (this=0x1977a2, query=..., retryOnFailureOnce=255) at /home/eqemu/code/common/dbcore.cpp:80
#30 0x0000564a6e0c826c in BasePlayerEventLogSettingsRepository::All (db=...)
    at /home/eqemu/code/common/events/../repositories/base/base_player_event_log_settings_repository.h:266
#31 0x0000564a6e0c4802 in PlayerEventLogs::ReloadSettings (this=0x564a6e4de978 <player_event_logs>) at /home/eqemu/code/common/events/player_event_logs.cpp:1045
#32 0x0000564a6dfc3dc7 in ZSList::SendServerReload (this=0x564a6e4d70b0 <zoneserver_list>, type=<optimized out>, packet=<optimized out>)
    at /home/eqemu/code/world/zonelist.cpp:974
#33 0x0000564a6df3d020 in EQEmuApiWorldDataService::reload (r=..., args=std::vector of length 2, capacity 4 = {...})
    at /home/eqemu/code/world/eqemu_api_world_data_service.cpp:165
#34 0x0000564a6df3dd8b in EQEmuApiWorldDataService::get (r=..., args=std::vector of length 2, capacity 4 = {...}) at /home/eqemu/code/world/eqemu_api_world_data_service.cpp:197
#35 0x0000564a6df1fa22 in ConsoleApi (connection=0x564a7baa22d0, command=..., args=std::vector of length 2, capacity 4 = {...}) at /home/eqemu/code/world/console.cpp:98
#36 0x0000564a6df29162 in std::__invoke_impl<void, void (*&)(EQ::Net::ConsoleServerConnection*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&), EQ::Net::ConsoleServerConnection*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&> (
    __f=<error reading variable: Cannot access memory at address 0x0>, __args=std::vector of length 160150457351686043, capacity 34823723970992417 = {...},
    __args=std::vector of length 160150457351686043, capacity 34823723970992417 = {...}, __args=std::vector of length 160150457351686043, capacity 34823723970992417 = {...})
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/invoke.h:61
#37 std::__invoke<void (*&)(EQ::Net::ConsoleServerConnection*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&), EQ::Net::ConsoleServerConnection*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&> (
    __fn=<error reading variable: Cannot access memory at address 0x0>, __args=std::vector of length 160150457351686043, capacity 34823723970992417 = {...},
    __args=std::vector of length 160150457351686043, capacity 34823723970992417 = {...}, __args=std::vector of length 160150457351686043, capacity 34823723970992417 = {...})
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/invoke.h:96
#38 std::_Bind<void (*(std::_Placeholder<1>, std::_Placeholder<2>, std::_Placeholder<3>))(EQ::Net::ConsoleServerConnection*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)>::__call<void, EQ::Net::ConsoleServerConnection*&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, 0ul, 1ul, 2ul>(std::tuple<EQ::Net::ConsoleServerConnection*&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&>&&, std::_Index_tuple<0ul, 1ul, 2ul>) (this=0x0, __args=...)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/functional:484
#39 std::_Bind<void (*(std::_Placeholder<1>, std::_Placeholder<2>, std::_Placeholder<3>))(EQ::Net::ConsoleServerConnection*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)>::operator()<EQ::Net::ConsoleServerConnection*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, void>(EQ::Net::ConsoleServerConnection*&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&) (this=0x0, __args=std::vector of length 160150457351686043, capacity 34823723970992417 = {...},
    __args=std::vector of length 160150457351686043, capacity 34823723970992417 = {...}, __args=std::vector of length 160150457351686043, capacity 34823723970992417 = {...})
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/functional:567
#40 std::__invoke_impl<void, std::_Bind<void (*(std::_Placeholder<1>, std::_Placeholder<2>, std::_Placeholder<3>))(EQ::Net::ConsoleServerConnection*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)>&, EQ::Net::ConsoleServerConnection*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&>(std::__invoke_other, std::_Bind<void (*(std::_Placeholder<1>, std::_Placeholder<2>, std::_Placeholder<3>))(EQ::Net::ConsoleServerConnection*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)>&, EQ::Net::ConsoleServerConnection*&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&) (__f=...,
    __args=std::vector of length 160150457351686043, capacity 34823723970992417 = {...}, __args=std::vector of length 160150457351686043, capacity 34823723970992417 = {...},

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Testing

 World |    Info    | reload Queueing reload of type [Logs] to zones 
 World |    Info    | Process Sending reload of type [Logs] to zones 
 World |    Info    | SendServerReload Sending reload to all zones for type [Logs] 
 World |    Info    | LoadLogDatabaseSettings Loaded [84] log categories 
 World |    Info    | LoadLogDatabaseSettings Loaded [1] Discord webhooks 
  Zone |    Info    | QueueReload Queuing reload for [Logs] (13) to reload in [1 Second] 
  Zone |    Info    | QueueReload Queuing reload for [Logs] (13) to reload in [1 Second] 
  Zone |    Info    | QueueReload Queuing reload for [Logs] (13) to reload in [1 Second] 
  Zone |    Info    | QueueReload Queuing reload for [Logs] (13) to reload in [1 Second] 
  Zone |    Info    | ProcessReload Reloading [Logs] (13) zone booted required [false] 
  Zone |    Info    | LoadLogDatabaseSettings Loaded [84] log categories 
  Zone |    Info    | LoadLogDatabaseSettings Loaded [1] Discord webhooks 
  Zone |    Info    | ProcessReload Reloaded [Logs] (13) 
  Zone |    Info    | ProcessReload Reloading [Logs] (13) zone booted required [false] 
  Zone |    Info    | LoadLogDatabaseSettings Loaded [84] log categories 
  Zone |    Info    | LoadLogDatabaseSettings Loaded [1] Discord webhooks 
  Zone |    Info    | ProcessReload Reloaded [Logs] (13) 
  Zone |    Info    | ProcessReload Reloading [Logs] (13) zone booted required [false] 
  Zone |    Info    | LoadLogDatabaseSettings Loaded [84] log categories 
  Zone |    Info    | LoadLogDatabaseSettings Loaded [1] Discord webhooks 
  Zone |    Info    | ProcessReload Reloaded [Logs] (13) 
  Zone |    Info    | ProcessReload Reloading [Logs] (13) zone booted required [false] 
  Zone |    Info    | LoadLogDatabaseSettings Loaded [84] log categories 
  Zone |    Info    | LoadLogDatabaseSettings Loaded [1] Discord webhooks 
  Zone |    Info    | ProcessReload Reloaded [Logs] (13) 

Checklist

  • I have tested my changes
  • I have performed a self-review of my code, ensuring that variables, functions and methods are named in a human-readable way and that comments are added only where the naming of variables, functions and methods can't give enough context.
  • I own the changes of my code and take responsibility for the potential issues that occur

@Akkadius Akkadius merged commit c7a4634 into master May 16, 2025
2 checks passed
@Akkadius Akkadius deleted the akkadius/fix-world-reload-deadlock branch May 16, 2025 18:38

3 participants