The script and C++ program for reproducing this behavior are available here.
$ clang++ -std=c++23 -O3 -g -o test test.cc
$ python3 test.py
[lldb:stdout] (lldb) target create test
[lldb:stdout] Current executable set to '/Users/vegorov/src/mraleph/iOS-MemoryTest/test' (arm64).
[lldb:stdout] (lldb) breakpoint set --func-regex '^BREAK_HERE$'
[lldb:stdout] Breakpoint 1: where = test`::BREAK_HERE(void *, size_t) at test.cc:14:1, address = 0x0000000100000588
[lldb:stdout] (lldb) breakpoint command add --script-type python 1
[lldb:stdout] (lldb) breakpoint list
[lldb:stdout] Current breakpoints:
[lldb:stdout] 1: regex = '^BREAK_HERE$', locations = 1
[lldb:stdout] Breakpoint commands (Python):
[lldb:stdout] ptr = frame.register["x0"].GetValueAsAddress()
[lldb:stdout] data = bytearray(2)
[lldb:stdout] data[0:2] = b'OK'
[lldb:stdout] error = lldb.SBError()
[lldb:stdout] frame.GetThread().GetProcess().WriteMemory(ptr, data, error)
[lldb:stdout] if not error.Success():
[lldb:stdout]     print(f'Failed to write into {ptr}', error)
[lldb:stdout]     return
[lldb:stdout] return False
[lldb:stdout]
[lldb:stdout] 1.1: where = test`::BREAK_HERE(void *, size_t) at test.cc:14:1, address = test[0x0000000100000588], unresolved, hit count = 0
[lldb:stdout]
[lldb:stdout] (lldb) run
[lldb:stdout] Process 17496 launched: '/Users/vegorov/src/mraleph/iOS-MemoryTest/test' (arm64)
[lldb:stdout] Running Breakpoint tests
[lldb:stdout] All threads are running
[lldb:stdout] Breakpoint stopped working: attempt 2!
[lldb:stdout] Process 17496 stopped
[lldb:stdout] * thread #2, stop reason = signal SIGABRT
[lldb:stdout] frame #0: 0x00000001810e85e8 libsystem_kernel.dylib`__pthread_kill + 8
[lldb:stdout] libsystem_kernel.dylib`__pthread_kill:
[lldb:stdout] -> 0x1810e85e8 <+8>: b.lo 0x1810e8608 ; <+40>
[lldb:stdout] 0x1810e85ec <+12>: pacibsp
[lldb:stdout] 0x1810e85f0 <+16>: stp x29, x30, [sp, #-0x10]!
[lldb:stdout] 0x1810e85f4 <+20>: mov x29, sp
[lldb:stdout] Target 0: (test) stopped.
Observed behavior: the scripted breakpoint fires once, and then LLDB fails to rearm it correctly.
The problem is reproducible interactively (e.g. if you type the same commands manually into the TUI), but not with an equivalent script (e.g. running lldb -b -s script.lldb, with script.lldb containing equivalent commands).
There seems to be a bug in how SendContinuePacketAndWaitForResponse handles interfering async requests. Specifically, this comment and the logic around it seem related:
    // The packet we should resume with. In the future we should check our
    // thread list and "do the right thing" for new threads that show up
    // while we stop and run async packets. Setting the packet to 'c' to
    // continue all threads is the right thing to do 99.99% of the time
    // because if a thread was single stepping, and we sent an interrupt, we
    // will notice above that we didn't stop due to an interrupt but stopped
    // due to stepping and we would _not_ continue. This packet may get
    // modified by the async actions (e.g. to send a signal).
    m_continue_packet = 'c';
    cont_lock.unlock();

    delegate.HandleStopReply();
    if (should_stop)
      return eStateStopped;

    switch (cont_lock.lock()) {
The comment claims that the single-stepping situation is handled correctly, but gdb-remote logging reveals that it is not:
[lldb:stdout] <lldb.process.gdb-remote.async> GDBRemoteClientBase::ContinueLock::lock() resuming with vCont;s:cb65bc
[lldb:stdout] lldb.debugger.event-handler GDBRemoteClientBase::Lock::Lock sent packet: \x03
[lldb:stdout] <lldb.process.gdb-remote.async> GDBRemoteClientBase::SendContinuePacketAndWaitForResponse () got packet: T11thread:cb65ae;threads:cb65ae,cb65bc;thread-pcs:1810e1af8,1000006a0;jstopinfo:5b7b22746964223a31333332393833382c226d6574797065223a352c226d6564617461223a5b36353533392c31375d2c22726561736f6e223a22657863657074696f6e227d2c7b22746964223a31333332393835322c226d6574797065223a362c226d6564617461223a5b312c305d2c22726561736f6e223a22657863657074696f6e227d5d;00:fcffffffffffffff;01:0000000000000000;02:030e000000000000;03:0000000000000000;04:400ac06c07000000;05:0000000000000000;06:72756e6e696e670a;07:00f8ff3f00f0ffff;08:030e000000000000;09:0200020100000000;0a:1100000000000000;0b:0000000000000000;0c:a4704d8000200000;0d:006c4d0000100000;0e:00700d0000000000;0f:0a00000000000000;10:0302000000000000;11:c05f56ee01000000;12:0000000000000000;13:0070e86f01000000;14:0200000000000000;15:3470e86f01000000;16:0200020100000000;17:a0d9f2ec01000000;18:6cc0f3ec01000000;19:1100000000000000;1a:b0f1f2ec01000000;1b:0000000000000000;1c:0000000000000000;1d:30e8df6f01000000;1e:1461128101000000;1f:d0e7df6f01000000;20:f81a0e8101000000;21:00000040;a1:0000000000000000;a2:80000056;a3:00000000;metype:5;mecount:2;medata:10003;medata:11;memory:0x16fdfe830=70e8df6f010000003c06000001000000;memory:0x16fdfe870=d0eedf6f01000000a47dd68001000000;
[lldb:stdout] <lldb.process.gdb-remote.async> GDBRemoteClientBase::ContinueLock::lock() resuming with c
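To make the stop reply in the log easier to read, here is a minimal Python sketch that splits it into fields and decodes jstopinfo, which is hex-encoded JSON. The helper names parse_stop_reply and decode_jstopinfo are made up for this illustration and are not part of LLDB. The packet text is copied from the log above with the register and memory entries left out.

```python
import json

# Fields copied from the T11 stop reply in the log above. The register
# (00: .. a3:) and memory: entries are omitted for brevity.
STOP_REPLY = (
    "T11thread:cb65ae;threads:cb65ae,cb65bc;"
    "thread-pcs:1810e1af8,1000006a0;"
    "jstopinfo:"
    "5b7b22746964223a31333332393833382c"
    "226d6574797065223a352c"
    "226d6564617461223a5b36353533392c31375d2c"
    "22726561736f6e223a22657863657074696f6e227d2c"
    "7b22746964223a31333332393835322c"
    "226d6574797065223a362c"
    "226d6564617461223a5b312c305d2c"
    "22726561736f6e223a22657863657074696f6e227d5d;"
)


def parse_stop_reply(packet):
    """Split a 'T' stop reply into its signal number and key:value fields."""
    signal = int(packet[1:3], 16)
    fields = {}
    for item in packet[3:].split(";"):
        if item:
            key, _, value = item.partition(":")
            fields[key] = value
    return signal, fields


def decode_jstopinfo(hex_text):
    """Decode the hex-encoded JSON array carried in the jstopinfo field."""
    return json.loads(bytes.fromhex(hex_text).decode("ascii"))


signal, fields = parse_stop_reply(STOP_REPLY)
threads = [int(t, 16) for t in fields["threads"].split(",")]
pcs = [int(pc, 16) for pc in fields["thread-pcs"].split(",")]
stop_info = decode_jstopinfo(fields["jstopinfo"])

for entry in stop_info:
    # tid is decimal in the JSON; print it as hex to match the packet.
    print(hex(entry["tid"]), entry["metype"], entry["medata"])
```

The decoded array has one entry per thread. The entry for cb65ae carries metype 5 and the entry for cb65bc (the thread that was being single stepped) carries metype 6, which as far as I know correspond to EXC_SOFTWARE and EXC_BREAKPOINT in the Mach exception numbering.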
So we are trying to single-step thread cb65bc while concurrently deciding to send some other request, which triggers Lock::Lock -> SyncWithContinueThread, which sends \x03. After this interrupt we end up continuing all threads (c), which is not what we are supposed to do here because we still have a pending single step.
The concurrent request (which triggers the interrupt) comes from this location:
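The sequence described above can be replayed with a toy model. This is only a sketch of the suspected behavior, not LLDB's implementation: the class ToyContinueLoop and its methods are hypothetical, and the exact position of the async packet relative to the resume is an assumption. The packet strings are taken from the log and from the DoReadMemory frame in the backtrace.

```python
from dataclasses import dataclass, field


@dataclass
class ToyContinueLoop:
    """Toy model of the resume logic; NOT LLDB's actual implementation."""

    continue_packet: str
    sent: list = field(default_factory=list)

    def resume(self):
        # Models "ContinueLock::lock() resuming with ..." from the log.
        self.sent.append(self.continue_packet)

    def async_request(self, payload):
        # An async packet (here a memory read) first interrupts the inferior.
        self.sent.append("\x03")
        # After the interrupt stop, the resume packet is reset to plain 'c'.
        # This is the step that discards the pending single step.
        self.continue_packet = "c"
        # Assumed ordering: the async packet goes out, then the process resumes.
        self.sent.append(payload)
        self.resume()


def replay():
    loop = ToyContinueLoop(continue_packet="vCont;s:cb65bc")
    loop.resume()                          # single step of thread cb65bc
    loop.async_request("x1810e1a00,200")   # memory read from the event handler
    return loop.sent


if __name__ == "__main__":
    for packet in replay():
        print(repr(packet))
```

In this model the last packet sent is c rather than vCont;s:cb65bc, which mirrors the two "resuming with" lines in the log.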
frame #0: 0x0000000114909328 liblldb.23.0.0git.dylib`lldb_private::process_gdb_remote::GDBRemoteClientBase::Lock::SyncWithContinueThread(this=0x000000016c425d68, interesting=true) at GDBRemoteClientBase.cpp:402:22 [opt]
frame #1: 0x0000000114909138 liblldb.23.0.0git.dylib`lldb_private::process_gdb_remote::GDBRemoteClientBase::Lock::Lock(this=0x000000016c425d68, comm=<unavailable>, payload=(Data = "x1810e1a00,200", Length = 14), interrupt_timeout=<unavailable>) at GDBRemoteClientBase.cpp:371:3 [opt]
frame #2: 0x0000000114908ab4 liblldb.23.0.0git.dylib`lldb_private::process_gdb_remote::GDBRemoteClientBase::Lock::Lock(this=0x000000016c425d68, comm=0x00000007b8c94c88, payload=<unavailable>, interrupt_timeout=<unavailable>) at GDBRemoteClientBase.cpp:363:30 [opt] [inlined]
frame #3: 0x0000000114908aa0 liblldb.23.0.0git.dylib`lldb_private::process_gdb_remote::GDBRemoteClientBase::SendPacketAndWaitForResponse(this=0x00000007b8c94c88, payload=(Data = "x1810e1a00,200", Length = 14), response=0x000000016c425e20, interrupt_timeout=<unavailable>, sync_on_timeout=true) at GDBRemoteClientBase.cpp:186:8 [opt]
frame #4: 0x0000000114933b14 liblldb.23.0.0git.dylib`lldb_private::process_gdb_remote::ProcessGDBRemote::DoReadMemory(this=0x00000007b8c94000, addr=6460152320, buf=0x00000007b47d5200, size=512, error=0x000000016c426150) at ProcessGDBRemote.cpp:2748:18 [opt]
frame #5: 0x0000000114646d50 liblldb.23.0.0git.dylib`lldb_private::Process::ReadMemoryFromInferior(this=0x00000007b8c94000, addr=6460152320, buf=0x00000007b47d5200, size=512, error=0x000000016c426150) at Process.cpp:2251:9 [opt]
frame #6: 0x000000011462d65c liblldb.23.0.0git.dylib`lldb_private::MemoryCache::GetL2CacheLine(this=0x00000007b8c949b0, line_base_addr=6460152320, error=0x000000016c426150) at Memory.cpp:138:41 [opt]
frame #7: 0x000000011462d9f8 liblldb.23.0.0git.dylib`lldb_private::MemoryCache::Read(this=0x00000007b8c949b0, addr=6460152560, dst=0x00000007b47a03c0, dst_len=44, error=0x000000016c426150) at Memory.cpp:209:35 [opt]
frame #8: 0x0000000114692e84 liblldb.23.0.0git.dylib`lldb_private::Target::ReadMemory(this=0x00000007b6e01400, addr=<unavailable>, dst=0x00000007b47a03c0, dst_len=44, error=0x000000016c426150, force_live_memory=<unavailable>, load_addr_ptr=0x0000000000000000, did_read_live_memory=0x0000000000000000) at Target.cpp:2078:34 [opt]
frame #9: 0x0000000114a4ae9c liblldb.23.0.0git.dylib`UnwindAssemblyInstEmulation::GetNonCallSiteUnwindPlanFromAssembly(this=0x00000007b99c0300, range=0x000000016c4261d0, thread=<unavailable>, unwind_plan=0x00000007b8e6b8f8) at UnwindAssemblyInstEmulation.cpp:45:33 [opt]
frame #10: 0x00000001145e5104 liblldb.23.0.0git.dylib`lldb_private::FuncUnwinders::GetAssemblyUnwindPlan(this=0x00000007b41d5c18, target=0x00000007b6e01400, thread=0x00000007b71ad198) at FuncUnwinders.cpp:339:31 [opt]
frame #11: 0x00000001145e5418 liblldb.23.0.0git.dylib`lldb_private::FuncUnwinders::GetUnwindPlanAtNonCallSite(this=0x00000007b41d5c18, target=0x00000007b6e01400, thread=0x00000007b71ad198) at FuncUnwinders.cpp:390:7 [opt]
frame #12: 0x000000011465ff00 liblldb.23.0.0git.dylib`lldb_private::RegisterContextUnwind::GetFullUnwindPlanForFrame(this=0x00000007b99c6400) at RegisterContextUnwind.cpp:957:46 [opt]
frame #13: 0x000000011465d8f0 liblldb.23.0.0git.dylib`lldb_private::RegisterContextUnwind::InitializeZerothFrame(this=0x00000007b99c6400) at RegisterContextUnwind.cpp:211:27 [opt]
frame #14: 0x000000011465d454 liblldb.23.0.0git.dylib`lldb_private::RegisterContextUnwind::RegisterContextUnwind(this=0x00000007b99c6400, thread=0x00000007b71ad198, next_frame=nullptr, sym_ctx=0x00000007b8de89d0, frame_number=0, unwind_lldb=0x00000007b8ff8aa0) at RegisterContextUnwind.cpp:81:5 [opt]
frame #15: 0x00000001146dd740 liblldb.23.0.0git.dylib`lldb_private::UnwindLLDB::AddFirstFrame(this=0x00000007b8ff8aa0) at UnwindLLDB.cpp:80:40 [opt]
frame #16: 0x00000001146de594 liblldb.23.0.0git.dylib`lldb_private::UnwindLLDB::DoGetFrameInfoAtIndex(this=0x00000007b8ff8aa0, idx=0, cfa=0x000000016c4269b0, pc=0x000000016c4269b8, behaves_like_zeroth_frame=0x000000016c4269af) at UnwindLLDB.cpp:398:10 [opt]
frame #17: 0x0000000114674fa0 liblldb.23.0.0git.dylib`lldb_private::Unwind::GetFrameInfoAtIndex(this=0x00000007b8ff8aa0, frame_idx=<unavailable>, cfa=0x000000016c4269b0, pc=0x000000016c4269b8, behaves_like_zeroth_frame=0x000000016c4269af) at Unwind.h:53:12 [opt] [inlined]
frame #18: 0x0000000114674f78 liblldb.23.0.0git.dylib`lldb_private::StackFrameList::FetchFramesUpTo(this=0x00000007b41d5a58, end_idx=<unavailable>, allow_interrupt=AllowInterruption) at StackFrameList.cpp:494:41 [opt]
frame #19: 0x0000000114676784 liblldb.23.0.0git.dylib`lldb_private::StackFrameList::GetFramesUpTo(this=0x00000007b41d5a58, end_idx=<unavailable>, allow_interrupt=AllowInterruption) at StackFrameList.cpp:427:21 [opt] [inlined]
frame #20: 0x0000000114676764 liblldb.23.0.0git.dylib`lldb_private::StackFrameList::GetFrameAtIndex(this=0x00000007b41d5a58, idx=0) at StackFrameList.cpp:697:7 [opt]
frame #21: 0x00000001146acf34 liblldb.23.0.0git.dylib`lldb_private::Thread::GetSelectedFrame(this=0x00000007b71ad198, select_most_relevant=<unavailable>) at Thread.cpp:288:48 [opt]
frame #22: 0x0000000114627a0c liblldb.23.0.0git.dylib`lldb_private::ExecutionContextRef::SetProcessPtr(this=0x000000016c426e30, process=<unavailable>, adopt_selected=<unavailable>) at ExecutionContext.cpp:593:26 [opt]
frame #23: 0x0000000114627b68 liblldb.23.0.0git.dylib`lldb_private::ExecutionContextRef::ExecutionContextRef(this=0x000000016c426e30, process=<unavailable>, adopt_selected=<unavailable>) at ExecutionContext.cpp:443:3 [opt] [inlined]
frame #24: 0x0000000114627b40 liblldb.23.0.0git.dylib`lldb_private::ExecutionContextRef::ExecutionContextRef(this=0x000000016c426e30, process=<unavailable>, adopt_selected=<unavailable>) at ExecutionContext.cpp:442:66 [opt]
frame #25: 0x00000001144922dc liblldb.23.0.0git.dylib`lldb_private::Debugger::DefaultEventHandler(this=0x00000007b707c000) at Debugger.cpp:2251:31 [opt]
frame #26: 0x0000000114497424 liblldb.23.0.0git.dylib`lldb_private::Debugger::StartEventHandlerThread()::$_0::operator()(this=<unavailable>) const at Debugger.cpp:2333:42 [opt] [inlined]
The GDBRemoteClientBase snippet quoted earlier is from llvm-project/lldb/source/Plugins/Process/gdb-remote/GDBRemoteClientBase.cpp, lines 131 to 146 at commit 3f3d27b.