fix(profiling): clean up containers post-fork#17042
fix(profiling): clean up containers post-fork#17042gh-worker-dd-mergequeue-cf854d[bot] merged 1 commit into
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
Codeowners resolved as |
c76a8d2 to
6eef3b3
Compare
15a0afe to
c0027c0
Compare
c0027c0 to
a494dba
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a494dbab41
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Co-authored-by: KowalskiThomas <14239160+KowalskiThomas@users.noreply.github.com>
a494dba to
bcdd325
Compare
|
@codex review again |
|
Codex Review: Didn't find any major issues. Keep them coming! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
|
This change is marked for backport to 4.6, but it conflicts with that branch. The command used to test backporting was |
|
This change is marked for backport to 4.5, but it conflicts with that branch. The command used to test backporting was |
|
This change is marked for backport to 4.7, but it conflicts with that branch. The command used to test backporting was |
Backport of PR #17042 (commit 39a5f23) to branch 4.7. ## Description Fixes a crash in the Profiler (~100/week) when `fork` is called while the Sampling Thread is actively modifying the `LRUCache` for Frames. The root cause: `postfork_child` called `frame_cache_.clear()`, which traverses the `std::list` to free nodes. If the sampling thread was mid-operation (splice in `lookup`, `emplace_front`/`pop_back` in `store`) at fork time, the list's internal pointers can be in a corrupted state in the child, causing the crash. Fix: replace `clear()` with `postfork_child()` which uses placement new to construct fresh empty containers, abandoning the old data as an intentional one-time leak. Also adds `prefork()`/`postfork_parent()` atfork handlers registered via `pthread_atfork` so that `restart_after_fork` uses a pre-saved `was_running_at_fork_` flag instead of relying on `thread_seq_num` parity (which `prefork` itself changes). ## Notes on conflict resolution No semantic conflicts. The 4.7 branch uses the same `stack_` function-name prefix as `main`, so the changes apply verbatim.
## Description This PR fixes a crash in the Profiler that occurred when `fork` was called while the Sampling Thread was actively modifying the `LRUCache` for Frames (about 100 crashes per week). ``` Error UnixSignal: Process terminated with SI_TKILL (SIGABRT) #0 0x0000ffffb293cb3c raise #1 0x0000ffffb2927e00 abort #2 0x0000ffffb2996f98 free #3 0x0000ffffa3f47cc4 std::_List_base<std::pair<unsigned long, std::unique_ptr<Frame, std::default_delete<Frame> > >, std::allocator<std::pair<unsigned long, std::unique_ptr<Frame, std::default_delete<Frame> > > > >::_M_clear #4 0x0000ffffa3f50ea4 Datadog::Sampler::postfork_child #5 0x0000ffffa3f51cc0 _stack_atfork_child #6 0x0000ffffb29c1cd4 __libc_fork #7 0x0000aaaabbd4b998 os_fork_impl #8 0x0000aaaabbdaff58 cfunction_vectorcall_NOARGS.llvm.1085662745629047260 #9 0x0000aaaabbe197d4 _PyEval_EvalFrameDefault #10 0x0000aaaabbd87548 _PyObject_VectorcallTstate.llvm.15733451206353458062 #11 0x0000aaaabbd89df8 object_vacall.llvm.15733451206353458062 #12 0x0000aaaabbe7f378 PyObject_CallFunctionObjArgs #13 0x0000ffffb1a4453c WraptFunctionWrapperBase_call (/project/src/wrapt/_wrappers.c:2455:14) #14 0x0000aaaabbe15fe0 _PyEval_EvalFrameDefault #15 0x0000aaaabbdcaa78 slot_tp_init #16 0x0000aaaabbdc32b0 type_call #17 0x0000aaaabbe15fe0 _PyEval_EvalFrameDefault #18 0x0000aaaabbd8a464 method_vectorcall.llvm.15917642511948773313 #19 0x0000aaaabbe197d4 _PyEval_EvalFrameDefault #20 0x0000aaaabbd8e354 gen_iternext #21 0x0000aaaabbde9a58 builtin_next #22 0x0000aaaabbe187d0 _PyEval_EvalFrameDefault #23 0x0000aaaabbd8a4c4 method_vectorcall.llvm.15917642511948773313 #24 0x0000aaaabbe1a02c _PyEval_EvalFrameDefault #25 0x0000aaaabbd8a4c4 method_vectorcall.llvm.15917642511948773313 #26 0x0000aaaabbe1a02c _PyEval_EvalFrameDefault #27 0x0000aaaabbd89480 _PyObject_Call_Prepend #28 0x0000aaaabbebba50 slot_tp_call #29 0x0000aaaabbe15fe0 _PyEval_EvalFrameDefault #30 0x0000aaaabbf15ea4 PyEval_EvalCode #31 0x0000aaaabbf48b8c run_mod.llvm.17615948296688877498 #32 0x0000aaaabbcb34b4 pyrun_file #33 0x0000aaaabbcb288c _PyRun_SimpleFileObject #34 0x0000aaaabbcb2488 _PyRun_AnyFileObject #35 0x0000aaaabbcbfc1c pymain_run_file_obj #36 0x0000aaaabbcbf994 pymain_run_file #37 0x0000aaaabbf60d2c Py_RunMain #38 0x0000aaaabbf611f0 pymain_main.llvm.144270552765144348 #39 0x0000aaaabbe589d0 main #40 0x0000ffffb2928598 __libc_start_main ``` The Sampling Thread is not stopped before forking. This means when `fork` is called, the Sampling Thread could be mutating `frame_cache_` (backed by `std::list` + `std::unordered_map`) via `lookup` (splice) and `store` (`emplace_front`/`pop_back`). As a result, those data structures could end up in a corrupt state at the moment the child process is created. When the child process resumes work, calling `frame_cache_.clear()` post-fork, the crash can happen. The most natural thing to do to avoid this would be to ask the Sampling Thread to stop at fork time, then making sure it is restarted both in the parent and the child post-fork (if it was running). However, this approach has a significant caveat: depending on the current state of adaptive sampling, doing this may result in blocking the fork for up to 100ms (worst case scenario) which is not an acceptable side effect for the user. By not stopping the Thread but cleaning up in the child process, we keep state consistency without having to wait, at the cost of a small one-time leak (the data in the containers that we placement-new). Co-authored-by: thomas.kowalski <thomas.kowalski@datadoghq.com>
…s to CI **Deterministic unit test** (`test_ddup_atfork_handler_registered_before_stack_sampler`): - Mocks `ddup.start` and `StackCollector._init` to record their call order during profiler startup, then asserts ddup was first. - POSIX guarantees FIFO ordering for post-fork child handlers, so whichever component calls pthread_atfork first gets its child handler run first. ddup must run first to rebuild profile state before the stack sampler clears and re-registers threads. - Catches the regression deterministically without needing timing luck or fork() calls — a swap of the init order in profiler.py fails immediately. - Regression guard for PRs #17183/#17042/#18063. **CI: stress-ng CPU saturation + 20 iterations**: - Runs `stress-ng --cpu $(nproc)` in the background during the fork loop to saturate all CPUs and widen the pthread_atfork timing race window — the same load profile that makes the bug manifest in production Gunicorn/gthread. - Increased iterations from 10 to 20 to give more chances to surface the race. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Description
This PR fixes a crash in the Profiler that occurred when
forkwas called while the Sampling Thread was actively modifying theLRUCachefor Frames (about 100 crashes per week).The Sampling Thread is not stopped before forking. This means when
forkis called, the Sampling Thread could be mutatingframe_cache_(backed bystd::list+std::unordered_map) vialookup(splice) andstore(emplace_front/pop_back). As a result, those data structures could end up in a corrupt state at the moment the child process is created. When the child process resumes work, callingframe_cache_.clear()post-fork, the crash can happen.The most natural thing to do to avoid this would be to ask the Sampling Thread to stop at fork time, then making sure it is restarted both in the parent and the child post-fork (if it was running). However, this approach has a significant caveat: depending on the current state of adaptive sampling, doing this may result in blocking the fork for up to 100ms (worst case scenario) which is not an acceptable side effect for the user.
By not stopping the Thread but cleaning up in the child process, we keep state consistency without having to wait, at the cost of a small one-time leak (the data in the containers that we placement-new).