chore(profiling): address issue found by fuzzer by KowalskiThomas · Pull Request #16631 · DataDog/dd-trace-py

KowalskiThomas · 2026-02-24T07:27:42Z

Description

https://datadoghq.atlassian.net/browse/PROF-13838

This was reported by the fuzzer and although I don't think we've ever seen it happen in real life (according to crash stacks), it's still valid and worth fixing.

cit-pr-commenter-54b7da · 2026-02-24T07:27:57Z

Codeowners resolved as

ddtrace/internal/datadog/profiling/stack/src/echion/stack_chunk.cc      @DataDog/profiling-python

taegyunkim

I was wondering how this can happen, maybe there's a chance where we copy over a StackChunk that is of size x, but then it has been incremented as the program executes, and we get an address that is above x?

taegyunkim · 2026-02-24T10:20:47Z

https://gitlab.ddbuild.io/DataDog/apm-reliability/dd-trace-py/-/jobs/1450393487

the crash from test would need some attention

There are couple more suites that crashed.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 29d5ba4db2

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

## Description This fixes a crash happening in `Frame::read` caused by stale `previous` `StackChunk` entries persisting across thread iterations during stack sampling. ### Root Cause When the Sampling Thread samples more than one Thread, it uses the same global `StackChunk` for each Thread's stack chain. `StackChunk::update_with_depth` recursively copies the linked list of `_PyStackChunk`'s. However, when a stack chunk has no previous chunk, we would not clear the old `previous` pointer. This left stale `StackChunk` entries from previously-sampled threads in the chain. When a subsequent Thread's frame address happened to fall within the remote address range of a stale chunk's `origin`, `StackChunk::resolve` would return a pointer into the stale local buffer. The stale data contained garbage field values, which would result in invalid accesses. This is the same crash signature as #16519 (which fixed a race condition on `copied_size`) and #16631 (which added full-frame bounds checking). The stale `previous` chain was an additional vector for the same class of bug. This is the crash we would see: ``` #0 0x00007fa8a3507fc6 Frame::read #1 0x00007fa8a3508114 unwind_frame #2 0x00007fa8a3509c68 ThreadInfo::unwind #3 0x00007fa8a3509da2 ThreadInfo::sample #4 0x00007fa8a350a02e std::_Function_handler<void (_ts*, ThreadInfo&), Datadog::Sampler::sampling_thread(unsigned long)::{lambda(InterpreterInfo&)#1}::operator()(InterpreterInfo&) const::{lambda(_ts*, ThreadInfo&)#1}>::_M_invoke #5 0x00007fa8a350a2c0 for_each_thread #6 0x00007fa8a350a382 std::_Function_handler<void (InterpreterInfo&), Datadog::Sampler::sampling_thread(unsigned long)::{lambda(InterpreterInfo&)#1}>::_M_invoke #7 0x00007fa8a35072f9 for_each_interp #8 0x00007fa8a350a6e7 Datadog::Sampler::sampling_thread #9 0x00007fa8a350a853 call_sampling_thread ``` Co-authored-by: thomas.kowalski <thomas.kowalski@datadoghq.com>

KowalskiThomas added changelog/no-changelog A changelog entry is not required for this PR. Profiling Continous Profling labels Feb 24, 2026

KowalskiThomas force-pushed the kowalski/chore-profiling-address-issue-found-by-fuzzer branch from d6774d6 to 29d5ba4 Compare February 24, 2026 07:35

This comment has been minimized.

Sign in to view

KowalskiThomas added the identified-by:fuzzing Bug / Regression identified by fuzzing label Feb 24, 2026

taegyunkim approved these changes Feb 24, 2026

View reviewed changes

KowalskiThomas marked this pull request as ready for review February 24, 2026 10:18

KowalskiThomas requested a review from a team as a code owner February 24, 2026 10:18

KowalskiThomas requested a review from vlad-scherbich February 24, 2026 10:18

chatgpt-codex-connector Bot reviewed Feb 24, 2026

View reviewed changes

Comment thread ddtrace/internal/datadog/profiling/stack/echion/echion/stack_chunk.h Outdated

P403n1x87 reviewed Feb 24, 2026

View reviewed changes

Comment thread ddtrace/internal/datadog/profiling/stack/src/echion/stack_chunk.cc Outdated

vlad-scherbich approved these changes Mar 9, 2026

View reviewed changes

taegyunkim reviewed Mar 9, 2026

View reviewed changes

Comment thread ddtrace/internal/datadog/profiling/stack/src/echion/stack_chunk.cc Outdated

KowalskiThomas added 3 commits March 10, 2026 10:48

chore(profiling): address issue found by fuzzer

7f77a86

chore: review comments + ci failures

f0d33d1

chore: fix include issue

18d6d4d

KowalskiThomas force-pushed the kowalski/chore-profiling-address-issue-found-by-fuzzer branch from 6a66816 to 18d6d4d Compare March 10, 2026 09:48

gh-worker-dd-mergequeue-cf854d Bot merged commit 88f6b34 into main Mar 10, 2026
384 checks passed

gh-worker-dd-mergequeue-cf854d Bot deleted the kowalski/chore-profiling-address-issue-found-by-fuzzer branch March 10, 2026 11:17

KowalskiThomas mentioned this pull request Mar 20, 2026

fix(profiling): clear stale StackChunk::previous #17043

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

chore(profiling): address issue found by fuzzer#16631

chore(profiling): address issue found by fuzzer#16631
gh-worker-dd-mergequeue-cf854d[bot] merged 3 commits into
mainfrom
kowalski/chore-profiling-address-issue-found-by-fuzzer

KowalskiThomas commented Feb 24, 2026 •

edited

Loading

Uh oh!

cit-pr-commenter-54b7da Bot commented Feb 24, 2026 •

edited

Loading

Uh oh!

This comment has been minimized.

taegyunkim left a comment

Uh oh!

taegyunkim commented Feb 24, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

KowalskiThomas commented Feb 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Uh oh!

cit-pr-commenter-54b7da Bot commented Feb 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codeowners resolved as

Uh oh!

This comment has been minimized.

taegyunkim left a comment

Choose a reason for hiding this comment

Uh oh!

taegyunkim commented Feb 24, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

KowalskiThomas commented Feb 24, 2026 •

edited

Loading

cit-pr-commenter-54b7da Bot commented Feb 24, 2026 •

edited

Loading