fix(profiling): clear stale StackChunk::previous #17043
Merged
gh-worker-dd-mergequeue-cf854d[bot] merged 1 commit on Mar 20, 2026
Co-authored-by: KowalskiThomas <14239160+KowalskiThomas@users.noreply.github.com>
taegyunkim
approved these changes
Mar 20, 2026
gnufede
approved these changes
Mar 20, 2026
Description
This fixes a crash happening in `Frame::read` caused by stale `previous` `StackChunk` entries persisting across thread iterations during stack sampling.

Root Cause

When the Sampling Thread samples more than one Thread, it uses the same global `StackChunk` for each Thread's stack chain. `StackChunk::update_with_depth` recursively copies the linked list of `_PyStackChunk`s.

However, when a stack chunk has no previous chunk, we would not clear the old `previous` pointer. This left stale `StackChunk` entries from previously-sampled threads in the chain.

When a subsequent Thread's frame address happened to fall within the remote address range of a stale chunk's `origin`, `StackChunk::resolve` would return a pointer into the stale local buffer. The stale data contained garbage field values, which would result in invalid accesses.

This is the same crash signature as #16519 (which fixed a race condition on `copied_size`) and #16631 (which added full-frame bounds checking). The stale `previous` chain was an additional vector for the same class of bug.

This is the crash we would see: