Add seq_num propagation to GPU kernel events in Kineto trace output #1296

Closed
mdlogic wants to merge 1 commit into pytorch:main from mdlogic:export-D96145504

mdlogic commented Mar 11, 2026

Summary:
Propagate the NCCL collective sequence number (Seq) from CPU-side
record_param_comms events to their linked GPU kernel events in the
chrome trace JSON output.

CPU events already carry the Seq field via generic metadata serialization.
This change copies it to CONCURRENT_KERNEL events so that GPU-level
collective operations can also be correlated across ranks.
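
To illustrate what cross-rank correlation could look like on the consumer side, here is a minimal sketch that groups GPU kernel events from per-rank chrome traces by their sequence number. It assumes each trace is a JSON object with a `traceEvents` list, that GPU kernel events carry `"cat": "kernel"`, and that the sequence number appears under `args["Seq"]`. The helper name `group_kernels_by_seq`, the kernel names, and the timestamps are hypothetical.

```python
from collections import defaultdict


def group_kernels_by_seq(traces_by_rank):
    """Map each Seq value to {rank: [kernel event, ...]}.

    Assumes GPU kernel events have cat == "kernel" and carry the
    sequence number under args["Seq"]; events without it are skipped.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for rank, trace in traces_by_rank.items():
        for event in trace.get("traceEvents", []):
            if event.get("cat") != "kernel":
                continue
            seq = event.get("args", {}).get("Seq")
            if seq is None:
                continue
            grouped[seq][rank].append(event)
    # Convert nested defaultdicts to plain dicts for predictable lookups.
    return {seq: dict(by_rank) for seq, by_rank in grouped.items()}


# Hypothetical two-rank example; names and timestamps are made up.
traces = {
    0: {"traceEvents": [
        {"cat": "kernel", "name": "nccl_allreduce_kernel", "ts": 100, "dur": 40,
         "args": {"Seq": 7}},
        {"cat": "kernel", "name": "gemm_kernel", "ts": 150, "dur": 10, "args": {}},
    ]},
    1: {"traceEvents": [
        {"cat": "kernel", "name": "nccl_allreduce_kernel", "ts": 130, "dur": 25,
         "args": {"Seq": 7}},
    ]},
}

by_seq = group_kernels_by_seq(traces)
# Difference in kernel start time between rank 1 and rank 0 for Seq 7.
start_skew = by_seq[7][1][0]["ts"] - by_seq[7][0][0]["ts"]
```

Grouping by `Seq` rather than by kernel name or timestamp is the point of the sketch: the same collective can start at different times on each rank, so a shared sequence number is what makes the per-rank kernels comparable.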

Changes:

  • output_json.cpp: Add kSeqNum constant and read Seq from the linked
    CPU collective record's metadata, appending it to GPU kernel event args
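
The change listed above can be modeled roughly as follows. This is a Python sketch of the idea only, not the C++ code in output_json.cpp: the function `kernel_args_with_seq`, the dict-based event shapes, and the `linked_cpu_metadata` parameter are hypothetical stand-ins for Kineto's internal activity records. Only the `Seq` key comes from the description.

```python
SEQ_KEY = "Seq"  # metadata key named in the PR description


def kernel_args_with_seq(kernel_args, linked_cpu_metadata):
    """Return a copy of kernel_args, with Seq added when the linked
    CPU collective record has one. Inputs are left unmodified."""
    args = dict(kernel_args)
    if linked_cpu_metadata is None:
        return args  # kernel has no linked CPU collective record
    seq = linked_cpu_metadata.get(SEQ_KEY)
    if seq is not None:
        args[SEQ_KEY] = seq
    return args


# Hypothetical records: a record_param_comms CPU event and its GPU kernel.
cpu_metadata = {"Collective name": "allreduce", "Seq": 42}
gpu_args = {"stream": 7, "correlation": 1234}

with_seq = kernel_args_with_seq(gpu_args, cpu_metadata)
without_link = kernel_args_with_seq(gpu_args, None)
```

Kernels with no linked collective record, or whose record lacks `Seq`, keep their args unchanged in this model, so non-collective kernels would not gain a spurious field.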

Differential Revision: D96145504

meta-cla bot added the cla signed label Mar 11, 2026

meta-codesync bot commented Mar 11, 2026

@mdlogic has exported this pull request. If you are a Meta employee, you can view the originating Diff in D96145504.

…ytorch#1296)

Reviewed By: scotts

Differential Revision: D96145504

meta-codesync bot commented Mar 12, 2026

This pull request has been merged in 2b15a60.

mdlogic added a commit to mdlogic/pytorch that referenced this pull request Mar 12, 2026
Bump Kineto submodule from 0035505 to 2b15a60 to include
pytorch/kineto#1296 (seq_num propagation to GPU kernel events
in trace output). This is needed for pytorch#177148
(NCCL sequence number tracing).
mdlogic added a commit to mdlogic/pytorch that referenced this pull request Mar 13, 2026
Bump Kineto submodule from 0035505 to 2b15a60 to include
pytorch/kineto#1296 (seq_num propagation to GPU kernel events
in trace output). This is needed for pytorch#177148
(NCCL sequence number tracing).
pytorchmergebot pushed a commit to pytorch/pytorch that referenced this pull request Mar 13, 2026
Bump Kineto submodule from 0035505 to 2b15a60 to include pytorch/kineto#1296 (seq_num propagation to GPU kernel events in trace output).

This is needed so that #177148 (D96145503) can use the new Kineto APIs for NCCL sequence number tracing.

## Included kineto commits
- 2b15a60 Add seq_num propagation to GPU kernel events in Kineto trace output (#1296)
- 350b58f Refactor CuptiActivityProfiler.cpp to use CuptiCbidRegistry (#1297)
- 1f9ceb1 Use HAS_CUPTI_RANGE_PROFILER to avoid range profiler init (#1298)
- ebaac17 Add USDT log type to logger framework (#1285)
- e2e7e97 Revert D94566477: Add NCCL collective sequence number (seq_num) to Kineto profiler traces
- a7c5f4d Add NCCL collective sequence number (seq_num) to Kineto profiler traces (#1294)

Pull Request resolved: #177298
Approved by: https://github.com/sanrise, https://github.com/malfet