Add seq_num propagation to GPU kernel events in Kineto trace output by mdlogic · Pull Request #1296 · pytorch/kineto

mdlogic · 2026-03-11T17:28:05Z

Summary:
Propagate the NCCL collective sequence number (Seq) from CPU-side
record_param_comms events to their linked GPU kernel events in the
chrome trace JSON output.

CPU events already carry the Seq field via generic metadata serialization.
This change copies it to CONCURRENT_KERNEL events so that GPU-level
collective operations can also be correlated across ranks.
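
A minimal sketch of the propagation step described above, written in Python rather than the C++ of output_json.cpp and using plain dicts in place of Kineto's activity types. The function name `propagate_seq` and the dict layout are hypothetical; only the `Seq` field name and the "copy from linked CPU record to kernel args" behavior come from the summary.

```python
from typing import Optional


def propagate_seq(kernel_args: dict, linked_cpu_metadata: Optional[dict]) -> dict:
    """Return a copy of the kernel args with Seq copied from the linked CPU record.

    Hypothetical stand-in for the logic in output_json.cpp: both arguments are
    plain dicts here, not Kineto types.
    """
    out = dict(kernel_args)  # do not mutate the caller's args
    # Only kernels linked to a CPU collective record that carries Seq get it.
    if linked_cpu_metadata is not None and "Seq" in linked_cpu_metadata:
        out["Seq"] = linked_cpu_metadata["Seq"]
    return out
```

Kernels with no linked collective record, or whose record has no Seq, keep their args unchanged in this sketch.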

Changes:

- output_json.cpp: Add kSeqNum constant and read Seq from the linked
  CPU collective record's metadata, appending it to GPU kernel event args

Differential Revision: D96145504
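
For readers who consume the traces, cross-rank correlation might look like the sketch below. It rests on assumptions not stated in this PR: one parsed chrome trace dict per rank, events listed under `traceEvents`, GPU kernel events marked with `"cat": "kernel"`, and the sequence number stored under the `Seq` key of each event's `args`. The helper name `kernels_by_seq` is hypothetical.

```python
from collections import defaultdict


def kernels_by_seq(traces: dict) -> dict:
    """Group GPU kernel names by collective sequence number and rank.

    traces maps a rank to its parsed chrome trace (a dict with "traceEvents").
    The "kernel" category and the "Seq" args key are assumptions.
    """
    grouped = defaultdict(dict)
    for rank, trace in traces.items():
        for ev in trace.get("traceEvents", []):
            if ev.get("cat") != "kernel":
                continue  # skip CPU ops, memcpy events, and so on
            seq = ev.get("args", {}).get("Seq")
            if seq is None:
                continue  # not a collective kernel, or trace predates this change
            grouped[seq].setdefault(rank, []).append(ev["name"])
    return dict(grouped)
```

A sequence number that appears on some ranks but not on others in the result would point at a collective that did not launch everywhere.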

meta-codesync · 2026-03-11T17:28:18Z

@mdlogic has exported this pull request. If you are a Meta employee, you can view the originating Diff in D96145504.

…ytorch#1296)

Summary:
Propagate the NCCL collective sequence number (Seq) from CPU-side
record_param_comms events to their linked GPU kernel events in the
chrome trace JSON output.

CPU events already carry the Seq field via generic metadata serialization.
This change copies it to CONCURRENT_KERNEL events so that GPU-level
collective operations can also be correlated across ranks.

Changes:
- output_json.cpp: Add kSeqNum constant and read Seq from the linked
  CPU collective record's metadata, appending it to GPU kernel event args

Reviewed By: scotts

Differential Revision: D96145504

meta-codesync · 2026-03-12T08:21:09Z

This pull request has been merged in 2b15a60.

Bump Kineto submodule from 0035505 to 2b15a60 to include pytorch/kineto#1296 (seq_num propagation to GPU kernel events in trace output). This is needed for pytorch#177148 (NCCL sequence number tracing).

Bump Kineto submodule from 0035505 to 2b15a60 to include pytorch/kineto#1296 (seq_num propagation to GPU kernel events in trace output). This is needed so that #177148 (D96145503) can use the new Kineto APIs for NCCL sequence number tracing.

## Included kineto commits

- 2b15a60 Add seq_num propagation to GPU kernel events in Kineto trace output (#1296)
- 350b58f Refactor CuptiActivityProfiler.cpp to use CuptiCbidRegistry (#1297)
- 1f9ceb1 Use HAS_CUPTI_RANGE_PROFILER to avoid range profiler init (#1298)
- ebaac17 Add USDT log type to logger framework (#1285)
- e2e7e97 Revert D94566477: Add NCCL collective sequence number (seq_num) to Kineto profiler traces
- a7c5f4d Add NCCL collective sequence number (seq_num) to Kineto profiler traces (#1294)

Pull Request resolved: #177298
Approved by: https://github.com/sanrise, https://github.com/malfet

meta-cla bot added the cla signed label Mar 11, 2026

meta-codesync bot added fb-exported meta-exported labels Mar 11, 2026

mdlogic force-pushed the export-D96145504 branch from d3053bd to c8385b5 Compare March 11, 2026 22:25

meta-codesync bot closed this in 2b15a60 Mar 12, 2026

facebook-github-bot added the Merged label Mar 12, 2026

This was referenced Mar 12, 2026

Update Kineto Submodule pytorch/pytorch#177297

Closed

Update Kineto Submodule pytorch/pytorch#177298

Closed

mdlogic wants to merge 1 commit into pytorch:main from mdlogic:export-D96145504