Lazy load driver APIs using cudaGetDriverEntryPoint by wujingyue · Pull Request #4197 · NVIDIA/Fuser

wujingyue · 2025-04-04T22:51:39Z

This is apparently more robust than #4196 because it doesn't hard code the version.

Fixes #3907

cc @samnordmann

for #3907

github-actions · 2025-04-04T22:52:29Z

Review updated until commit 2c4c299

Description

Updated driver API loading to use cudaGetDriverEntryPoint
Applied changes to all driver APIs
Cleaned up and organized includes

Changes walkthrough 📝

Relevant files

Enhancement

csrc/driver_api.cpp: `Update driver API loading mechanism` (+34/-25)
Updated macro to use cudaGetDriverEntryPoint for lazy loading
Added static variables and std::once_flag for thread-safe initialization
Organized and cleaned up includes

csrc/driver_api.h: `Update driver API declarations` (+3/-3)
Updated macro to use PFN_ for function pointers
Corrected function name in CUDA 12+ macro
Organized and cleaned up includes
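The lazy-loading pattern summarized in the walkthrough above (a static function pointer guarded by `std::once_flag`) can be sketched roughly as follows. This is a minimal illustration, not the PR's actual macro: `fakeGetEntryPoint`, `fakeWriteValue`, `fakeWriteValueImpl`, `PFN_fakeWriteValue`, and `g_resolve_count` are hypothetical stand-ins so the example builds without CUDA. In the PR itself the lookup is done by `cudaGetDriverEntryPoint`.

```cpp
#include <cstring>
#include <mutex>

// Hypothetical stand-in for a driver function pointer type (PFN_ style).
using PFN_fakeWriteValue = int (*)(int* addr, int value);

int g_resolve_count = 0;  // counts how often the lookup actually runs

// Hypothetical "driver" implementation that the resolver hands out.
int fakeWriteValueImpl(int* addr, int value) {
  *addr = value;
  return 0;
}

// Hypothetical resolver: writes the entry point for `name` into *fn and
// returns 0 on success. It plays the role of the real entry-point query.
int fakeGetEntryPoint(const char* name, void** fn) {
  ++g_resolve_count;
  if (std::strcmp(name, "fakeWriteValue") != 0) {
    return 1;
  }
  *fn = reinterpret_cast<void*>(&fakeWriteValueImpl);
  return 0;
}

// Lazy wrapper: the first call resolves the pointer exactly once, even if
// several threads race; later calls reuse the cached pointer.
int fakeWriteValue(int* addr, int value) {
  static PFN_fakeWriteValue f = nullptr;
  static std::once_flag once;
  std::call_once(once, [] {
    void* raw = nullptr;
    if (fakeGetEntryPoint("fakeWriteValue", &raw) == 0) {
      f = reinterpret_cast<PFN_fakeWriteValue>(raw);
    }
  });
  return f != nullptr ? f(addr, value) : -1;
}
```

Calling `fakeWriteValue` repeatedly runs the lookup only on the first call, so nothing is resolved for functions that are never used.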

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 No relevant tests
⚡ Recommended focus areas for review

Possible Issue
The use of `cudaGetDriverEntryPoint` with `cudaEnableDefault` might not be the correct flag to use. The flag `cudaEnableDefault` is not a standard CUDA flag and might lead to undefined behavior.
`NVFUSER_CUDA_RT_SAFE_CALL(cudaGetDriverEntryPoint( \`
`    #funcName, reinterpret_cast<void**>(&f), cudaEnableDefault)); \`

Typedef Consistency
The macro `DECLARE_DRIVER_API_WRAPPER` now uses `PFN_##funcName` instead of `decltype(::funcName)`. Ensure that this change is consistent with the rest of the codebase and does not introduce any type mismatches.
`#define DECLARE_DRIVER_API_WRAPPER(funcName) extern PFN_##funcName funcName;`

API Versioning
The macro `ALL_DRIVER_API_WRAPPER` now includes `cuStreamWriteValue32` instead of `cuStreamWriteValue32_v2`. Verify that this change is compatible with the CUDA version requirements and does not introduce any versioning issues.
`fn(cuStreamWriteValue32); \`
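The typedef-consistency concern raised in the reviewer guide can be checked at compile time. The sketch below is only an illustration under assumed names: `fakeDriverCall`, `PFN_fakeDriverCall`, and `typedefMatchesDeclaration` are hypothetical and do not come from the CUDA headers or from nvFuser. It shows that a hand-written `PFN_`-style pointer typedef can be compared against the type deduced from a declaration.

```cpp
#include <type_traits>

// Hypothetical API standing in for a driver function declared in a header.
// Only the declaration is needed; the function is never called here.
int fakeDriverCall(int* addr, unsigned value);

// Hand-written pointer typedef, in the style of a PFN_ name.
typedef int (*PFN_fakeDriverCall)(int*, unsigned);

// The typedef and the type deduced from the declaration should agree.
// A mismatch becomes a compile-time error instead of a bad call at run time.
static_assert(
    std::is_same<PFN_fakeDriverCall, decltype(&fakeDriverCall)>::value,
    "PFN_ typedef must match the declared signature");

// Small helper that makes the same check observable at run time.
bool typedefMatchesDeclaration() {
  return std::is_same<PFN_fakeDriverCall, decltype(&fakeDriverCall)>::value;
}
```

Whether such a check is useful for the real driver API depends on how the headers define the versioned names, which this sketch does not model.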

csrc/driver_api.cpp

wujingyue · 2025-04-04T22:57:58Z

!test

samnordmann · 2025-04-07T09:25:50Z

> This is apparently more robust than #4196 because it doesn't hard code the version.
>
> Fixes #3907
>
> cc @samnordmann

Thank you for the fix!

Why apply it only to `cuStreamWriteValue32`? I'm afraid we'll run into the same issue with other API calls (e.g., `cuMemGetAddressRange`).
Do you suggest we add each API function here one by one whenever it causes an issue?
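One way to avoid adding functions one by one is an X-macro list, which matches the `ALL_DRIVER_API_WRAPPER(fn)` shape quoted in the reviewer guide. The following is a rough sketch under assumed names, not the nvFuser source: `ALL_FAKE_API_WRAPPER`, `DEFINE_FAKE_API_WRAPPER`, `FAKE_API_NAME`, `fakeApiNames`, and the `fake*` functions are all hypothetical, and the real lookup is replaced by a plain function pointer so that no CUDA is needed.

```cpp
#include <string>
#include <vector>

// Hypothetical API list written as an X-macro. Adding a name here is the
// only edit needed for it to pick up every per-function expansion below.
#define ALL_FAKE_API_WRAPPER(fn) \
  fn(fakeMemGetAddressRange)     \
  fn(fakeStreamWriteValue32)

// Hypothetical "driver" implementations so the sketch needs no CUDA.
int fakeMemGetAddressRange_impl(int x) { return x + 1; }
int fakeStreamWriteValue32_impl(int x) { return x * 2; }

// Expansion 1: one wrapper per listed function. The function-local static
// holds the cached pointer, standing in for the result of the real lookup.
#define DEFINE_FAKE_API_WRAPPER(funcName)    \
  int funcName(int x) {                      \
    static int (*f)(int) = &funcName##_impl; \
    return f(x);                             \
  }
ALL_FAKE_API_WRAPPER(DEFINE_FAKE_API_WRAPPER)

// Expansion 2: the same list yields the symbol names as strings, which is
// the form a by-name entry-point query would need.
#define FAKE_API_NAME(funcName) #funcName,
std::vector<std::string> fakeApiNames() {
  return {ALL_FAKE_API_WRAPPER(FAKE_API_NAME)};
}
```

With this layout, covering another function means adding one line to the list, and every wrapper gets the same loading logic.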

csrc/driver_api.cpp

wujingyue · 2025-04-09T06:34:04Z

!test

wujingyue · 2025-04-09T23:15:42Z

!test

wujingyue · 2025-04-10T02:09:04Z

!test

@nWEIdia

Fixes #154073
Reference: NVIDIA/Fuser#4197
See PR #154097
@nWEIdia is currently out of the office, so I’ve temporarily taken over his work.
Pull Request resolved: #156097
Approved by: https://github.com/ngimel, https://github.com/cyyever
Co-authored-by: Wei Wang <weiwan@nvidia.com>

@nWEIdia

Fixes #154073
Reference: NVIDIA/Fuser#4197
See PR #154097
@nWEIdia is currently out of the office, so I’ve temporarily taken over his work.
Pull Request resolved: #156097
Approved by: https://github.com/ngimel
Co-authored-by: Wei Wang <weiwan@nvidia.com>

@nWEIdia

Fixes #154073
Reference: NVIDIA/Fuser#4197
See PR #154097
@nWEIdia is currently out of the office, so I’ve temporarily taken over his work.
Pull Request resolved: #156097
Approved by: https://github.com/syed-ahmed, https://github.com/wujingyue, https://github.com/atalman
Co-authored-by: Wei Wang <weiwan@nvidia.com>

Reopen #156097
Fixes #154073
Reference: NVIDIA/Fuser#4197
See PR #156097 and #154097
Pull Request resolved: #158295
Approved by: https://github.com/Skylion007, https://github.com/ngimel, https://github.com/eqy, https://github.com/huydhn
Co-authored-by: Wei Wang <weiwan@nvidia.com>

Reopen #156097
Fixes #154073
Reference: NVIDIA/Fuser#4197
See PR #156097 and #154097
Pull Request resolved: #158295
Approved by: https://github.com/Skylion007, https://github.com/ngimel, https://github.com/eqy, https://github.com/huydhn
Co-authored-by: Wei Wang <weiwan@nvidia.com>
(cherry picked from commit a9f902a)

[CUDA] Use runtime driver API for cuStreamWriteValue32 (#158295)
Reopen #156097
Fixes #154073
Reference: NVIDIA/Fuser#4197
See PR #156097 and #154097
Pull Request resolved: #158295
Approved by: https://github.com/Skylion007, https://github.com/ngimel, https://github.com/eqy, https://github.com/huydhn
(cherry picked from commit a9f902a)
Co-authored-by: Frank Lin <eee4017@gmail.com>
Co-authored-by: Wei Wang <weiwan@nvidia.com>


samnordmann and others added 3 commits April 4, 2025 11:15

Repro

6800584

for #3907

Move repro

c9feb0b

Fix using cudaGetDriverEntryPoint

f70c8a2

Clean

e4cf98e

wujingyue commented Apr 4, 2025

csrc/driver_api.cpp (resolved)

wujingyue requested a review from zasdfgbnm April 4, 2025 22:57

zasdfgbnm reviewed Apr 8, 2025

csrc/driver_api.cpp (outdated, resolved)

Apply to all driver APIs

e368feb

samnordmann mentioned this pull request Apr 9, 2025

Enable lazy loading for cuStreamWriteValue32 #4196

Merged

Clean

84160ae

wujingyue changed the title ~~Enable lazy loading for cuStreamWriteValue32 using cudaGetDriverEntryPoint~~ Lazy load driver APIs using cudaGetDriverEntryPoint Apr 9, 2025

wujingyue requested a review from zasdfgbnm April 9, 2025 23:16

zasdfgbnm approved these changes Apr 9, 2025

Merge branch 'main' into wjy/writevalue

2c4c299

wujingyue merged commit af63372 into main Apr 10, 2025
53 checks passed

wujingyue deleted the wjy/writevalue branch April 10, 2025 04:21

wujingyue added a commit that referenced this pull request Apr 10, 2025

Remove CUDADriverAPIDynamicLoader

f956bed

It's no longer needed after #4197

wujingyue mentioned this pull request Apr 10, 2025

Remove CUDADriverAPIDynamicLoader #4226

Merged

wujingyue added a commit that referenced this pull request Apr 10, 2025

Remove CUDADriverAPIDynamicLoader (#4226)

802f042

It's no longer needed after #4197

nWEIdia mentioned this pull request May 22, 2025

[Draft][CUDA] Use runtime driver API for cuStreamWriteValue32 pytorch/pytorch#154097

Closed

eee4017 mentioned this pull request Jun 16, 2025

[CUDA] Use runtime driver API for cuStreamWriteValue32 pytorch/pytorch#156097

Closed

eee4017 mentioned this pull request Jul 14, 2025

[CUDA] Use runtime driver API for cuStreamWriteValue32 pytorch/pytorch#158295

Closed

pytorchbot mentioned this pull request Jul 17, 2025

[CUDA] Use runtime driver API for cuStreamWriteValue32 pytorch/pytorch#158585

Merged

wujingyue merged 7 commits into main from wjy/writevalue
