Commit 4115354
Use wait stream instead of synchronize() in cudagraph warmup (#117578)
Fix for #113895
There are three phases to cudagraph trees: warmup, recording, and execution. During recording and execution we execute under the current_stream. During warmup we execute under a side stream, the same one we use for cudagraph recording, so that memory can be reused.
After we execute on the side stream we need to sync the current stream to the side stream. Previously there was a `torch.cuda.synchronize` but not a `torch.cuda.current_stream().wait_stream(stream)`. This PR removes the global sync and adds a wait_stream. I have confirmed that it fixes #113895.
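A minimal sketch of the stream-to-stream pattern described above, not the code from this PR. The helper name `warmup_on_side_stream` and its signature are hypothetical. The CPU fallback is only there so the snippet runs on machines without CUDA.

```python
import torch


def warmup_on_side_stream(fn, *args, stream=None):
    """Run fn(*args) on a side stream, then order the current stream after it.

    Hypothetical helper for illustration only; it is not part of PyTorch.
    """
    if not torch.cuda.is_available():
        # No CUDA device: there are no streams to order, so just run fn.
        return fn(*args)

    if stream is None:
        stream = torch.cuda.Stream()

    current = torch.cuda.current_stream()
    # The side stream must first wait for work already queued on the
    # current stream (for example, the kernels that produced the inputs).
    stream.wait_stream(current)
    with torch.cuda.stream(stream):
        out = fn(*args)
    # Stream-to-stream sync: later work on the current stream is ordered
    # after the side-stream work, without blocking the host the way
    # torch.cuda.synchronize() does.
    current.wait_stream(stream)
    return out


device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.ones(4, device=device)
y = warmup_on_side_stream(lambda t: t * 2, x)
```

`wait_stream` only enqueues a dependency between the two streams, so the host thread keeps running. `torch.cuda.synchronize()` blocks the host until all queued work on the device has finished.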
It's not entirely clear to me why torch.cuda.synchronize would be insufficient - I would have thought the global sync would encompass the stream-to-stream sync. However, we do have a number of [instances](https://github.com/pytorch/pytorch/blob/main/torch/_inductor/compile_fx.py#L748-L749) throughout the code base where we do a stream->stream sync after the global sync, so clearly I am missing something here. In any case, the stream->stream sync performs better than a global synchronize.
Pull Request resolved: #117578
Approved by: https://github.com/zdevito
1 parent 560213d commit 4115354
2 files changed
Lines changed: 22 additions & 3 deletions
| File | Change | Original file lines | New file lines |
|---|---|---|---|
| First file (name not captured) | 20 lines added | after 710 | 711-730 |
| Second file (name not captured) | 2 lines added | after 520 | 521-522 |
| Second file (name not captured) | 3 lines removed | 613-615 | between 614 and 615 |