chore: [NCCL] reorg and better error messages#3338
Conversation
Signed-off-by: Naren Dasan <naren@narendasan.com>
| from torch_tensorrt.fx.types import TRTTensor | ||
|
|
||
| import tensorrt as trt | ||
|
|
There was a problem hiding this comment.
Why is this change here? Is this a pre-commit change?
|
|
||
| def tensorrt_fused_nccl_all_gather_op(args0, args1, args2): | ||
| # TODO: @apbose make these actual torch custom ops, should allow aot_export | ||
| def tensorrt_fused_nccl_all_gather_op( |
There was a problem hiding this comment.
Ok noted in the TODOs. Just to get clarity, what is meant by allowing aot_export? What is the difference in the present behavior?
There was a problem hiding this comment.
Sonthat you dont need to use the autograd workflow and could potentially support torch.export flows
| prerelease = "if-necessary-or-explicit" | ||
|
|
||
| index-strategy = "unsafe-best-match" # Needed for TRT-LLM | ||
|
|
There was a problem hiding this comment.
Why is this added? What is the significance?
There was a problem hiding this comment.
This is for uv to install Tensorrt llm
|
Wanted to get clarity on- |
| monitoring-tools = ["rich>=13.7.1"] | ||
| jupyter = ["rich[jupyter]>=13.7.1"] | ||
| distributed = ["tensorrt-llm>=0.16.0"] | ||
|
|
There was a problem hiding this comment.
How do we ensure this goes through when we have a torchTRT container having python version different than 3.10. Won't this cause an issue here?
There was a problem hiding this comment.
Its an extra so it wont effect install
There were a number of issues regarding type annotations in code added in the PR, so at some point it was pushed without verifying through precommit |
Signed-off-by: Naren Dasan <naren@narendasan.com>
Signed-off-by: Naren Dasan <naren@narendasan.com>
Signed-off-by: Naren Dasan <naren@narendasan.com>
Description
Precommit was clearly not run, assorted fixes and reorg. Some AIs for @apbose re: custom ops and some hardcoded special cases
Fixes # (issue)
Type of change
Please delete options that are not relevant and/or add your own.
Checklist: