
[DTensor] Handle NaN outputs in strategy validator#174539

Closed
wconstab wants to merge 5 commits into gh/wconstab/526/base from gh/wconstab/526/head

@wconstab
Contributor

@wconstab wconstab commented Feb 8, 2026

Stack from ghstack (oldest at bottom):

Use equal_nan=True in torch.allclose comparison so that NaN == NaN is
considered valid. Also skip samples with all-NaN ground truth (like the
existing all-zero skip) since NaN is invariant under all reduce ops.

This fixes false positive "incorrect" reports for igamma/igammac, whose
OpInfo samples include negative inputs that produce all-NaN outputs.

Authored with Claude.
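
The two changes described above can be sketched in plain Python. This is an illustrative stand-in, not the validator's code: `allclose` here mimics the documented behavior of `torch.allclose(..., equal_nan=True)` on flat lists of floats so the example runs without PyTorch, and `should_skip` is a hypothetical name for the all-zero / all-NaN skip rule.

```python
import math


def allclose(actual, expected, rtol=1e-5, atol=1e-8, equal_nan=False):
    """Plain-Python stand-in for an elementwise closeness check on flat lists."""
    if len(actual) != len(expected):
        return False
    for a, e in zip(actual, expected):
        if math.isnan(a) or math.isnan(e):
            # NaN never compares equal to anything, so a NaN pair is only
            # accepted when the caller opts in with equal_nan=True.
            if not (equal_nan and math.isnan(a) and math.isnan(e)):
                return False
            continue
        if abs(a - e) > atol + rtol * abs(e):
            return False
    return True


def should_skip(ground_truth):
    """Hypothetical skip rule: all-zero or all-NaN ground truth is uninformative.

    If every element is NaN, combining shards with any reduction still gives
    NaN, so every candidate placement would look correct.
    """
    all_zero = all(v == 0 for v in ground_truth)
    all_nan = all(math.isnan(v) for v in ground_truth)
    return all_zero or all_nan


nan = float("nan")
expected = [nan, 1.0, nan]
actual = [nan, 1.0 + 1e-9, nan]

default_result = allclose(actual, expected)                    # NaN pairs rejected
nan_aware_result = allclose(actual, expected, equal_nan=True)  # NaN pairs accepted
skip_all_nan = should_skip([nan, nan])
skip_mixed = should_skip([nan, 1.0])
```

With the default comparison, matching NaN positions are reported as a mismatch, which is the false positive the description mentions. The skip rule only fires when every element is NaN (or zero); a sample with mixed NaN and finite values is still compared.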

[ghstack-poisoned]
@pytorch-bot

pytorch-bot Bot commented Feb 8, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/174539

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 36 Pending

As of commit c48457f with merge base f365425:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot Bot added the release notes: distributed (dtensor) release notes category label Feb 8, 2026
wconstab added a commit that referenced this pull request Feb 8, 2026
ghstack-source-id: b4de8b9
Pull Request resolved: #174539
[ghstack-poisoned]
wconstab added a commit that referenced this pull request Feb 8, 2026
ghstack-source-id: d88dcb6
Pull Request resolved: #174539
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
@wconstab
Contributor Author

squashed

@wconstab wconstab closed this Feb 11, 2026
sandy-gags pushed a commit to sandy-gags/pytorch that referenced this pull request Mar 12, 2026
ghstack-source-id: 0a11643
Pull Request resolved: pytorch/pytorch#174539
@github-actions github-actions Bot deleted the gh/wconstab/526/head branch March 14, 2026 02:24