[Bugfix] Fix nll bug via decomposition handling by Lucaskabela · Pull Request #177189 · pytorch/pytorch

Lucaskabela · 2026-03-11T21:03:32Z

Summary

NLL loss backward was failing under compile when `self` is 1D: `target.unsqueeze(0)` produced
a 2D index for the subsequent 1D scatter. The fix mirrors the C++ kernel's behavior of using only
`target[0]`.
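
The shape mismatch described above can be reproduced in isolation. The following is a minimal sketch, not the actual patch: the tensor sizes, the variable names, and the way the index is rebuilt from `target[0]` are all assumptions made for illustration. It only shows that `torch.scatter` on a 1D tensor rejects a 2D index and accepts a 1D one.

```python
import torch

# Hypothetical shapes for illustration: a 1D gradient buffer over 5 classes
# and a target that holds one class index in a 1-element 1D tensor.
grad_input = torch.zeros(5)
target = torch.tensor([2])

# Unsqueezing a 1D target yields a 2D index, which a 1D scatter rejects.
bad_index = target.unsqueeze(0)  # shape (1, 1)
try:
    torch.scatter(grad_input, 0, bad_index, -1.0)
    scatter_failed = False
except (RuntimeError, IndexError):
    scatter_failed = True

# Taking only target[0] (a 0-dim tensor) and unsqueezing it keeps the index 1D.
good_index = target[0].unsqueeze(0)  # shape (1,)
result = torch.scatter(grad_input, 0, good_index, -1.0)

print(scatter_failed, result.tolist())
```

The sketch runs in eager mode purely to show the index-rank rule; the PR itself concerns the decomposition used under compile.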

Test

python -m pytest test/inductor/test_torchinductor.py -xvs -k "test_nll_loss_backward_1d_input"

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @jataylo

pytorch-bot · 2026-03-11T21:03:37Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/177189

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 2 Unrelated Failures

As of commit 9ae0c7d with merge base f249065:

NEW FAILURE - The following job has failed:

pull / linux-jammy-py3.14-clang15 / test (crossref, 1, 2, lf.linux.2xlarge) (gh)
Process completed with exit code 137.

FLAKY - The following job failed but was likely due to flakiness present on trunk:

pull / linux-jammy-py3.10-gcc11 / test (distributed, 1, 2, lf.linux.2xlarge) (gh) (similar failure)
test/distributed/tensor/test_dtensor_ops.py::TestLocalDTensorOpsCPU::test_dtensor_op_db_nanmean_cpu_float32

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

inductor / inductor-cpu-test / test (cpu_inductor_torchbench, 1, 2, linux.2xlarge.amx, unstable) (gh) (#174929)
detectron2_maskrcnn_r_50_fpn

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot · 2026-03-11T21:03:40Z

This PR needs a `release notes:` label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Lucaskabela · 2026-03-11T23:06:53Z

@claude are these test failures related to my change?

claude · 2026-03-11T23:07:19Z

Claude finished @Lucaskabela's task in 1m 13s

CI Failure Analysis

Review the PR changes to understand scope
Analyze each CI failure for relevance to the change
Provide summary

No, none of the CI failures are related to your change. Here's the breakdown:

1. `linux-jammy-py3.14-clang15 / test (crossref, 1, 2)` — Not related

This job was killed with exit code 137 (128 + 9, i.e. SIGKILL), which on a CI runner usually means the OOM killer terminated the process. The logs show it was running test_fx.py when the process was killed, a test file completely unrelated to NLL loss or decompositions. This points to a resource/memory issue on the CI runner, not a test failure caused by your code.

2. `test_dtensor_op_db_nanmean_cpu_float32` — Not related (flaky)

Already marked as flaky by the CI bot, with a similar failure on trunk. This is a distributed tensor test unrelated to NLL loss.

3. `detectron2_maskrcnn_r_50_fpn` — Not related (unstable)

Marked as unstable and linked to existing issue #174929. This is an inductor benchmark flake, not related to your change.

Your change only touches torch/_decomp/decompositions.py (6 lines) and adds a test in test/inductor/test_torchinductor.py. None of the failures involve these files or the NLL loss code path.
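
For readers unfamiliar with the exit code in item 1: shells conventionally report a process killed by signal N as exit status 128 + N. The helper below is a small sketch of that decoding; the function name `describe_exit_code` is hypothetical and not part of any CI tooling. Note that SIGKILL alone does not prove an out-of-memory kill; that is an inference from context.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Map a shell-style exit status to a short description."""
    if code > 128:
        # Statuses above 128 conventionally encode "128 + signal number".
        signum = code - 128
        try:
            return f"killed by {signal.Signals(signum).name}"
        except ValueError:
            return f"killed by unknown signal {signum}"
    return f"exited with status {code}"


print(describe_exit_code(137))
```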

Lucaskabela · 2026-03-12T16:16:01Z

@pytorchbot merge

pytorchmergebot · 2026-03-12T16:18:19Z

Merge failed

Reason: This PR needs a release notes: label
If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

pytorch-bot · 2026-03-12T16:18:25Z

This PR needs a `release notes:` label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Lucaskabela · 2026-03-12T16:21:04Z

@pytorchbot label "topic: not user facing"

Lucaskabela · 2026-03-12T16:21:12Z

@pytorchbot merge -i

pytorchmergebot · 2026-03-12T16:23:26Z

Merge started

Your change will be merged while ignoring the following 3 checks: pull / linux-jammy-py3.14-clang15 / test (crossref, 1, 2, lf.linux.2xlarge), pull / linux-jammy-py3.10-gcc11 / test (distributed, 1, 2, lf.linux.2xlarge), inductor / inductor-cpu-test / test (cpu_inductor_torchbench, 1, 2, linux.2xlarge.amx, unstable)

Fixes pytorch#89630

## Summary

NLL backward was failing under compile with 1d input when self is 1D, where target.unsqueeze(0) produced a 2D index for the subsequent 1D scatter. The fix mirrors the C++ kernel's behavior of only using target[0]

## Test

```bash
python -m pytest test/inductor/test_torchinductor.py -xvs -k "test_nll_loss_backward_1d_input"
```

Pull Request resolved: pytorch#177189
Approved by: https://github.com/frgossen

pytorch-bot bot added ciflow/inductor module: inductor labels Mar 11, 2026

Fix nll bug via decomposition handling

9ae0c7d

Lucaskabela force-pushed the lucaskabela/nll_bugfix branch from b968311 to 9ae0c7d March 11, 2026 21:40

Lucaskabela marked this pull request as ready for review March 11, 2026 21:40

Lucaskabela requested review from anijain2305, aorenste and frgossen March 11, 2026 21:40

Lucaskabela requested a review from azahed98 March 12, 2026 00:16

frgossen reviewed Mar 12, 2026

torch/_decomp/decompositions.py (resolved)

Lucaskabela requested a review from frgossen March 12, 2026 15:58

frgossen approved these changes Mar 12, 2026

torch/_decomp/decompositions.py (resolved)

pytorch-bot bot added the ciflow/trunk label Mar 12, 2026

pytorchmergebot added the merging label Mar 12, 2026

pytorchmergebot removed the merging label Mar 12, 2026

pytorch-bot bot added the topic: not user facing label Mar 12, 2026

pytorchmergebot added the merging label Mar 12, 2026

pytorchmergebot added the Merged label Mar 12, 2026

pytorchmergebot closed this in d9d7c0b Mar 12, 2026

pytorchmergebot removed the merging label Mar 12, 2026

github-actions bot deleted the lucaskabela/nll_bugfix branch April 12, 2026 02:25

Lucaskabela wants to merge 1 commit into main from lucaskabela/nll_bugfix
