
Amp gradient accumulation example#36601

Closed
mcarilli wants to merge 5 commits into pytorch:master from mcarilli:accumulation_example

Conversation

@mcarilli
Collaborator

@mcarilli mcarilli commented Apr 14, 2020

Several people have asked me about proper Amp usage with gradient accumulation. In particular, it's unclear to people that you should only call scaler.unscale_() (if desired) and scaler.update() in iterations where you actually plan to step. This PR adds a minimal accumulation example.

I built the docs locally and they look free of Sphinx errors, at least.
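The pattern described above can be sketched as follows. This is an illustrative sketch, not the PR's docs snippet: the tiny `Linear` model, `accum_steps`, and the random batches are placeholders, and the scaler is simply disabled when CUDA is unavailable so the code also runs on CPU.

```python
# Sketch of gradient accumulation with torch.cuda.amp: scaler.unscale_()
# (optional) and scaler.update() are called only in iterations that
# actually step. Model and data below are illustrative placeholders.
import torch

use_cuda = torch.cuda.is_available()
device = "cuda" if use_cuda else "cpu"

model = torch.nn.Linear(8, 1).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler(enabled=use_cuda)
accum_steps = 4

batches = [(torch.randn(16, 8, device=device),
            torch.randn(16, 1, device=device)) for _ in range(8)]

for i, (inputs, targets) in enumerate(batches):
    with torch.cuda.amp.autocast(enabled=use_cuda):
        output = model(inputs)
        # Divide so the accumulated gradient matches one large-batch step.
        loss = torch.nn.functional.mse_loss(output, targets) / accum_steps

    scaler.scale(loss).backward()  # gradients accumulate in scaled form

    if (i + 1) % accum_steps == 0:
        # Only in iterations where we actually plan to step:
        scaler.unscale_(optimizer)  # optional, e.g. to allow grad clipping
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        scaler.step(optimizer)
        scaler.update()
        optimizer.zero_grad()
```

The key point is that `backward()` runs every iteration while the unscale/step/update/zero sequence runs once per accumulation window; calling `update()` every iteration would let the scale grow between steps and invalidate the scaled gradients already accumulated.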

@mcarilli mcarilli requested a review from ngimel April 14, 2020 19:41
@dr-ci

dr-ci Bot commented Apr 14, 2020

💊 Build failures summary and remediations

As of commit 1af3707 (more details on the Dr. CI page):


  • 1/2 failures possibly* introduced in this PR

    • 1/2 non-CircleCI failure(s)
  • 1/2 broken upstream at merge base c9a1fc2 on Apr 13 from 4:26pm to 7:02pm PDT (10 commits; c9a1fc2 - 455d4aa)

    Please rebase on the viable/strict branch (expand for instructions)

    Since your merge base is older than viable/strict, run these commands:

    git fetch https://github.com/pytorch/pytorch viable/strict
    git rebase FETCH_HEAD
    

    Check out the recency history of this "viable master" tracking branch.


🚧 1 upstream failure:

These were probably caused by upstream breakages:




Contributor

@facebook-github-bot facebook-github-bot left a comment


@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@ngimel merged this pull request in e6bc34f.

laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026
Summary:
Several people have asked me about proper Amp usage with gradient accumulation.  In particular, it's [unclear to people](NVIDIA/apex#439 (comment)) that you should only call `scaler.unscale_()` (if desired) and `scaler.update()` in iterations where you actually plan to step.  This PR adds a minimal accumulation example.

I built the docs locally and it looks free from sphinx errors, at least.
Pull Request resolved: pytorch#36601

Differential Revision: D21082295

Pulled By: ngimel

fbshipit-source-id: b2faa6c02b9f7e1972618a0f1d5360a03f0450ac


6 participants