[caffe2] SWA operator #34394

Closed
pcpLiu wants to merge 1 commit into pytorch:master from pcpLiu:export-D20165239

@pcpLiu pcpLiu commented Mar 6, 2020

Summary:

SWA operator

This diff adds a new operator, SWA, which will be used in AdaGradOptimizer.

The algorithm looks like:

{F230902995}
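
The figure referenced above is not reproduced on this page. As a rough illustration only, and assuming SWA here stands for stochastic weight averaging, the core idea of keeping a running mean of the weights can be sketched as below. The function name `swa_update` and its parameters are hypothetical and are not taken from this diff; the operator itself may differ in its schedule and inputs.

```python
def swa_update(swa_weights, weights, num_averaged):
    """Fold the current weights into a running mean of earlier weights.

    Hypothetical helper for illustration; not the operator added in this diff.
    num_averaged is how many snapshots the running mean already contains.
    """
    return [
        (s * num_averaged + w) / (num_averaged + 1)
        for s, w in zip(swa_weights, weights)
    ]


# Hypothetical usage: average three weight snapshots taken during training.
snapshots = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
swa = [0.0, 0.0]
for n, w in enumerate(snapshots):
    swa = swa_update(swa, w, n)
```

After the loop, `swa` holds the element-wise mean of the snapshots. Updating incrementally avoids storing every snapshot.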

Background

In our tests, this operator substantially improved our models' reproducibility (KT: 0.86 -> 0.92).

We hope to land this operator and, in the future, enable it by default in our models.

Test Plan:
Local build aml.dper3 pkg(4db9060d598a49e28ca983e6aaac4a3d) and Dper3 bento kernel.

  • Local test: n213184
  • Testing flow:
    • V8: f173326189
    • V4: f173097225

FBPKG:

  • v8: aml.dper3:4db9060d598a49e28ca983e6aaac4a3d
  • v4: aml.dper3:abaf67ef14fd41d68147ee4060485f61

Differential Revision: D20165239

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D20165239

@pcpLiu pcpLiu force-pushed the export-D20165239 branch from 337ebbb to 80926ce Compare March 6, 2020 20:56

@dr-ci

dr-ci Bot commented Mar 6, 2020

💊 CircleCI build failures summary and remediations

As of commit c928ff5 (more details on the Dr. CI page):


None of the build failures appear to be your fault 💚


  • 4/4 broken upstream at merge base eef17ed since Mar 19

    Please rebase on the viable/strict branch.

    If your commit is newer than viable/strict, you can try basing on an older, stable commit:

    git fetch https://github.com/pytorch/pytorch viable/strict
    git rebase --onto FETCH_HEAD $(git merge-base origin/master HEAD)
    

    If your commit is older than viable/strict:

    git fetch https://github.com/pytorch/pytorch viable/strict
    git rebase FETCH_HEAD
    

    Check out the recency history of this "viable master" tracking branch.


🚧 4 upstream failures:

These were probably caused by upstream breakages:


This comment was automatically generated by Dr. CI.

Please report bugs/suggestions on the GitHub issue tracker.

This comment has been revised 87 times.

@pcpLiu pcpLiu force-pushed the export-D20165239 branch from 80926ce to f853c4f Compare March 7, 2020 15:53
@pcpLiu pcpLiu force-pushed the export-D20165239 branch from f853c4f to bd7940a Compare March 9, 2020 16:50
@pcpLiu pcpLiu force-pushed the export-D20165239 branch from bd7940a to f724dc7 Compare March 11, 2020 18:09
@pcpLiu pcpLiu force-pushed the export-D20165239 branch from f724dc7 to 694033f Compare March 11, 2020 18:10
@pcpLiu pcpLiu force-pushed the export-D20165239 branch from 694033f to cb31d22 Compare March 12, 2020 16:12
@pcpLiu pcpLiu force-pushed the export-D20165239 branch from cb31d22 to 2f779ec Compare March 13, 2020 18:10
@pcpLiu pcpLiu force-pushed the export-D20165239 branch from 2f779ec to 7e73b9c Compare March 16, 2020 17:37
@pcpLiu pcpLiu force-pushed the export-D20165239 branch from 7e73b9c to 199b85f Compare March 16, 2020 17:57
@pcpLiu pcpLiu force-pushed the export-D20165239 branch from 199b85f to b28139a Compare March 16, 2020 18:53
@pcpLiu pcpLiu force-pushed the export-D20165239 branch from b28139a to d301003 Compare March 16, 2020 20:04
@pcpLiu pcpLiu force-pushed the export-D20165239 branch from d301003 to afd5916 Compare March 16, 2020 23:16
@pcpLiu pcpLiu force-pushed the export-D20165239 branch from afd5916 to f659430 Compare March 17, 2020 16:08

Summary:
Pull Request resolved: pytorch#34394

# SWA operator
This diff adds a new operator, `SWA`, which will be used in `AdaGradOptimizer`.

The algorithm looks like:

{F230902995}

# Background

In our tests, this operator substantially improved our models' reproducibility (KT: 0.86 -> 0.92).

We hope to land this operator and, in the future, enable it by default in our models.

Test Plan:
Local build `aml.dper3:30f068668cfb408fbb40141fb17129f2` and bento kernel.
- Local test: n215857
- f174600345

Differential Revision: D20165239

fbshipit-source-id: 554af4d51fb682ee9d1e8732943307cf0610ff1c
@pcpLiu pcpLiu force-pushed the export-D20165239 branch from f659430 to c928ff5 Compare March 19, 2020 03:01

@facebook-github-bot
Contributor

This pull request has been merged in e327255.

laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026
Reviewed By: chocjy

Differential Revision: D20165239

fbshipit-source-id: c03cdd048cb10b091e5f06323f4c0f3999f95d8a