[caffe2] SWA operator #34394
This pull request was exported from Phabricator. Differential Revision: D20165239 |
💊 CircleCI build failures summary and remediations
As of commit c928ff5: ✅ None of the build failures appear to be your fault 💚
🚧 4 upstream failures: these were probably caused by upstream breakages.
This comment was automatically generated by Dr. CI. This comment has been revised 87 times. |
bd7940a to f724dc7
f724dc7 to 694033f
694033f to cb31d22
cb31d22 to 2f779ec
2f779ec to 7e73b9c
7e73b9c to 199b85f
199b85f to b28139a
b28139a to d301003
d301003 to afd5916
afd5916 to f659430
Summary: Pull Request resolved: pytorch#34394
# SWA operator
In this diff, we added a new operator `SWA`, which will be used in `AdaGradOptimizer`. The algorithm looks like:
{F230902995}
# Background
In our tests, we found that this operator improved our models' reproducibility considerably (KT: 0.86 -> 0.92).
We therefore hope to land this operator and, in the future, enable it by default in our models.
Test Plan: Local build `aml.dper3:30f068668cfb408fbb40141fb17129f2` and bento kernel.
- Local test: n215857
- f174600345
Differential Revision: D20165239
fbshipit-source-id: 554af4d51fb682ee9d1e8732943307cf0610ff1c
f659430 to c928ff5
This pull request has been merged in e327255. |
Summary: Pull Request resolved: pytorch#34394
# SWA operator
In this diff, we added a new operator `SWA`, which will be used in `AdaGradOptimizer`. The algorithm looks like:
{F230902995}
# Background
In our tests, we found that this operator improved our models' reproducibility considerably (KT: 0.86 -> 0.92).
We therefore hope to land this operator and, in the future, enable it by default in our models.
Test Plan: Local build `aml.dper3:30f068668cfb408fbb40141fb17129f2` and bento kernel.
- Local test: n215857
- f174600345
Reviewed By: chocjy
Differential Revision: D20165239
fbshipit-source-id: c03cdd048cb10b091e5f06323f4c0f3999f95d8a
Summary:
SWA operator
In this diff, we added a new operator `SWA`, which will be used in `AdaGradOptimizer`. The algorithm looks like:
{F230902995}
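The figure referenced above is not reproduced in this export, so the operator's exact update rule cannot be read from this page. As a rough illustration only, and assuming that SWA here stands for stochastic weight averaging, the generic idea of keeping a running average of weight snapshots can be sketched as below. The function name `swa_update` and its arguments are hypothetical and are not the Caffe2 operator's signature; the sample weights are made up.

```python
from typing import List


def swa_update(avg: List[float], weights: List[float], num_averaged: int) -> List[float]:
    """Fold one weight snapshot into a running equal-weight average.

    Hypothetical helper for illustration; not the operator added in this PR.
    `num_averaged` is the number of snapshots already folded into `avg`.
    """
    if len(avg) != len(weights):
        raise ValueError("avg and weights must have the same length")
    n = num_averaged
    # Incremental mean: avg_new = avg + (w - avg) / (n + 1)
    return [a + (w - a) / (n + 1) for a, w in zip(avg, weights)]


# Usage: average three made-up weight snapshots taken during training.
snapshots = [[1.0, 4.0], [2.0, 6.0], [3.0, 8.0]]
avg = [0.0, 0.0]
for i, w in enumerate(snapshots):
    avg = swa_update(avg, w, i)
```

After the loop, `avg` holds the element-wise mean of the three snapshots. The operator in this PR may differ, for example in when averaging starts or in how the averaged weights are used.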
Background
In our tests, we found that this operator improved our models' reproducibility considerably (KT: 0.86 -> 0.92).
We therefore hope to land this operator and, in the future, enable it by default in our models.
Test Plan:
Local build aml.dper3 pkg (4db9060d598a49e28ca983e6aaac4a3d) and Dper3 bento kernel.
FBPKG:
aml.dper3:4db9060d598a49e28ca983e6aaac4a3d
aml.dper3:abaf67ef14fd41d68147ee4060485f61
Differential Revision: D20165239