[pytorch][mobile] fixed AutoGradMode/AutoNonVariableTypeMode uses for mobile callsites#34958

Closed
ljk53 wants to merge 1 commit into gh/ljk53/125/base from gh/ljk53/125/head

Conversation

ljk53 (Contributor) commented Mar 18, 2020

Stack from ghstack:

There are three guards related to mobile build:

  • AutoGradMode
  • AutoNonVariableTypeMode
  • GraphOptimizerEnabledGuard

Today we need to set some of these guards before calling libtorch APIs because we customize the mobile build to support inference only (for both OSS and most FB use cases) to optimize binary size.

Several changes have been made since the 1.3 release, so there are already inconsistent uses of these guards in the codebase. I did a sweep of all mobile-related model loading & forward() call sites to unify the use of these guards:

Full JIT: still set all three guards (see the sketch after this list). More specifically:

  • OSS: fixed a bug where the guard was not set correctly at model load time in Android JNI.
  • FB: not covered by this diff (we use the mobile interpreter for most internal builds).
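
For reference, a minimal C++ sketch of the full-JIT convention. The model path and input shape are placeholders; the guard class names follow the libtorch headers of this era (torch::autograd::AutoGradMode, at::AutoNonVariableTypeMode, torch::jit::GraphOptimizerEnabledGuard):

```cpp
#include <torch/script.h>

int main() {
  // Inference-only mobile build: turn off autograd bookkeeping.
  torch::autograd::AutoGradMode no_autograd(false);
  // Dispatch straight to non-variable kernels (variable kernels are
  // not registered in inference-only builds).
  at::AutoNonVariableTypeMode non_var_type_mode(true);
  // The full-JIT graph optimizer is too heavy for mobile; disable it.
  torch::jit::GraphOptimizerEnabledGuard no_optimizer(false);

  auto module = torch::jit::load("model.pt");  // placeholder path
  auto output = module.forward({torch::ones({1, 3, 224, 224})});
  return 0;
}
```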

Lite JIT (mobile interpreter): only needs the AutoNonVariableTypeMode guard. AutoGradMode doesn't seem to be relevant (so it was removed from a few places), and GraphOptimizerEnabledGuard is definitely not relevant (only the full JIT has a graph optimizer). More specifically (see the sketch after this list):

  • OSS: at this point we are not committed to supporting Lite JIT. For Android it shares the same code with the FB JNI callsites.
  • FB:
      • JNI callsites: use the unified LiteJITCallGuard.
      • iOS/C++: manually set AutoNonVariableTypeMode at _load_for_mobile() & forward() callsites.
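
A corresponding sketch for the lite-interpreter path, where only AutoNonVariableTypeMode is set around _load_for_mobile() and forward() (again, path and input shape are placeholders):

```cpp
#include <torch/csrc/jit/mobile/import.h>  // torch::jit::_load_for_mobile
#include <ATen/ATen.h>

int main() {
  // The one guard the mobile interpreter still needs: skip the variable
  // kernels that inference-only builds do not register.
  at::AutoNonVariableTypeMode non_var_type_mode(true);

  torch::jit::mobile::Module module =
      torch::jit::_load_for_mobile("model.pt");  // placeholder path
  auto output = module.forward({at::ones({1, 3, 224, 224})});
  return 0;
}
```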

Ideally we should avoid having to set AutoNonVariableTypeMode for the mobile interpreter at all. It's currently needed for dynamic dispatch + inference-only mobile builds (where variable kernels are not registered): without the guard, dispatch falls through to `variable_fallback_kernel` and crashes (PR #34038). The proper fix will take some time, so we use this workaround to unblock the selective BUCK build, which depends on dynamic dispatch.

PS: the current status (having to set AutoNonVariableTypeMode) should not block running an FL model on the mobile interpreter. If all the necessary variable kernels are registered, a caller can invoke _load_for_mobile()/forward() against the FL model without setting the AutoNonVariableTypeMode guard. It's still inconvenient for Java callsites, though, since the guard is set unconditionally inside the JNI methods.

Differential Revision: [D20498017](https://our.internmc.facebook.com/intern/diff/D20498017/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook-specific changes or comments; please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D20498017/)!

ljk53 added a commit that referenced this pull request Mar 18, 2020 (ghstack-source-id: 100385440; Pull Request resolved: #34958)
ljk53 requested review from dreiss, ezyang and xcheng16 and removed the request for dreiss on March 18, 2020 16:26
dr-ci bot commented Mar 18, 2020

💊 CircleCI build failures summary and remediations

As of commit 2a52b62 (more details on the Dr. CI page):


None of the build failures appear to be your fault 💚


  • 1/1 broken upstream at merge base d927d58 on Mar 18 from 2:35pm to 7:57pm (10 commits; d927d58 - c747f09)

Please rebase on the viable/strict branch:

    If your commit is newer than viable/strict, you can try basing on an older, stable commit:

    git fetch https://github.com/pytorch/pytorch viable/strict
    git rebase --onto FETCH_HEAD $(git merge-base origin/master HEAD)
    

    If your commit is older than viable/strict:

    git fetch https://github.com/pytorch/pytorch viable/strict
    git rebase FETCH_HEAD
    

    Check out the recency history of this "viable master" tracking branch.


🚧 1 upstream failure, probably caused by upstream breakages (see the Dr. CI page for details).
This comment was automatically generated by Dr. CI.

ljk53 (Contributor, Author) commented Mar 19, 2020

This has been merged in 6e47e7b

ljk53 closed this Mar 19, 2020
facebook-github-bot deleted the gh/ljk53/125/head branch on April 18, 2020 14:17