[dynamo] disable dynamo recursively on compiled code if fullgraph=True#172295
[dynamo] disable dynamo recursively on compiled code if fullgraph=True#172295williamwen42 wants to merge 3 commits intogh/williamwen42/379/basefrom
Conversation
[ghstack-poisoned]
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/172295
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit ad2a003 with merge base 8f775e9 ( This comment was automatically generated by Dr. CI and updates every 15 minutes. |
…llgraph=True" cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx kadeng chauhang amjames Lucaskabela jataylo [ghstack-poisoned]
…llgraph=True" cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx kadeng chauhang amjames Lucaskabela jataylo [ghstack-poisoned]
|
I am wondering if we can fix this more simply by simply bypassing the eval frame handler entirely for fullgraph. We need the eval frame handler to conveniently reenter Dynamo when graph break occurs, but this definitionally cannot happen for fullgraph. So an eval frame handler isn't necessary; we can jump straight to the cache lookup code. Perhaps... fullgraph should also imply you never go to eager? (Not sure what you're going to do when you hit the recompile limit; it's not clear what you do in this PR either.) |
|
Recompile limit in fullgraph=True would result in an error being raised in convert_frame.py (no frame actions set). I needed to do this per-frame action thing for the case that we wrap the same code object with both a torch.compile(fullgraph=False) and torch.compile(fullgraph=True) decorator at different points in the same scope. Testing files that wrap over external_utils.wrap_inline are prone to doing this. I could possibly simplify this handling by introducing a new fullgraph=True eval frame handler that will not set a custom eval frame handler recursively - we could fallback to eager or even error out if we're not expecting to handle additional frames. This simplifies logic since fullgraph setting is determined in eval_frame.py. In the current implementation, we read the fullgraph value in convert_frame.py and we need to "tell" eval_frame_cpp.cpp what the fullgraph setting was. |
Blah, this is annoying. |
|
Don't block on me, I'm going to rely on Animesh's judgment on this |
…llgraph=True using eval frame overrides" This is attempt #2 at #172295. Instead of using frame/CacheEntry recursive actions, we override the recursive eval_frame callback in eval_frame_cpp.cpp if we run custom code. The override is set in eval_frame.py only if fullgraph=True. Right now, the possible overrides are: - skip tracing frames - error if tracing is successful. We can't just error on entry since convert_frame might end up skipping the frame, which should be allowed. It might be better to simply patch the entrypoint of tracing e.g. InstructionTranslatorBase, with an error though. Currently, the default behavior is the latter, since we should loudly fail if torch.compile gets unintentionally re-invoked when fullgraph=True. cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx kadeng chauhang amjames Lucaskabela jataylo mlazos [ghstack-poisoned]
…llgraph=True using eval frame overrides" This is attempt #2 at #172295. Instead of using frame/CacheEntry recursive actions, we override the recursive eval_frame callback in eval_frame_cpp.cpp if we run custom code. The override is set in eval_frame.py only if fullgraph=True. Right now, the possible overrides are: - skip tracing frames - error if tracing is successful. We can't just error on entry since convert_frame might end up skipping the frame, which should be allowed. It might be better to simply patch the entrypoint of tracing e.g. InstructionTranslatorBase, with an error though. Currently, the default behavior is the latter, since we should loudly fail if torch.compile gets unintentionally re-invoked when fullgraph=True. cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx kadeng chauhang amjames Lucaskabela jataylo mlazos [ghstack-poisoned]
…llgraph=True using eval frame overrides" This is attempt #2 at #172295. Instead of using frame/CacheEntry recursive actions, we override the recursive eval_frame callback in eval_frame_cpp.cpp if we run custom code. The override is set in eval_frame.py only if fullgraph=True. Right now, the possible overrides are: - skip tracing frames - error if tracing is successful. We can't just error on entry since convert_frame might end up skipping the frame, which should be allowed. It might be better to simply patch the entrypoint of tracing e.g. InstructionTranslatorBase, with an error though. Currently, the default behavior is the latter, since we should loudly fail if torch.compile gets unintentionally re-invoked when fullgraph=True. cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx kadeng chauhang amjames Lucaskabela jataylo mlazos [ghstack-poisoned]
|
We are preferring the 2nd approach |
…piled code if fullgraph=True using eval frame overrides" This is attempt #2 at #172295. Instead of using frame/CacheEntry recursive actions, we override the recursive eval_frame callback in eval_frame_cpp.cpp if we run custom code. The override is set in eval_frame.py only if fullgraph=True. Right now, the possible overrides are: - skip tracing frames - error if tracing is successful. We can't just error on entry since convert_frame might end up skipping the frame, which should be allowed. It might be better to simply patch the entrypoint of tracing e.g. InstructionTranslatorBase, with an error though. Currently, the default behavior is the latter, since we should loudly fail if torch.compile gets unintentionally re-invoked when fullgraph=True. cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx kadeng chauhang amjames Lucaskabela jataylo mlazos [ghstack-poisoned]
…llgraph=True using eval frame overrides" This is attempt #2 at #172295. Instead of using frame/CacheEntry recursive actions, we override the recursive eval_frame callback in eval_frame_cpp.cpp if we run custom code. The override is set in eval_frame.py only if fullgraph=True. Right now, the possible overrides are: - skip tracing frames - error if tracing is successful. We can't just error on entry since convert_frame might end up skipping the frame, which should be allowed. It might be better to simply patch the entrypoint of tracing e.g. InstructionTranslatorBase, with an error though. Currently, the default behavior is the latter, since we should loudly fail if torch.compile gets unintentionally re-invoked when fullgraph=True. cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx kadeng chauhang amjames Lucaskabela jataylo mlazos [ghstack-poisoned]
…e using eval frame overrides (#173080) This is attempt #2 at #172295. Instead of using frame/CacheEntry recursive actions, we override the recursive eval_frame callback in eval_frame_cpp.cpp if we run custom code. The override is set in eval_frame.py only if fullgraph=True. Right now, the possible overrides are: - skip tracing frames - error if tracing is successful. We can't just error on entry since convert_frame might end up skipping the frame, which should be allowed. It might be better to simply patch the entrypoint of tracing e.g. InstructionTranslatorBase, with an error though. Currently, the default behavior is the latter, since we should loudly fail if torch.compile gets unintentionally re-invoked when fullgraph=True. Pull Request resolved: #173080 Approved by: https://github.com/anijain2305
…e using eval frame overrides (pytorch#173080) This is attempt pytorch#2 at pytorch#172295. Instead of using frame/CacheEntry recursive actions, we override the recursive eval_frame callback in eval_frame_cpp.cpp if we run custom code. The override is set in eval_frame.py only if fullgraph=True. Right now, the possible overrides are: - skip tracing frames - error if tracing is successful. We can't just error on entry since convert_frame might end up skipping the frame, which should be allowed. It might be better to simply patch the entrypoint of tracing e.g. InstructionTranslatorBase, with an error though. Currently, the default behavior is the latter, since we should loudly fail if torch.compile gets unintentionally re-invoked when fullgraph=True. Pull Request resolved: pytorch#173080 Approved by: https://github.com/anijain2305
…e using eval frame overrides (pytorch#173080) This is attempt pytorch#2 at pytorch#172295. Instead of using frame/CacheEntry recursive actions, we override the recursive eval_frame callback in eval_frame_cpp.cpp if we run custom code. The override is set in eval_frame.py only if fullgraph=True. Right now, the possible overrides are: - skip tracing frames - error if tracing is successful. We can't just error on entry since convert_frame might end up skipping the frame, which should be allowed. It might be better to simply patch the entrypoint of tracing e.g. InstructionTranslatorBase, with an error though. Currently, the default behavior is the latter, since we should loudly fail if torch.compile gets unintentionally re-invoked when fullgraph=True. Pull Request resolved: pytorch#173080 Approved by: https://github.com/anijain2305
Stack from ghstack (oldest at bottom):
I tried a few 1-line changes but none were sufficient to get the behavior we wanted. The issue is that we want to control the recursive eval_frame behavior depending on current top-level compilation call (i.e. we need different recursive_action for each CacheEntry). This was not supported, so we implement per-CacheEntry recursive_action here.
There are some rules we introduced for dealing with ExtraState (code-level) vs. CacheEntry (frame-level) recursive_action:
This is potentially BC-breaking since
torch.compilewill not be invoked again on code compiled withfullgraph=True, but it is unlikely for anyone to really be depending on this behavior (see the added test, which does some really weird behavior in order to invoke a nestedtorch.compilewhenfullgraph=True).cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @kadeng @chauhang @amjames @Lucaskabela @jataylo @mlazos