Add torch._dynamo.is_fullgraph_compiling to allow different codepath depending on fullgraph tracing #120400

fxmarty wants to merge 7 commits into pytorch:main

Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/120400
Note: Links to docs will display an error until the docs builds have been completed.
❌ 10 New Failures — as of commit 86c5756 with merge base 8a32a07. The following jobs have failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Please seek CI approval before scheduling CIFlow labels
    # See: https://github.com/pytorch/pytorch/issues/110765
    tx.mark_inconsistent_side_effects()
@jon-chuang I am not sure whether this is necessary here?
It's necessary for capturing side-effect-only code. If your code has a torch operation on a tensor input, it makes no difference.
test/dynamo/test_misc.py
Outdated
    opt_f = torch.compile(f, fullgraph=True)

    self.assertEqual(f(), torch.zeros(2, 2))
    self.assertEqual(opt_f(), torch.ones(2, 2))

    opt_g = torch.compile(g, fullgraph=False)

    self.assertEqual(g(), torch.zeros(2, 2))
    self.assertEqual(opt_g(), torch.zeros(2, 2))
One issue here is that when calling torch.compile twice on the same function:

    opt_f = torch.compile(f, fullgraph=False)
    f()
    opt_f = torch.compile(f, fullgraph=True)
    f()

somehow torch.compile does not initialize a new InstructionTranslator at the second call, so the one_graph attribute is wrong, see https://app.slack.com/client/T2077MDKQ/C033H6DJSJU

Is it legal to call torch.compile several times on the same python object/function?
I think if your original graph was a single graph, you don't recompile for the second invocation.
You may or may not need to change this behaviour; it's an easy optimization that held previously.
If this PR proposes to capture different graphs based on the presence of this function call, then you can make the behaviour stricter — i.e. make a change to the fullgraph flag always trigger recompilation — but only when your new function call is present; see guards for the fullgraph flag.
If this PR proposes to capture different graphs based on presence of this function call
Yes, this PR proposes to capture different graphs based on torch._dynamo.is_fullgraph_compiling() (i.e. based on the fullgraph argument). This works well when torch.compile is not called multiple times on the same function/object, but it currently breaks when torch.compile is called successively on the same function/object, because the InstructionTranslator is not re-initialized.
then you can make the behaviour stricter - i.e. cause fullgraph flag changing to always recompile - only when your new function call is present; see guards for fullgraph flag.
Could you point me to the file responsible for that? I could not find nopython, one_graph, or fullgraph references in guards.py or mutation_guard.py.
It may have been removed, unfortunately; a bunch of code got reverted due to some Meta-internal failures (the investigation hasn't concluded, afaik).
You may have to introduce a new guard that recompiles when the fullgraph flag changes, if this behaviour is strictly necessary.
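To make the suggestion concrete, here is a minimal pure-Python sketch of what keying compiled code on the fullgraph flag could look like. This is purely illustrative: `GuardedCache`, `fake_compile`, and the dict-based cache are hypothetical names invented for this sketch, not actual Dynamo internals; a real implementation would hook into Dynamo's guard machinery.

```python
# Hypothetical sketch: key the compiled-code cache on the fullgraph/nopython
# flag, so flipping the flag fails the "guard" and forces a recompile.
# GuardedCache is an illustrative name, not a real Dynamo class.
class GuardedCache:
    def __init__(self):
        self._cache = {}  # (fn, nopython) -> compiled artifact
        self.recompiles = 0

    def lookup(self, fn, nopython, compile_fn):
        key = (fn, nopython)
        if key not in self._cache:  # guard miss: first call or flag changed
            self.recompiles += 1
            self._cache[key] = compile_fn(fn, nopython)
        return self._cache[key]


def fake_compile(fn, nopython):
    # Stand-in for actual tracing/compilation.
    return lambda *args: (fn(*args), nopython)


def f():
    return 42


cache = GuardedCache()
a = cache.lookup(f, True, fake_compile)   # compiled with fullgraph semantics
b = cache.lookup(f, False, fake_compile)  # flag changed -> recompile
c = cache.lookup(f, True, fake_compile)   # cache hit, no recompile
print(a(), b(), c(), cache.recompiles)    # (42, True) (42, False) (42, True) 2
```

The point of the sketch is only that recompilation happens on a flag change but not on a repeated call with the same flag, which is the stricter behaviour the reviewer describes.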
@janeyx99 This looks like it would be useful to automatically set the
well we want capturable to be enabled for non fullgraph tracing too. so this shouldn't change how that logic is currently set up.
    or current_backend == cached_backends.get(backend_obj_id, None)
    )

    def check_nopython(ref_nopython: bool):
The changes in eval_frame.py are probably far from ideal, but they work.
check_nopython actually seems to be called twice at each guard check; I'm not sure why:
--------------- f fullgraph=True
--------------- fullgraph=False
guarded_backend_cache.nopython False
ref_nopython True
guarded_backend_cache.nopython False
ref_nopython True
--------------- f fullgraph=True
call forward
guarded_backend_cache.nopython True
ref_nopython False
guarded_backend_cache.nopython True
ref_nopython True
--------------- fullgraph=False
guarded_backend_cache.nopython False
ref_nopython True
guarded_backend_cache.nopython False
ref_nopython False
--------------- fullgraph=True
guarded_backend_cache.nopython True
ref_nopython False
guarded_backend_cache.nopython True
ref_nopython True
Some tests were not passing in the CI but do pass locally, not sure why. There was another as well, but I did not compile with xnnpack, so I'm not sure whether that one passes locally.
Hi, I'll be away for two weeks from next week, but I'm happy to make modifications here by Friday. It would be cool to have this in 2.3.
any update @anijain2305? Having a
@fxmarty Can you share why you need this? The changes required for this are quite awkward. Wondering if you have a specific problem that we can fix so that this flag is not needed.
Three use cases where I would have wanted this:
Basically, I would want to have a different execution path depending on fullgraph and/or export.
any thoughts @anijain2305?

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as

any interest?
* update non-causal mask for sdpa
* add test
* update docstrings
* add one more test
* fix cross attention bug
* gentler atol/rtol
This PR fixes https://pytorch.slack.com/archives/C033H6DJSJU/p1708510833453919 and allows implementing a different code path depending on whether torch.compile is called with the argument fullgraph=True.

Example (see also the unit test):
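As a rough illustration of the dispatch pattern the PR description alludes to, here is a self-contained sketch. `is_fullgraph_compiling` is the function this PR proposes; it is stubbed out below with a plain module-level flag so the snippet runs standalone, and the two code paths (`"fullgraph"` vs `"default"`) are made-up placeholders, not anything from the actual diff.

```python
# Stand-in for the proposed torch._dynamo.is_fullgraph_compiling(); in the
# PR it would return True only while Dynamo traces with fullgraph=True.
_tracing_fullgraph = False

def is_fullgraph_compiling():
    return _tracing_fullgraph

def forward(x):
    if is_fullgraph_compiling():
        # fullgraph path: avoid constructs that would force a graph break,
        # e.g. always materialize a mask instead of branching on tensor data.
        return ("fullgraph", x)
    # default/eager path: free to use data-dependent control flow.
    return ("default", x)

print(forward(1))          # ('default', 1)
_tracing_fullgraph = True  # simulate tracing under fullgraph=True
print(forward(1))          # ('fullgraph', 1)
```

The model code stays a single function; which branch Dynamo captures depends on the value the query function returns at trace time.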
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @aakhundov