Fix Dynamo `lru_cache` warnings during `torch.compile` (#13384)
sayakpaul merged 4 commits into huggingface:main

Conversation
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Hi @sayakpaul. Would you please review this PR? Thanks!
```diff
         "_parallel_config": parallel_config,
     }
-    if is_torch_version(">=", "2.5.0"):
+    if _CAN_USE_FLEX_ATTN:
```
Is this a safe replacement? If so, could you elaborate further?
Added comments for it.
sayakpaul left a comment
Thanks for the PR. Left one comment.
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Hi @sayakpaul. These failures are unrelated to this PR. They are caused by a missing key in peft==0.18.2.dev0's `_MOE_TARGET_MODULE_MAPPING` ('llava', 'qwen2_vl'), which is a pre-existing issue in the PEFT dev build. My changes only touch
sayakpaul left a comment
Thanks for the PR! Failing test is unrelated.
…3384)

* fix compile issue
* compile friendly
* add comments

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
What does this PR do?
Fixes Dynamo `lru_cache` warnings when using `torch.compile` on diffusion pipelines. Two changes:

- `attention_dispatch.py`: `dispatch_attention_fn` calls `is_torch_version(">=", "2.5.0")` at runtime, which is `@lru_cache`-wrapped. Replace it with the existing module-level constant `_CAN_USE_FLEX_ATTN` so Dynamo never traces into it.
- `torch_utils.py`: `lru_cache_unless_export` only bypasses `lru_cache` during `torch.export` (`is_exporting`). Add an `is_compiling` check so `torch.compile` also bypasses the cache wrapper.

Reproduce
Output before the fix:

Output after the fix: