[Falcon] Set use_cache=False before creating presents which relies on use_cache#26328
LysandreJik merged 4 commits into huggingface:main from yundai424:main
presents=None when use_cache is set to False for activation ckpt
ArthurZucker left a comment
That's a good catch!
        "`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`..."
    )
    use_cache = False
    presents = None
This should be done outside the loop, similar to this:
    if self.gradient_checkpointing and self.training:
        if use_cache:
            logger.warning_once(
                "`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`..."
            )
            use_cache = False
    present_key_values = () if use_cache else None
    all_self_attentions = () if output_attentions else None
    all_hidden_states = () if output_hidden_states else None
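The ordering issue discussed in this review can be illustrated with a minimal, self-contained sketch. This is not the Falcon implementation: the function names `init_presents_buggy` and `init_presents_fixed` and their plain boolean arguments are hypothetical stand-ins for the model's `forward` logic, used only to show why the cache container should be created after `use_cache` has been finalized.

```python
def init_presents_buggy(use_cache, gradient_checkpointing, training):
    # Hypothetical illustration: the container is created first,
    # from the caller's original use_cache value.
    presents = () if use_cache else None
    if gradient_checkpointing and training and use_cache:
        # Override happens too late: presents is already an empty tuple.
        use_cache = False
    return use_cache, presents


def init_presents_fixed(use_cache, gradient_checkpointing, training):
    # Resolve the incompatibility first...
    if gradient_checkpointing and training and use_cache:
        use_cache = False
    # ...then derive the container from the final use_cache value.
    presents = () if use_cache else None
    return use_cache, presents


if __name__ == "__main__":
    print(init_presents_buggy(True, True, True))
    print(init_presents_fixed(True, True, True))
```

In the buggy ordering, `use_cache` ends up `False` while `presents` stays an empty tuple rather than `None`, so any later code that checks `presents is not None` would take the wrong branch. In the fixed ordering the two values always agree.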
younesbelkada left a comment
Great catch! Can you run the styling checks?
    make fixup
Then we can merge, I think.
Hi @yundai424, thanks a lot for iterating. To move forward with the PR, could you merge your branch with the main branch?

Hi @younesbelkada, are you referring to merging to HF main? 🤔

Hi @yundai424
    git fetch upstream
    git merge upstream/main
    git push

Oh cool, I see what you mean. Merged, thanks! @younesbelkada
younesbelkada left a comment
Clean to me, thanks!
LysandreJik left a comment
Great, thanks @yundai424!
What does this PR do?
Fixes #26327
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.