[docstring] Fix docstrings for CodeGenConfig, CodeGenTokenizer, CodeGenTokenizerFast #26821
Conversation
@ydshieh Hi, can you review this?
force-pushed from f3ef48f to a757614
Looking at the usage of `n_ctx` in the file src/transformers/models/codegen/configuration_codegen.py, I think this is not the context length. Could you double-check it too 🙏.
Sorry for the delayed response; only now have I had some time to work on this. I've stepped through the initialization code line by line and, if I am not completely missing something, `n_ctx` doesn't affect anything.
In `CodeGenConfig`, `n_ctx` is only saved as an instance variable and is not present in `attribute_map`. In modeling_codegen.py, `n_ctx` appears in only 2 places:
- The docstring for `CodeGenAttention._merge_heads` (that's what made me think that `n_ctx` was used to implicitly set the context window size)
- In `CodeGenModel.__init__`: `self.rotary_dim = min(config.rotary_dim, config.n_ctx // config.num_attention_heads)`. Here it is used to set `self.rotary_dim`, but it is not updated in the config, which is used in other places. And `self.rotary_dim` is not used in `CodeGenModel`.
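To make that data flow concrete, here is a minimal stand-alone sketch (plain Python, no transformers required; `FakeConfig` and `init_rotary_dim` are hypothetical names, not library code) showing that the `min(...)` result only lands in a local value while the config object is never written back:

```python
# Hypothetical sketch of the single place n_ctx is read; not transformers code.

class FakeConfig:
    def __init__(self, rotary_dim, n_ctx, num_attention_heads):
        self.rotary_dim = rotary_dim
        self.n_ctx = n_ctx
        self.num_attention_heads = num_attention_heads


def init_rotary_dim(config):
    # Mirrors the line in CodeGenModel.__init__:
    # self.rotary_dim = min(config.rotary_dim, config.n_ctx // config.num_attention_heads)
    return min(config.rotary_dim, config.n_ctx // config.num_attention_heads)


cfg = FakeConfig(rotary_dim=64, n_ctx=2048, num_attention_heads=16)
local_rotary_dim = init_rotary_dim(cfg)

print(local_rotary_dim)  # 64: min(64, 2048 // 16) = min(64, 128)
print(cfg.rotary_dim)    # still 64: the config itself is never updated
```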
To further test this, I wrote a script which sets `n_ctx` in the config to a bogus value, and everything works fine.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig
from transformers.models.codegen import CodeGenForCausalLM

checkpoint = "Salesforce/codegen-350M-mono"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
config = AutoConfig.from_pretrained(checkpoint)

if True:
    config.n_ctx = -1  # This works fine
else:
    config.rotary_dim = -1  # This throws a RuntimeError during init

# import pdb; pdb.set_trace()
model = CodeGenForCausalLM(config=config)
# model = AutoModelForCausalLM.from_pretrained(checkpoint, config=config)  # This behaves the same
model.transformer.rotary_dim = "a"  # This does not break anything

text = "def hello_world():"
completion = model.generate(**tokenizer(text, return_tensors="pt"))
print(tokenizer.decode(completion[0]))
```

Does this seem right to you, or are there other aspects I should consider? Otherwise, what should be done with `n_ctx` (just write in a docstring that it's not used, so as not to break anyone downstream who might be using it)?
Hi @daniilgaltsev You are right, this attribute is kinda confusing. But from the two places you mentioned, it looks more like the dimension of the rotary embedding. Let's just keep it this way, despite its confusing usage.
Thank you a lot!
Sorry, I still don't quite understand. Are you saying that n_ctx is actually used somewhere (as rotary_dim)? Because I can't find a place where it actually affects anything except setting an unused instance variable. To me it looks more like either a bug or code that was replaced by explicitly adding rotary_dim to config and forgetting to delete the old unused code.
No, I'm not saying it is really used. But from the two places
`config.n_ctx // config.num_attention_heads`
and
"Merges attn_head_size dim and num_attn_heads dim into n_ctx"
it looks like `n_ctx` means the dimension of something rather than "Number of tokens provided in a single pass" (sequence length), even though the name suggests it is a sequence length.
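For what it's worth, a shape-level sketch (pure Python; `merge_heads_shape` is a hypothetical helper, not transformers code) of what the quoted `_merge_heads` docstring describes. The merged axis is `num_attn_heads * attn_head_size` (the hidden size), not a sequence length, which is exactly why calling it `n_ctx` is confusing:

```python
def merge_heads_shape(shape, num_attn_heads, attn_head_size):
    # (batch, seq_len, num_attn_heads, attn_head_size)
    #   -> (batch, seq_len, num_attn_heads * attn_head_size)
    *leading, heads, head_dim = shape
    assert (heads, head_dim) == (num_attn_heads, attn_head_size)
    return tuple(leading) + (heads * head_dim,)


# The merged axis is the hidden size (16 * 64 = 1024), not the 10-token sequence.
print(merge_heads_shape((2, 10, 16, 64), 16, 64))  # (2, 10, 1024)
```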
Ok, I see. But what should we put into its description? Since it does not affect anything, it would be misleading to indicate otherwise, because people would expect to see an effect.
Just put
This attribute is used in `CodeGenModel.__init__` without any real effect.
We can add a warning later in another PR. Let's move on and merge this.
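For illustration, a sketch (hypothetical class name, and not necessarily the exact merged wording) of where such a note would sit in the config docstring, following the usual `Args:` format:

```python
class CodeGenConfigSketch:
    """Sketch of a CodeGen-style config docstring entry (hypothetical).

    Args:
        n_ctx (`int`, *optional*, defaults to 2048):
            This attribute is used in `CodeGenModel.__init__` without any real effect.
    """


print("without any real effect" in CodeGenConfigSketch.__doc__)  # True
```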
Ok, I've pushed the updated n_ctx description. Thank you for the help.
ydshieh
left a comment
Just a few suggestions and one question for `n_ctx` 🙏
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
force-pushed from 59b9d61 to f9d0f5a
ydshieh
left a comment
Thank you @daniilgaltsev
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
* remove docstrings CodeGen from objects_to_ignore
* autofix codegen docstrings
* fill in the missing types and docstrings
* fixup
* change descriptions to be in a separate line
* apply docstring suggestions from code review

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* update n_ctx description in CodeGenConfig

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
What does this PR do?
Fixes the docstrings for CodeGen following #26638
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.