fix to jamba config, asserting attention and expert offset #33316
amyeroberts merged 10 commits into huggingface:main from ErezSC42:jamba-config-fix
Conversation
|
I think it would be more appropriate to move this to a separate (private) function and, instead of an assertion error, raise a ValueError. |
|
Ah and maybe add a simple test to check if the expected errors are raised 👀 |
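A minimal sketch of the kind of test suggested here, assuming pytest and the post-merge behavior that an offset equal to or larger than its period raises a ValueError (the test names and exact kwargs are illustrative, not the PR's actual tests):

import pytest
from transformers import JambaConfig

def test_attn_offset_must_be_smaller_than_period():
    # offset == period should be rejected
    with pytest.raises(ValueError):
        JambaConfig(attn_layer_period=4, attn_layer_offset=4)

def test_expert_offset_must_be_smaller_than_period():
    # offset > period should be rejected
    with pytest.raises(ValueError):
        JambaConfig(expert_layer_period=2, expert_layer_offset=3)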
def _check_supported_offset(t_: str, period: int, offset: int):
    if offset >= period:
        raise ValueError(f"{t_} layer offset ({offset}) must be smaller than {t_} layer period ({period})")
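For context, a minimal sketch of how a standalone helper like this gets wired into the config's __init__ (the period/offset argument names follow the Jamba config; the trimmed-down class body is an assumption, not the merged code):

from transformers import PretrainedConfig

def _check_supported_offset(t_: str, period: int, offset: int):
    # the t_ parameter is renamed to property_ later in this thread
    if offset >= period:
        raise ValueError(f"{t_} layer offset ({offset}) must be smaller than {t_} layer period ({period})")

class JambaConfig(PretrainedConfig):
    def __init__(self, attn_layer_period=8, attn_layer_offset=4,
                 expert_layer_period=2, expert_layer_offset=1, **kwargs):
        # validate before storing, so a bad config fails fast at construction time
        _check_supported_offset("attention", attn_layer_period, attn_layer_offset)
        _check_supported_offset("expert", expert_layer_period, expert_layer_offset)
        self.attn_layer_period = attn_layer_period
        self.attn_layer_offset = attn_layer_offset
        self.expert_layer_period = expert_layer_period
        self.expert_layer_offset = expert_layer_offset
        super().__init__(**kwargs)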
Just two small nits:
- Could you change t_? It's not really descriptive.
- Maybe move it under JambaConfig.
|
Jamba, cc @ArthurZucker |
|
I changed t_ to be more descriptive, but since the _check_supported_offset function is called inside JambaConfig's __init__, it cannot be a method defined inside the class |
|
Thank you! For reference: transformers/src/transformers/models/olmo/configuration_olmo.py, lines 150 to 181 in f38590d. At least that's what I had in mind. |
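A sketch of the method-on-the-config pattern the reviewer is pointing to, modeled on how OLMo's config runs its validation from inside __init__ (this reduced class body is an assumption, not the merged Jamba code):

from transformers import PretrainedConfig

class JambaConfig(PretrainedConfig):
    def __init__(self, attn_layer_period=8, attn_layer_offset=4, **kwargs):
        self.attn_layer_period = attn_layer_period
        self.attn_layer_offset = attn_layer_offset
        # methods are resolved on the class, so calling one during __init__
        # works fine; the helper does not have to live outside JambaConfig
        self._check_supported_offset("attention", attn_layer_period, attn_layer_offset)
        super().__init__(**kwargs)

    def _check_supported_offset(self, property_: str, period: int, offset: int):
        if offset >= period:
            raise ValueError(f"{property_} layer offset ({offset}) must be smaller than {property_} layer period ({period})")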
|
my bad, it should be working now |
|
You can run |
|
Hey, all the tests passed, is this PR good for merging? |
|
It needs a review from a core maintainer, so we can just wait. |
|
Maybe cc @amyeroberts |
amyeroberts
left a comment
LGTM - thanks for adding!
fix to jamba config, asserting attention and expert offset (huggingface#33316)
* fix to jamba config, asserting attention and expert offset
* fix formatting
* fix formatting
* fix formatting
* changed to error raise instead of assertion, added unittests
* fix
* changed t_ to property_
* changed t_ to property_
* quickfix
* ran code styler
What does this PR do?
Ensures instantiated models are supported by validating the attention layer offset and expert layer offset values, raising a ValueError when an offset is not smaller than its period