Gemlite fixes #1432
Summary:
shapes need to be divisible by 128 or they will not work with gemlite
need fp32 accumulation for groupsize None on int4
Test Plan:
python test_integration.py -k "test_gemlite" (new test for non divisible shape)
python generate.py --checkpoint_path $CHECKPOINT_PATH/$MODEL_REPO/model.pth --compile --precision float16 --quantization gemlite-8-4-None --write_result benchmark_results.txt
python generate.py --checkpoint_path $CHECKPOINT_PATH/$MODEL_REPO/model.pth --compile --precision float16 --quantization gemlite-32-4-None --write_result benchmark_results.txt
(previously these gave nonsense responses)
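The shape constraint in the summary can be sketched as a simple eligibility check. This is a minimal illustration, not code from this PR: the helper name `is_gemlite_compatible` is hypothetical, and applying the multiple-of-128 rule to both `in_features` and `out_features` is an assumption based on the wording "shapes need to be divisible by 128". How ineligible layers are handled is not stated here.

```python
def is_gemlite_compatible(in_features: int, out_features: int, multiple: int = 128) -> bool:
    """Hypothetical helper: True when both dimensions are multiples of `multiple`."""
    return in_features % multiple == 0 and out_features % multiple == 0


# 11008 = 86 * 128, so this shape passes the check.
print(is_gemlite_compatible(4096, 11008))
# 1000 is not a multiple of 128, so this shape fails the check.
print(is_gemlite_compatible(4096, 1000))
```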
    return abs(other_event.event_time - self.event_time) * 1000


def get_arch_name() -> str:
Why these changes? Is this a rebase issue?
Summary: Resubmitting fixes from @HDCharles in pytorch#1432 since that seems to have issues with rebase
Test Plan: see pytorch#1432
if _layout.group_size == None and _layout.bit_width == 4:
    from gemlite.core import GEMLITE_ACC_DTYPE
    from gemlite.dtypes import DType
    GEMLITE_ACC_DTYPE[DType.FP16] = DType.FP32
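As background for why the accumulator dtype matters, here is a small standalone sketch of precision loss when a long sum is kept in fp16. It does not use gemlite and does not reproduce its kernels; it emulates fp16 and fp32 rounding with the standard `struct` module. The link to `group_size=None` (one long reduction over the whole input dimension) is an assumption, not something stated in this PR.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot(a, b, round_acc):
    """Dot product of fp16 inputs; the running sum is rounded by round_acc after every add."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = round_acc(acc + to_fp16(x) * to_fp16(y))
    return acc


k = 4096  # illustrative reduction length
a = [1.0] * k
b = [1.0] * k

fp16_result = dot(a, b, to_fp16)  # running sum kept in fp16
fp32_result = dot(a, b, to_fp32)  # running sum kept in fp32
print(fp16_result, fp32_result)
```

With these inputs the fp16 running sum stalls at 2048.0, because 2049 is not representable in half precision and rounds back to 2048, while the fp32 running sum reaches 4096.0.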
This override will only work when all the layers use the same group_size, which is OK for now.
The other option would be to use https://github.com/mobiusml/gemlite/blob/master/gemlite/core.py#L87, but for now let's keep it like this.
I tested this manually; it works in all cases, even when there are different group sizes.
I mean when different layers use different settings within the same model, but let's not worry about that!
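One way to read the concern in this thread is that the override mutates a module-level table, so it applies to every layer in the process rather than only to the layer that triggered it. The sketch below illustrates that reading with a stand-in dictionary. `ACC_DTYPE`, `Layer`, and `configure` are hypothetical names, not gemlite's API, and whether gemlite reads the value at construction time or at call time is not stated here.

```python
# Stand-in for a module-level table such as the one imported in the diff
# (an assumption for illustration, not gemlite's actual data structure).
ACC_DTYPE = {"fp16": "fp16"}


class Layer:
    """Hypothetical layer that looks up the shared accumulator setting on demand."""

    def __init__(self, name, group_size, bit_width):
        self.name = name
        self.group_size = group_size
        self.bit_width = bit_width

    def acc_dtype(self):
        return ACC_DTYPE["fp16"]


def configure(layers):
    """Mirror the condition in the diff: one matching layer flips the shared setting."""
    for layer in layers:
        if layer.group_size is None and layer.bit_width == 4:
            ACC_DTYPE["fp16"] = "fp32"


layers = [Layer("a", None, 4), Layer("b", 64, 4)]
configure(layers)
seen = [layer.acc_dtype() for layer in layers]
print(seen)  # both layers report the same, shared setting
```

Under this reading, results stay correct for mixed configurations, but a layer that did not need fp32 accumulation still gets it.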
Summary: Resubmitting pytorch#1432 since it has some rebase issues and we want to merge the fix asap
Test Plan: see pytorch#1432
Landed in #1435; please feel free to submit any follow-up fixes.