[Quantization] fix quantization pass bug #355

Merged
Aalanli merged 2 commits into hidet-org:main from Aalanli:quant-improvement
Aug 29, 2023
Conversation

@Aalanli Aalanli commented Aug 24, 2023

For LLaMA 7B, decoding with no prefill:

fp16, 128 tokens:
org_t: 3.048099994659424 s
avg_t: 0.0044784750789403915 s

int8, 128 tokens:
org_t: 2.2229115962982178 s
avg_t: 0.0038476772606372833 s
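The PR does not show the timing harness itself. As a rough sketch of how such numbers are typically collected (names like `step_fn` are hypothetical stand-ins, not hidet API; `org_t` presumably includes one-time setup such as compilation, while `avg_t` is the per-token average over the decode loop):

```python
import time

def benchmark_decode(step_fn, num_tokens=128):
    """Time an autoregressive decode loop.

    Returns (total, avg): total wall-clock time for the whole loop
    (comparable to ``org_t`` above) and the average per-step latency
    (comparable to ``avg_t``). ``step_fn`` stands in for one decoding step.
    """
    start = time.perf_counter()
    per_step = []
    for _ in range(num_tokens):
        t0 = time.perf_counter()
        step_fn()  # one token of decoding
        per_step.append(time.perf_counter() - t0)
    total = time.perf_counter() - start
    avg = sum(per_step) / len(per_step)
    return total, avg

# Dummy workload standing in for a real decode step.
total, avg = benchmark_decode(lambda: sum(range(1000)), num_tokens=128)
```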

@Aalanli Aalanli merged commit 021b067 into hidet-org:main Aug 29, 2023
@Aalanli Aalanli deleted the quant-improvement branch August 29, 2023 00:03
vadiklyutiy added a commit that referenced this pull request Jul 22, 2024
Introduce `add_hint_pass`. It adds `__builtin_assume(...)` to the generated .cu code, which helps nvcc understand the bounds of `threadIdx` and `blockIdx` and optimize the code better.

**Performance improvements.**

Models

| model | latency | prev_latency | ratio (%) |
|--------|--------|--------|--------|
| bert-base-uncased | 19.8138 | 20.2316 | 2.109 |
| densenet121 | 35.1161 | 36.7627 | 4.689 |
| efficientnet_b0 | 18.9451 | 19.278 | 1.757 |
| mobilenet_v2 | 11.5944 | 11.8764 | 2.432 |
| resnet50 | 29.4878 | 29.9935 | 1.715 |
| vit_b_16 | 125.787 | 123.672 | -1.681 |

Operators

| operator | latency | prev_latency | ratio (%) |
|--------|--------|--------|--------|
| attn | 1.50402 | 1.50131 | -0.18 |
| attn | 0.219707 | 0.227568 | 3.578 |
| attn_mask_add | 1.5892 | 1.62516 | 2.263 |
| attn_mask_add | 0.226317 | 0.226507 | 0.084 |
| batch_matmul | 5.2399 | 5.11547 | -2.375 |
| batch_matmul | 0.0216016 | 0.0223425 | 3.43 |
| conv2d | 0.0347093 | 0.0341758 | -1.537 |
| conv2d | 0.310521 | 0.308458 | -0.664 |
| conv2d_gemm_f16 | 0.142542 | 0.146412 | 2.715 |
| conv2d_gemm_f16 | 2.0421 | 2.07043 | 1.387 |
| matmul_f16 | 2.22432 | 2.30458 | 3.608 |
| matmul_f16 | 0.00888628 | 0.00892615 | 0.449 |
| reduce | 0.01375 | 0.0138618 | 0.813 |
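As an illustration of the kind of hint such a pass emits (a hedged sketch, not the pass's actual output; the kernel, block size, and grid bound below are made up):

```cuda
// Hypothetical generated kernel. With the assume hints, nvcc knows the
// ranges of threadIdx.x and blockIdx.x at compile time and can prove
// tighter bounds on i, enabling e.g. cheaper index arithmetic or the
// elimination of redundant range checks.
__global__ void scale_kernel(float *x, float alpha, int n) {
    __builtin_assume(threadIdx.x < 256);   // block dim fixed at launch
    __builtin_assume(blockIdx.x < 4096);   // grid dim bounded at launch
    int i = blockIdx.x * 256 + threadIdx.x;
    if (i < n) {
        x[i] *= alpha;
    }
}
```

`__builtin_assume` is a compiler hint only: passing a condition that is false at runtime is undefined behavior, so a pass like this must only emit bounds it can guarantee from the launch configuration.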
vadiklyutiy added a commit that referenced this pull request Jul 23, 2024
vadiklyutiy added a commit that referenced this pull request Dec 26, 2024