[Graph] Quantization graph pass #299

Closed
Aalanli wants to merge 7 commits into hidet-org:main from Aalanli:quantization-graph-pass

@Aalanli (Contributor) commented Jun 30, 2023

I am unsure whether this is the best way to implement the quantization graph pass: because we wait for all the weights to be loaded before running the pass, it gives no memory savings. Please see hidet/examples/quantization/gpt2.py for an example.
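To make the memory concern concrete, here is a toy accounting sketch. It is not hidet code and allocates no tensors. It assumes fp16 source weights, int8 quantized weights, and that each fp16 weight can be freed as soon as its int8 copy exists. The function names are hypothetical.

```python
# Toy byte accounting only: an assumption-laden model, not a measurement.
FP16_BYTES = 2
INT8_BYTES = 1


def peak_bytes_quantize_after_load(weight_sizes):
    """Peak memory when every fp16 weight is resident before the pass runs."""
    resident = sum(n * FP16_BYTES for n in weight_sizes)
    peak = resident
    for n in weight_sizes:
        resident += n * INT8_BYTES   # int8 copy is created
        peak = max(peak, resident)
        resident -= n * FP16_BYTES   # fp16 original is released
    return peak


def peak_bytes_quantize_while_loading(weight_sizes):
    """Peak memory when each weight is quantized right after it is loaded."""
    resident = 0
    peak = 0
    for n in weight_sizes:
        resident += n * FP16_BYTES   # load one fp16 weight
        peak = max(peak, resident)
        resident += n * INT8_BYTES   # int8 copy is created
        peak = max(peak, resident)
        resident -= n * FP16_BYTES   # fp16 original is released
    return peak
```

Under these assumptions, four weights of 1000 elements each peak at 9000 bytes when quantized after loading and at 6000 bytes when quantized while loading.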

There are some considerations:

  1. Usually, automatic quantization is done at the layer level. I have written some subgraph rewrite passes with some heuristics to simulate this, but it's not the same. We could either give ops an optional attribute hinting at the parent layer, or insert quantized versions of layers higher up the chain, which leads to point 2.
  2. We could insert quantized layers in the interpreter of the dynamo frontend or, if the model is written in hidet, do some monkey patching like torch does.
  3. Or we could do everything at the graph level.
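The "optional attribute hinting at the parent layer" idea from point 1 could look roughly like the sketch below. It is a standalone toy IR in plain Python, not hidet's graph API: `Op`, `layer_hint`, `quantized_matmul`, and `quantize_pass` are all hypothetical names, weights are flat lists of floats, and the quantization is a simple per-tensor symmetric int8 scheme chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Op:
    name: str                          # e.g. "matmul"
    inputs: List[str]                  # names of input tensors
    output: str                        # name of the output tensor
    layer_hint: Optional[str] = None   # hypothetical: parent layer name


def quantize_symmetric_int8(values):
    """Per-tensor symmetric quantization; returns (int8 values, scale)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def quantize_pass(ops, weights, target_layers):
    """Rewrite matmuls that belong to target layers and use a constant weight."""
    new_ops = []
    store = dict(weights)
    for op in ops:
        is_target = (
            op.name == "matmul"
            and op.layer_hint in target_layers
            and op.inputs[1] in weights
        )
        if is_target:
            w_name = op.inputs[1]
            q_name = w_name + ".int8"
            store[q_name] = quantize_symmetric_int8(weights[w_name])
            new_ops.append(
                Op("quantized_matmul", [op.inputs[0], q_name], op.output, op.layer_hint)
            )
        else:
            new_ops.append(op)
    # Drop weights that no remaining op references (shared weights survive).
    used = {name for op in new_ops for name in op.inputs}
    return new_ops, {k: v for k, v in store.items() if k in used}
```

With a hint like this, the pass selects ops by layer name instead of relying on subgraph-matching heuristics. The trade-off is that whichever frontend builds the graph has to populate the attribute.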

@Aalanli requested a review from yaoyaoding July 4, 2023 01:06
@Aalanli closed this Jul 11, 2023
@Aalanli deleted the quantization-graph-pass branch September 27, 2023 18:09
vadiklyutiy pushed a commit that referenced this pull request Dec 19, 2024
…. ) (#294)

[Ir][Primitives] add vectorized conversion instructions
[Ir][CuTe] add reduce primitives in cute (#295)
[Ir][CuTe] add mma primitives (#296)
[Ir][CuTe] add other primitives in cute (#297)
[Transforms][CuTe] add instruction selection pass (#298)
[Transforms][CuTe] add resolve bank conflict pass (#299)
[Transforms][CuTe] add resolve auto keywords pass (#300)
[Transforms][CuTe] add shared memory allocation pass (#301)
[Transforms][CuTe] add vectorize elementwise operation pass (#302)
[Transforms][CuTe] add analysis pass (#303)
[Transforms][CuTe] add canonicalization pass (#304)
[Transforms][CuTe] add deadcode elimination pass (#305)
[Transforms][CuTe] refactor cute lowering pass (#306)
[Graph][Ops] matmul cute (#307)
[Ir] cute miscs (#308)
[Tests] cute tests (#309)
[Chore] fix ci (#313)
---------

Co-authored-by: xiaocenxiaocen <xiao.zhang@centml.ai>
vadiklyutiy pushed a commit that referenced this pull request Dec 20, 2024
vadiklyutiy pushed a commit that referenced this pull request Dec 26, 2024