
[Refactor] Remove schedules submodule under hidet.graph.ops#269

Merged
yaoyaoding merged 3 commits into hidet-org:main from yaoyaoding:refactor-ops-dir
Jun 3, 2023

Conversation

@yaoyaoding
Member

We have rewritten all schedule templates in Hidet Script, so this PR removes the `schedules` submodule that previously held the IR Builder-based schedule templates.

@yaoyaoding yaoyaoding merged commit ca607f9 into hidet-org:main Jun 3, 2023
@yaoyaoding yaoyaoding deleted the refactor-ops-dir branch June 3, 2023 22:30
vadiklyutiy added a commit that referenced this pull request Jul 22, 2024
In this review, I rework how the best candidate is picked.

A statistical t-test is now used to decide which schedule is better. This code has been moved to `python/hidet/utils/benchmark/bench.py`, together with other similar code.
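
As a rough illustration of the idea, here is a minimal sketch of a Welch (unequal-variance) t-test applied to two sets of latency measurements. It is not the code in `bench.py`: the function names `welch_t` and `is_faster`, the threshold of 2.0, and the choice of Welch's variant are all assumptions made for this example. The sample data are the ResNet-50 latencies reported in this message.

```python
import math
import statistics


def welch_t(a, b):
    """Return (t, dof) for Welch's unequal-variance t-test on samples a and b."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    # Squared standard errors of each mean, from the sample variances.
    se2_a = statistics.variance(a) / len(a)
    se2_b = statistics.variance(b) / len(b)
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, dof


def is_faster(candidate, incumbent, t_threshold=2.0):
    """Hypothetical decision rule: accept the candidate only if its mean
    latency is lower than the incumbent's by more than t_threshold
    standard errors. The threshold value is an assumption."""
    t, _ = welch_t(incumbent, candidate)
    return t > t_threshold


# Latencies copied from the "Before" and "After" runs in this message.
before = [0.7848, 0.7803, 0.7808, 0.7839, 0.7821,
          0.7887, 0.7785, 0.7843, 0.7857, 0.7939]
after = [0.7717, 0.7708, 0.7679, 0.7662, 0.7717,
         0.7715, 0.7698, 0.7692, 0.7706, 0.7720]

t_stat, dof = welch_t(before, after)
print(f"t = {t_stat:.2f}, dof = {dof:.1f}, after is faster: {is_faster(after, before)}")
```

Comparing means relative to their combined standard error, rather than comparing raw means, keeps measurement noise from flipping the choice between two candidates whose latencies are nearly equal.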

**Performance improvement for bs=1, A10G**

`python tests/benchmarks/bench_vision.py resnet50 --params 1x3x224x224 --dtype float16`

_**Before**_
0.7848
0.7803
0.7808
0.7839
0.7821
0.7887
0.7785
0.7843
0.7857
0.7939
median = 0.7841
stddev = 0.45%

_**After**_
0.7717
0.7708
0.7679
0.7662
0.7717
0.7715
0.7698
0.7692
0.7706
0.7720
median = 0.7707 (**improvement 1.7%**)
stddev = 0.19% (**improvement 2.37x**)
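
The summary figures above can be re-derived from the raw runs with the standard library. This is only a sketch: the helper name `summarize` is hypothetical, and reading the reported "stddev" as the sample standard deviation of the raw values multiplied by 100 is an assumption that happens to match the numbers listed.

```python
import statistics

# Raw latencies copied from the "Before" and "After" lists above.
before = [0.7848, 0.7803, 0.7808, 0.7839, 0.7821,
          0.7887, 0.7785, 0.7843, 0.7857, 0.7939]
after = [0.7717, 0.7708, 0.7679, 0.7662, 0.7717,
         0.7715, 0.7698, 0.7692, 0.7706, 0.7720]


def summarize(samples):
    """Return (median, sample standard deviation) of a list of latencies."""
    return statistics.median(samples), statistics.stdev(samples)


med_before, sd_before = summarize(before)
med_after, sd_after = summarize(after)

# Relative reduction of the median latency, in percent.
median_gain_pct = (med_before - med_after) / med_before * 100
# How many times smaller the run-to-run spread became.
stddev_ratio = sd_before / sd_after

print(f"median: {med_before:.4f} -> {med_after:.4f} ({median_gain_pct:.1f}% lower)")
print(f"stddev: {sd_before:.4f} -> {sd_after:.4f} ({stddev_ratio:.2f}x smaller)")
```

The ratio computed from the unrounded standard deviations can differ slightly in the last digit from one computed from the rounded values shown above.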


**Compilation time improvement**
g5.16xlarge instance, 64 threads / 32 cores

`time python tests/benchmarks/bench_op.py batch_matmul --params 1x4096x4096,1x4096x4096 --dtype float16`

_**Before**_
real    5m9s

_**After**_
real    2m27s
vadiklyutiy added a commit that referenced this pull request Jul 23, 2024
vadiklyutiy added a commit that referenced this pull request Dec 26, 2024