[Operator] Make Convolution gemms fusible by resolving to batch_matmul by hjjq · Pull Request #279 · hidet-org/hidet

hjjq · 2023-06-15T20:42:38Z

No description provided.

yaoyaoding · 2023-06-16T19:25:50Z

Thanks @hjjq !

Right now we have a significant fixed overhead per model run (#279).

**No cudagraph**

Below are timings, in ms, for a run of an empty model without cudagraph.

Before:

- Inductor overhead is 0.052 = 0.032 + 0.02, where 0.02 is the overhead before entering the compiler and 0.032 is the overhead of Inductor itself.
- Hidet overhead is 0.205 = 0.185 + 0.02.

After:

- Hidet overhead is 0.068 = 0.048 + 0.02.

Hidet's own overhead is reduced from 0.185 ms to 0.048 ms, or by 3.85x.

**cudagraph**

- Before: 0.162 ms
- After: 0.124 ms
- Inductor: 0.88 ms

For cudagraph there is still room for one more improvement (left as a TODO in the code).
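The arithmetic in the comment above can be rechecked with a few lines of Python. This is only a sketch: the helper names `total` and `speedup` are made up for illustration, the timings are copied from the comment, and the cudagraph ratio is derived here rather than stated in the comment.

```python
# Timings in ms, copied from the comment above.
PRE_COMPILER = 0.02  # overhead before entering the compiler


def total(compiler_overhead):
    """Total fixed overhead = compiler overhead + pre-compiler overhead."""
    return round(compiler_overhead + PRE_COMPILER, 3)


def speedup(before, after):
    """How many times smaller `after` is than `before`."""
    return before / after


inductor_total = total(0.032)  # Inductor, no cudagraph
hidet_before = total(0.185)    # Hidet before the change, no cudagraph
hidet_after = total(0.048)     # Hidet after the change, no cudagraph

no_graph_ratio = round(speedup(0.185, 0.048), 2)  # Hidet-only overhead
cudagraph_ratio = round(speedup(0.162, 0.124), 2)  # derived, not quoted above

print(inductor_total, hidet_before, hidet_after)
print(no_graph_ratio, cudagraph_ratio)
```

The totals reproduce the 0.052, 0.205 and 0.068 figures, and the no-cudagraph ratio matches the quoted 3.85x.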

hjjq added 2 commits June 15, 2023 16:37

Make conv2d resolve to batch_matmul

5dad024

do the same for other conv gemms

e3074b4
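Since the PR has no description, here is a minimal sketch of the general idea named in the commit titles: a convolution expressed as an im2col transform followed by a batched matrix multiply. It is written in plain Python on nested lists, assumes stride 1, no padding and a single group, and is not taken from hidet's code; the function names are hypothetical.

```python
def im2col(img, kh, kw):
    """img: [C][H][W] -> matrix [OH*OW][C*kh*kw] (stride 1, no padding)."""
    c, h, w = len(img), len(img[0]), len(img[0][0])
    oh, ow = h - kh + 1, w - kw + 1
    return [
        [img[ci][i + di][j + dj]
         for ci in range(c) for di in range(kh) for dj in range(kw)]
        for i in range(oh) for j in range(ow)
    ]


def batch_matmul(a, b):
    """a: [B][M][K], b: [B][K][N] -> [B][M][N]."""
    out = []
    for a_mat, b_mat in zip(a, b):
        k_dim, n_dim = len(b_mat), len(b_mat[0])
        out.append([
            [sum(row[k] * b_mat[k][n] for k in range(k_dim))
             for n in range(n_dim)]
            for row in a_mat
        ])
    return out


def conv2d_via_batch_matmul(x, w):
    """x: [N][C][H][W], w: [OC][C][KH][KW] -> [N][OC][OH][OW]."""
    n, c = len(x), len(x[0])
    oc, kh, kw = len(w), len(w[0][0]), len(w[0][0][0])
    oh = len(x[0][0]) - kh + 1
    ow = len(x[0][0][0]) - kw + 1
    # Weights as a [C*KH*KW][OC] matrix, flattened in the same
    # (channel, kernel row, kernel col) order as the im2col columns.
    w_mat = [
        [w[o][ci][di][dj] for o in range(oc)]
        for ci in range(c) for di in range(kh) for dj in range(kw)
    ]
    a = [im2col(img, kh, kw) for img in x]  # [N][OH*OW][C*KH*KW]
    b = [w_mat] * n                         # same weights for every image
    y = batch_matmul(a, b)                  # [N][OH*OW][OC]
    # Reshape each [OH*OW][OC] result back to [OC][OH][OW].
    return [
        [[[y[b_i][i * ow + j][o] for j in range(ow)] for i in range(oh)]
         for o in range(oc)]
        for b_i in range(n)
    ]


x = [[[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]]  # one 1-channel 3x3 image
w = [[[[1, 0], [0, 1]]]]                   # one 2x2 filter
print(conv2d_via_batch_matmul(x, w))
```

Once the convolution is a single batched matmul, the image and filter transforms sit before it and the reshape sits after it as separate steps.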

yaoyaoding merged commit d6e431e into hidet-org:main Jun 16, 2023

hjjq deleted the require_prologue branch June 24, 2023 02:51
