[Operator] Update batch_matmul to use Hidet Script#207

Merged
hjjq merged 28 commits into hidet-org:main from hjjq:script
May 4, 2023
Conversation

Collaborator

@hjjq commented May 2, 2023

  • The goal is to remove the schedules written with IRBuilder (under graph.ops.schedules.cuda) and replace them with Hidet Script versions under graph.ops.definitions. This PR focuses on matmul.
  • Copied graph.ops.schedules.cuda.common into graph.ops.definitions.utils.schedule_utils to avoid an import dependency loop.
  • Loop-unroll attributes were added to the mma configs to allow unrolling in the mma version.
  • The wmma version will be removed, with no replacement.
  • Set the maximum tuning space to 2000 schedules.
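The last bullet caps how many candidate schedules the autotuner will consider. A minimal sketch of that idea, assuming a simple cross-product tuning space (the function and parameter names here are illustrative, not Hidet's actual API):

```python
# Hypothetical sketch: capping a schedule tuning space at a fixed maximum,
# as the PR does with 2000 schedules. Not the actual Hidet implementation.
import itertools

MAX_TUNE_SPACE = 2000  # cap stated in the PR description


def enumerate_schedules(block_sizes, warp_sizes, unroll_factors,
                        limit=MAX_TUNE_SPACE):
    """Yield at most `limit` candidate configs from the cross product of knobs."""
    space = itertools.product(block_sizes, warp_sizes, unroll_factors)
    return list(itertools.islice(space, limit))


# Example: 3 * 2 * 4 = 24 candidates, well under the cap, so all are kept.
candidates = enumerate_schedules(
    block_sizes=[(64, 64), (128, 64), (128, 128)],
    warp_sizes=[(32, 32), (64, 32)],
    unroll_factors=[1, 2, 4, 8],
)
```

Capping the space keeps tuning time bounded even when adding new knobs (such as the unroll attributes above) would otherwise blow up the number of candidates.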

@hjjq hjjq merged commit 3000aea into hidet-org:main May 4, 2023
@hjjq hjjq deleted the script branch May 4, 2023 15:09
vadiklyutiy added a commit that referenced this pull request Jul 22, 2024
- Adapted our scripts to use `mode` from `torch.compile`
- Changed `regroup_modules` from `build_ir_module_batch` so it does not create jobs larger than `MAX_JOB_PER_WORKER` (fixes issue #207)
- Trimmed/optimized the tests for the torch backend:
  - moved densenet121 to slow
  - moved resnet50 to slow, but added resnet18

(The last two points came from attempts to enable `mode='max-autotune'` in tests; there are still additional issues.)
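The regrouping fix above is essentially chunking a list of modules into compilation jobs with a bounded size. A minimal sketch under that assumption (the names mirror the commit message but the body is illustrative, not Hidet's actual code):

```python
# Hypothetical sketch: regroup IR modules into jobs no larger than
# MAX_JOB_PER_WORKER, as the commit message describes.
MAX_JOB_PER_WORKER = 8  # assumed value, for illustration only


def regroup_modules(modules, max_per_job=MAX_JOB_PER_WORKER):
    """Split `modules` into consecutive jobs of at most `max_per_job` items."""
    return [modules[i:i + max_per_job]
            for i in range(0, len(modules), max_per_job)]


jobs = regroup_modules(list(range(20)))
```

With 20 modules and a cap of 8, this yields jobs of sizes 8, 8, and 4; no worker ever receives more than `MAX_JOB_PER_WORKER` modules, which is the property the fix restores.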
vadiklyutiy added a commit that referenced this pull request Jul 23, 2024
(same commit message as above)
vadiklyutiy added a commit that referenced this pull request Dec 26, 2024
(same commit message as above)