[AOTI Eager] Add dynamic shapes support to AOTIPythonKernelHolder by StellarrZ · Pull Request #176018 · pytorch/pytorch

StellarrZ · 2026-02-27T21:23:21Z

Summary:
Add a dynamic parameter to AOTIPythonKernelHolder so that the C++ dispatch path can request dynamic-shape compilation from the Python compile backend. When dynamic_=true, the holder passes dynamic=True to aoti_compile_with_persistent_cache, and the in-memory cache uses rank/dtype/device matching instead of exact size/stride matching. This allows a single compiled kernel to serve multiple input shapes.
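The effect on the in-memory cache can be illustrated with a small, self-contained Python sketch. It is not the PR's C++ code: `TensorInfo`, `cache_key`, `KernelCache`, and `contiguous` are hypothetical names, and modeling the cache as a dictionary keyed by a tuple is an assumption made only to show how relaxed matching lets one compiled kernel be reused across shapes.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple


@dataclass(frozen=True)
class TensorInfo:
    # Hypothetical stand-in for the per-tensor metadata a cache entry records.
    dtype: str
    device: str
    sizes: Tuple[int, ...]
    strides: Tuple[int, ...]


def cache_key(inputs: Sequence[TensorInfo], dynamic: bool) -> tuple:
    """Build a lookup key; the dynamic key keeps only dtype, device, and rank."""
    if dynamic:
        return tuple((t.dtype, t.device, len(t.sizes)) for t in inputs)
    return tuple((t.dtype, t.device, t.sizes, t.strides) for t in inputs)


class KernelCache:
    """Toy cache that counts how many times a 'compile' would be triggered."""

    def __init__(self, dynamic: bool) -> None:
        self.dynamic = dynamic
        self.kernels: Dict[tuple, str] = {}
        self.compile_count = 0

    def lookup_or_compile(self, inputs: Sequence[TensorInfo]) -> str:
        key = cache_key(inputs, self.dynamic)
        if key not in self.kernels:
            # Placeholder for the real compile step, which is out of scope here.
            self.compile_count += 1
            self.kernels[key] = f"kernel_{self.compile_count}"
        return self.kernels[key]


def contiguous(dtype: str, device: str, sizes: Sequence[int]) -> TensorInfo:
    """Helper that fills in row-major strides for the given sizes."""
    strides = []
    running = 1
    for size in reversed(sizes):
        strides.append(running)
        running *= size
    return TensorInfo(dtype, device, tuple(sizes), tuple(reversed(strides)))
```

With `dynamic=False`, a `(2, 3)` input and a `(4, 5)` input produce two separate entries. With `dynamic=True`, both map to the same entry, and a new entry is only needed when rank, dtype, or device changes.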

Changes:

- AOTIPythonKernelHolder: added dynamic_ member, forwarded to produce_aoti_kernel_lib
- AOTIKernelMetadata: added is_dynamic_ flag, check() uses dynamic_check() when set
- TensorMetadata::dynamic_check(): matches by dtype/device/rank, skips exact sizes
- ParameterMetadata::dynamic_check(): delegates to TensorMetadata::dynamic_check() for tensor params
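The delegation chain described in this list can be sketched in Python as follows. This is an illustrative model rather than the actual C++ classes: the field layout, the use of strings for dtype and device, and in particular the exact-comparison fallback for non-tensor parameters are assumptions, since the description only states what happens for tensor parameters.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class TensorMetadata:
    dtype: str
    device: str
    sizes: Tuple[int, ...]
    strides: Tuple[int, ...]

    def check(self, other: "TensorMetadata") -> bool:
        # Static path: every recorded field must match exactly.
        return self == other

    def dynamic_check(self, other: "TensorMetadata") -> bool:
        # Dynamic path: dtype, device, and rank must match; sizes are ignored.
        return (
            self.dtype == other.dtype
            and self.device == other.device
            and len(self.sizes) == len(other.sizes)
        )


@dataclass(frozen=True)
class ParameterMetadata:
    value: Union[TensorMetadata, int, float, bool]

    def check(self, other: "ParameterMetadata") -> bool:
        # Compare types first so that, for example, True does not match 1.
        return type(self.value) is type(other.value) and self.value == other.value

    def dynamic_check(self, other: "ParameterMetadata") -> bool:
        if isinstance(self.value, TensorMetadata) and isinstance(
            other.value, TensorMetadata
        ):
            return self.value.dynamic_check(other.value)
        # Assumption: non-tensor parameters fall back to the exact comparison.
        return self.check(other)


@dataclass
class AOTIKernelMetadata:
    is_dynamic: bool  # mirrors the is_dynamic_ flag named in the list above
    parameters: List[ParameterMetadata]

    def check(self, inputs: List[ParameterMetadata]) -> bool:
        if len(inputs) != len(self.parameters):
            return False
        pairs = zip(self.parameters, inputs)
        if self.is_dynamic:
            return all(p.dynamic_check(i) for p, i in pairs)
        return all(p.check(i) for p, i in pairs)
```

In this model, a kernel recorded for a `(2, 3)` float32 CPU tensor with `is_dynamic=True` also accepts an `(8, 16)` float32 CPU tensor, while the same metadata with `is_dynamic=False` rejects it.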

Test Plan:

buck test fbcode//caffe2/test/cpp/aoti_eager:kernel_meta_info_test

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @jataylo

pytorch-bot · 2026-02-27T21:23:26Z

✅ You can merge normally! (1 Unrelated Failure)

As of commit eadb030 with merge base 1342f81:

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

inductor / inductor-cpu-test / test (cpu_inductor_torchbench, 1, 2, linux.2xlarge.amx, unstable) (gh) (#174929)
detectron2_maskrcnn_r_50_fpn

This comment was automatically generated by Dr. CI and updates every 15 minutes.

…76018)

Summary:
Pull Request resolved: #176018

Add a `dynamic` parameter to `AOTIPythonKernelHolder` so that the C++ dispatch path can request dynamic-shape compilation from the Python compile backend. When `dynamic_=true`, the holder passes `dynamic=True` to `aoti_compile_with_persistent_cache`, and the in-memory cache uses rank/dtype/device matching instead of exact size/stride matching. This allows a single compiled kernel to serve multiple input shapes.

Changes:
- `AOTIPythonKernelHolder`: added `dynamic_` member, forwarded to `produce_aoti_kernel_lib`
- `AOTIKernelMetadata`: added `is_dynamic_` flag, `check()` uses `dynamic_check()` when set
- `TensorMetadata::dynamic_check()`: matches by dtype/device/rank, skips exact sizes
- `ParameterMetadata::dynamic_check()`: delegates to `TensorMetadata::dynamic_check()` for tensor params

Test Plan:
```
buck test fbcode//caffe2/test/cpp/aoti_eager:kernel_meta_info_test
```

Differential Revision: D94301187

meta-codesync · 2026-03-02T17:19:34Z

@StellarrZ has exported this pull request. If you are a Meta employee, you can view the originating Diff in D94301187.