🚀 The feature, motivation and pitch
Inductor has a config, `allow_buffer_reuse`, which reuses a dead tensor's memory for a new allocation if the dead tensor's size matches the number of bytes being allocated. In some cases, if a buffer is being reused during peak memory, this can increase memory usage.
We should track currently allocated and peak memory during Inductor codegen, and only reuse a buffer if doing so does not increase peak memory. We already do similar memory tracking here. See the buffer reuse logic.
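To make the proposal concrete, here is a minimal sketch of the check in plain Python. It is not Inductor's actual codegen: the event timeline and the helpers `peak_memory`, `with_reuse`, and `should_reuse` are hypothetical names invented for illustration. It also assumes a simplified model in which a reused buffer stays live from the point where it would have been freed until the new tensor is freed.

```python
# Hypothetical model, not Inductor code. An event is either
# ("alloc", name, nbytes) or ("free", name).

def peak_memory(events):
    """Return the peak number of live bytes over the timeline."""
    sizes = {}
    current = peak = 0
    for kind, name, *rest in events:
        if kind == "alloc":
            sizes[name] = rest[0]
            current += rest[0]
            peak = max(peak, current)
        else:
            current -= sizes.pop(name)
    return peak


def with_reuse(events, dead, new):
    """Rewrite the timeline so that `new` takes over the storage of `dead`."""
    out = []
    for ev in events:
        key = (ev[0], ev[1])
        if key in (("free", dead), ("alloc", new)):
            continue  # the storage of `dead` is kept and handed to `new`
        if key == ("free", new):
            ev = ("free", dead)  # the shared storage is released here
        out.append(ev)
    return out


def should_reuse(events, dead, new):
    """Allow reuse only if it does not raise the simulated peak."""
    return peak_memory(with_reuse(events, dead, new)) <= peak_memory(events)


# `a` dies before a large temporary is allocated; holding it for `b`
# keeps 100 extra bytes alive across the peak.
spiky = [
    ("alloc", "a", 100), ("free", "a"),
    ("alloc", "tmp", 300), ("free", "tmp"),
    ("alloc", "b", 100), ("free", "b"),
]

# `a` is still live during the peak anyway, so reusing it for `b` is free.
flat = [
    ("alloc", "a", 100), ("alloc", "tmp", 300),
    ("free", "tmp"), ("free", "a"),
    ("alloc", "b", 100), ("free", "b"),
]

print(should_reuse(spiky, "a", "b"))  # False
print(should_reuse(flat, "a", "b"))   # True
```

In the `spiky` timeline the simulated peak rises from 300 to 400 bytes with reuse, so the check rejects it. In `flat` the peak is 400 bytes either way, so reuse is allowed. A real implementation would need to track this incrementally during codegen rather than re-simulating the whole timeline.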
Alternatives
master
Additional context
No response
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov