Inductor: don't reuse buffer if it would increase peak memory #145883

@eellison

Description

🚀 The feature, motivation and pitch

Inductor has a config, allow_buffer_reuse, which reuses a dead tensor's storage for a new allocation when the sizes match (same number of bytes). Reuse keeps the old buffer's storage live from the point it dies until the new buffer is allocated; if peak memory occurs in that window, holding those bytes can increase peak memory usage.
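A minimal sketch of the scenario (sizes and the timeline are made up for illustration, not taken from a real Inductor trace): buffer A dies, a large temporary B defines the peak, then a same-sized buffer C is allocated. Modeling "C reuses A" as A staying live until C appears shows how reuse can raise the peak.

```python
def peak_memory(events):
    """Compute peak live bytes over a list of (op, nbytes) events,
    where op is "alloc" or "free"."""
    current = peak = 0
    for op, nbytes in events:
        current += nbytes if op == "alloc" else -nbytes
        peak = max(peak, current)
    return peak

A, B, C = 100, 500, 100  # hypothetical buffer sizes in bytes

# Without reuse: A is freed before B is live, C is freshly allocated later.
no_reuse = [("alloc", A), ("free", A), ("alloc", B), ("free", B), ("alloc", C)]

# With reuse: A's storage is held across B's lifetime so C can take it over.
reuse = [("alloc", A), ("alloc", B), ("free", B), ("free", A), ("alloc", C)]

print(peak_memory(no_reuse))  # 500
print(peak_memory(reuse))     # 600 -- reuse increased the peak
```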

We should track currently allocated and peak memory during Inductor codegen, and only reuse a buffer if doing so does not increase peak memory. We already do similar memory tracking here; see the buffer reuse logic.
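One possible shape for that tracking, sketched as a standalone class (the names `MemoryTracker`, `allocate`, `free`, and `should_reuse` are illustrative, not Inductor's actual API): maintain running current and peak byte counts, and decline reuse when holding the dead buffer's bytes would push current above the peak seen so far.

```python
class MemoryTracker:
    """Illustrative tracker of live and peak bytes during codegen."""

    def __init__(self):
        self.current = 0  # bytes currently live
        self.peak = 0     # running peak of live bytes

    def allocate(self, nbytes):
        self.current += nbytes
        self.peak = max(self.peak, self.current)

    def free(self, nbytes):
        self.current -= nbytes

    def should_reuse(self, nbytes):
        # Reusing keeps `nbytes` live past the old buffer's death.
        # Allow it only if carrying those bytes stays at or below the
        # peak observed so far, so the reuse itself cannot raise the peak.
        return self.current + nbytes <= self.peak
```

This is a conservative local check: later allocations could still raise the peak, but a reuse accepted under this rule never does so on its own at the decision point.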

Alternatives

master

Additional context

No response

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov

Metadata

Labels

internal ramp-up task (tasks that are suitable for new folks w/ high-touch guidance from senior PyTorch folks), module: inductor, oncall: pt2, triaged (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
