
[inductor] Add an AOT compilation mode for Inductor CPP backend #94822

Closed
desertfire wants to merge 8 commits into gh/desertfire/69/base from gh/desertfire/69/head


@desertfire
Contributor

@desertfire desertfire commented Feb 14, 2023

Stack from ghstack (oldest at bottom):

Summary: The AOT mode currently works for the CPP backend. When turned on, Inductor compiles the model code into a .so file with aot_inductor_entry as the entry function. If the AOT compilation fails, Inductor will explicitly fail.

cc @mlazos @soumith @voznesenskym @yanboliang @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10
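The "explicitly fail" behaviour in the summary can be pictured with a small sketch. This is not Inductor's code: `compile_shared_library`, `AOTCompileError`, and the arguments are hypothetical names, and the compiler command is supplied by the caller. The only point illustrated is that a failed build becomes an exception rather than a silent fallback.

```python
import subprocess
from pathlib import Path
from typing import Sequence


class AOTCompileError(RuntimeError):
    """Raised when the ahead-of-time build does not produce a library."""


def compile_shared_library(cmd: Sequence[str], output: Path) -> Path:
    """Run a compiler command and fail loudly if it does not succeed.

    `cmd` is the full compiler invocation (hypothetical, caller-supplied);
    `output` is the .so path that the command is expected to create.
    """
    result = subprocess.run(list(cmd), capture_output=True, text=True)
    if result.returncode != 0:
        # Surface the compiler's stderr instead of falling back silently.
        raise AOTCompileError(
            f"AOT compilation failed (exit {result.returncode}):\n{result.stderr}"
        )
    if not output.exists():
        raise AOTCompileError(f"compiler succeeded but {output} was not created")
    return output
```

A caller would pass its own compiler invocation and the expected `.so` path, and let the exception propagate if the build fails.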

@pytorch-bot

pytorch-bot bot commented Feb 14, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/94822

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 Failures

As of commit aeb7045:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@desertfire
Contributor Author

Output from running test.sh:

Turning on aten_graph for aot_inductor
[2023-02-14 15:12:29,045] torch._inductor.compile_fx: [INFO] Step 3: torchinductor compiling FORWARDS graph 0

#include "/tmp/torchinductor_binbao/cd/ccdu23xgmx4kl3rilvo5rfytlffccwsazjbxkv4urqfixqspbwj4.h"
extern "C" void kernel_cpp_0(const float* __restrict__ in_ptr0,
                       float* __restrict__ out_ptr0,
                       float* __restrict__ out_ptr1)
{
    {
        for(long i0=0; i0<512; i0+=1)
        {
            auto tmp0 = at::vec::Vectorized<float>::loadu(in_ptr0 + 16*i0);
            auto tmp1 = tmp0.sin();
            auto tmp2 = decltype(tmp1)(1)/(decltype(tmp1)(1) + tmp1.neg().exp());
            auto tmp3 = tmp0.cos();
            auto tmp4 = decltype(tmp3)(1)/(decltype(tmp3)(1) + tmp3.neg().exp());
            tmp2.store(out_ptr0 + 16*i0);
            tmp4.store(out_ptr1 + 16*i0);
        }
        #pragma omp simd simdlen(8) 
        for(long i0=8192; i0<8192; i0+=1)
        {
            auto tmp0 = in_ptr0[i0];
            auto tmp1 = std::sin(tmp0);
            auto tmp2 = std::exp(-tmp1);
            auto tmp3 = 1 / (1 + tmp2);
            auto tmp4 = std::cos(tmp0);
            auto tmp5 = std::exp(-tmp4);
            auto tmp6 = 1 / (1 + tmp5);
            out_ptr0[i0] = tmp3;
            out_ptr1[i0] = tmp6;
        }
    }
}
std::vector<at::Tensor> __aot_inductor_entry(std::vector<at::Tensor> args) {
    at::Tensor arg0_1;
    arg0_1 = args[0];
    auto buf0 = at::empty_strided({8, 4, 16, 16}, {1024, 256, 16, 1}, at::ScalarType::Float); 
    auto buf1 = at::empty_strided({8, 4, 16, 16}, {1024, 256, 16, 1}, at::ScalarType::Float); 
    kernel_cpp_0((float*)(arg0_1.data_ptr()), (float*)(buf0.data_ptr()), (float*)(buf1.data_ptr()));
    arg0_1.reset();
    return std::vector<at::Tensor>({buf0, buf1});
}

[2023-02-14 15:12:35,669] torch._inductor.codecache: [INFO] AOT-Inductor compiles code into: /scratch/binbao/work/pytorch/test/inductor/aot/build/aot_inductor_output.so
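As a reading aid for the generated kernel above: the buffers have 8·4·16·16 = 8192 elements and the vectorized loop runs 512 iterations of 16 lanes, so the scalar tail loop (`i0` from 8192 to 8192) executes zero times. The plain-Python sketch below reproduces that loop split and the per-element arithmetic. The function names are made up for illustration, and the lane width of 16 is read off the `16*i0` indexing in the log rather than taken from any Inductor API.

```python
import math
from typing import List, Tuple


def split_loop(numel: int, lanes: int) -> Tuple[range, range]:
    """Return (vector iterations, scalar tail indices) for a 1-D loop."""
    main_iters = numel // lanes
    return range(main_iters), range(main_iters * lanes, numel)


def sigmoid(x: float) -> float:
    # Same form as the generated code: 1 / (1 + exp(-x)).
    return 1.0 / (1.0 + math.exp(-x))


def reference_kernel(inp: List[float]) -> Tuple[List[float], List[float]]:
    """Scalar reference: out0 = sigmoid(sin(x)), out1 = sigmoid(cos(x))."""
    out0 = [sigmoid(math.sin(x)) for x in inp]
    out1 = [sigmoid(math.cos(x)) for x in inp]
    return out0, out1


numel = 8 * 4 * 16 * 16  # shape passed to at::empty_strided in the log
main, tail = split_loop(numel, 16)
print(len(main), len(tail))  # prints "512 0"
```

With a size that is not a multiple of the lane width, for example `split_loop(10, 4)`, the tail range is non-empty and the scalar loop would handle the leftover elements.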

desertfire added a commit that referenced this pull request Feb 14, 2023
Summary: The AOT mode currently works for the CPP backend. When turned
on, Inductor compiles the model code into a .so file with
__aot_inductor_entry as the entry function.
If the AOT compilation fails, Inductor will explicitly fail.

ghstack-source-id: 52d0af8
Pull Request resolved: #94822
@desertfire desertfire added the topic: not user facing topic category label Feb 14, 2023
Contributor

@jansel jansel left a comment

How are model weights handled by this?

@voznesenskym
Collaborator

Discussed offline a bit - but you don't need to make it part of export. You can make AOTInductor a purely additive thing.

One thing we could do with minimal changes is to leave how we produce a module or wrapper today under cpp_wrapper unchanged, and instead make this AOT path purely additive: produce a side artifact of the .so (as we do) and a .h (as we do not yet do) if a flag is set. This would allow us to button up an implementation in very few lines, at the cost of a little bit of redundant work.

Check out how we do

if self._can_use_cpp_wrapper:
    self.wrapper_code = CppWrapperCodeGen()

You can do something like

if self.aot

And provide your own AOTCodeGen alongside the other CodeGen (either composition, or add support to emit multiple codegen?). You can then use the flag to produce a pure compiled artifact as a side effect - no need to change any of the mainline flow of compilation.

Once you have that - you can just call compile_fx after export, with the aot flag enabled.
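One way to read the "purely additive" suggestion above is sketched below. Everything here is a stand-in: the classes only share names with the ones mentioned in the comment and carry none of Inductor's logic, `GraphConfig` and `select_codegens` are hypothetical, and `aot` is the hypothetical flag from `if self.aot`. The sketch takes the "emit multiple codegen" option, so the mainline choice is made exactly as before and the AOT generator is appended only when the flag is set.

```python
from dataclasses import dataclass
from typing import List


class WrapperCodeGen:
    """Stand-in for the default wrapper code generator."""
    name = "python"


class CppWrapperCodeGen(WrapperCodeGen):
    """Stand-in for the C++ wrapper code generator named in the comment."""
    name = "cpp"


class AOTCodeGen(CppWrapperCodeGen):
    """Hypothetical generator that additionally emits the .so/.h artifacts."""
    name = "aot"


@dataclass
class GraphConfig:
    can_use_cpp_wrapper: bool = False
    aot: bool = False  # hypothetical flag, as in `if self.aot`


def select_codegens(cfg: GraphConfig) -> List[WrapperCodeGen]:
    """Pick the mainline generator, then add AOT purely as a side artifact."""
    mainline = CppWrapperCodeGen() if cfg.can_use_cpp_wrapper else WrapperCodeGen()
    codegens: List[WrapperCodeGen] = [mainline]
    if cfg.aot:
        # Additive: the mainline choice above is untouched by the flag.
        codegens.append(AOTCodeGen())
    return codegens
```

The trade-off is the one the comment names: the graph is lowered by two generators when the flag is on, which costs some redundant work but leaves the existing flow unchanged.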

desertfire added a commit that referenced this pull request Feb 17, 2023

ghstack-source-id: fb078c9
Pull Request resolved: #94822
Contributor

@jansel jansel left a comment

What is here looks good to me.

Still need to solve the issue of weight handling.

Should __aot_inductor_entry be just aot_inductor_entry, or perhaps a function name passed in by the user to avoid name conflicts? If the user is intended to call it we shouldn't name it __*.
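The naming concern has a concrete basis in C++: identifiers containing a double underscore, or starting with an underscore followed by an uppercase letter, are reserved for the implementation, and names starting with an underscore are reserved at global namespace scope. If the entry name were passed in by the user, a check along the following lines could reject problematic names early. `validate_entry_name` is a hypothetical helper, not part of Inductor, and it does not check for C++ keywords.

```python
import re

_C_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def validate_entry_name(name: str) -> str:
    """Return `name` if it is usable as a user-facing C++ function name."""
    if not _C_IDENTIFIER.fullmatch(name):
        raise ValueError(f"{name!r} is not a valid identifier")
    if "__" in name or re.match(r"_[A-Z]", name):
        # C++ reserves these forms for the implementation.
        raise ValueError(f"{name!r} uses a reserved identifier form")
    if name.startswith("_"):
        # Also reserved at global namespace scope, where the entry would live.
        raise ValueError(f"{name!r} starts with an underscore")
    return name
```

Under this check `aot_inductor_entry` is accepted and `__aot_inductor_entry` is rejected.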

@desertfire
Contributor Author

I will update with refactoring after #95594 lands. Weight handling will come as the next PR.

desertfire added a commit that referenced this pull request Mar 1, 2023
Summary: The AOT mode currently works for the CPP backend. When turned
on, Inductor compiles the model code into a .so file with
aot_inductor_entry as the entry function.
If the AOT compilation fails, Inductor will explicitly fail.

ghstack-source-id: 226b18f
Pull Request resolved: #94822
@desertfire desertfire changed the title from "[WIP][inductor] Add an AOT compilation mode for Inductor" to "[inductor] Add an AOT compilation mode for Inductor CPP backend" Mar 1, 2023
@clee2000
Contributor

clee2000 commented Mar 3, 2023

Also, this PR seems to almost quadruple the time it takes to run inductor/test_torchinductor_opinfo (previously ~1 hr, now ~3.75 hr).

@desertfire desertfire closed this Mar 3, 2023
cyyever pushed a commit to cyyever/pytorch_private that referenced this pull request Mar 5, 2023
Summary: The AOT mode currently works for the CPP backend. When turned on, Inductor compiles the model code into a .so file with aot_inductor_entry as the entry function. If the AOT compilation fails, Inductor will explicitly fail.

Pull Request resolved: pytorch/pytorch#94822
Approved by: https://github.com/jansel
desertfire added a commit that referenced this pull request Mar 5, 2023
Summary: This is a reland of #94822

ghstack-source-id: 1aa9136
Pull Request resolved: #95985
pytorchmergebot pushed a commit that referenced this pull request Mar 7, 2023
Summary: This is a reland of #94822

ghstack-source-id: dc88f33
Pull Request resolved: #95985
pytorchmergebot pushed a commit that referenced this pull request Mar 8, 2023
desertfire added a commit that referenced this pull request Mar 10, 2023
Summary: This is a reland of #94822.
Solved the long compilation issue for inductor cpp tests.

[ghstack-poisoned]
desertfire added a commit that referenced this pull request Mar 10, 2023
Summary: This is a reland of #94822.
Solved the long compilation issue for inductor cpp tests.

ghstack-source-id: 9f696b4
Pull Request resolved: #96520
ydwu4 added a commit to ydwu4/pytorch that referenced this pull request Mar 13, 2023
…rch#94822)

Summary: The AOT mode currently works for the CPP backend. When turned on, Inductor compiles the model code into a .so file with aot_inductor_entry as the entry function. If the AOT compilation fails, Inductor will explicitly fail.

Pull Request resolved: pytorch#94822
Approved by: https://github.com/jansel
ydwu4 added a commit to ydwu4/pytorch that referenced this pull request Mar 13, 2023
pytorchmergebot pushed a commit that referenced this pull request Mar 14, 2023
…end (#96520)

Summary: This is a reland of #94822.
Solved the long compilation issue for inductor cpp tests.

Pull Request resolved: #96520
Approved by: https://github.com/huydhn, https://github.com/malfet
@facebook-github-bot facebook-github-bot deleted the gh/desertfire/69/head branch June 8, 2023 16:12

6 participants