[Torch2 CPU] torch._inductor.ir: [WARNING] Using FallbackKernel: aten.cumsum #93495
🐛 Describe the bug
I'm trying to compile a UniXcoder model (a BERT variant) from Hugging Face Transformers on CPU.
I'm using Python 3.8.16 and torch 2.0.0.dev20221222+cpu.
When calling model = torch.compile(model), both with the default mode and with mode="reduce-overhead", on a machine with 8 GB of RAM, I hit the error below.
Any idea how to work around it?
Thank you!
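For reference, a minimal sketch of the setup that triggers this (the real script wraps the encoder in a custom unixcoder.py module, so the checkpoint name and calling code here are illustrative assumptions, not the exact repro):

```python
# Hedged sketch of the failing setup. Assumes the standard
# microsoft/unixcoder-base checkpoint; the actual script (compile.py in the
# traceback below) uses a custom UniXcoder wrapper around this encoder.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/unixcoder-base")
model = AutoModel.from_pretrained("microsoft/unixcoder-base")

# Same failure with the default mode and with mode="reduce-overhead".
model = torch.compile(model)

inputs = tokenizer("def f(x): return x + 1", return_tensors="pt")
with torch.no_grad():
    # Compilation is triggered lazily on the first forward call; this is
    # where the inductor C++ compile step runs out of memory.
    outputs = model(**inputs)
```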
Error logs
[2022-12-22 14:41:22,477] torch._inductor.ir: [WARNING] Using FallbackKernel: aten.cumsum
[2022-12-22 14:41:35,196] torch._inductor.ir: [WARNING] Using FallbackKernel: aten.cumsum
[2022-12-22 14:41:45,469] torch._inductor.ir: [WARNING] Using FallbackKernel: aten.cumsum
[2022-12-22 14:41:56,081] torch._inductor.ir: [WARNING] Using FallbackKernel: aten.cumsum
[2022-12-22 14:42:06,499] torch._inductor.ir: [WARNING] Using FallbackKernel: aten.cumsum
[2022-12-22 14:42:19,296] torch._inductor.ir: [WARNING] Using FallbackKernel: aten.cumsum
[2022-12-22 14:42:30,450] torch._inductor.ir: [WARNING] Using FallbackKernel: aten.cumsum
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/torch/_dynamo/output_graph.py", line 676, in call_user_compiler
compiled_fn = compiler_fn(gm, self.fake_example_inputs())
File "/usr/local/lib/python3.8/dist-packages/torch/_dynamo/debug_utils.py", line 945, in debug_wrapper
compiled_gm = compiler_fn(gm, example_inputs, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/__init__.py", line 1151, in __call__
return self.compile_fn(model_, inputs_)
File "/usr/local/lib/python3.8/dist-packages/torch/_inductor/compile_fx.py", line 398, in compile_fx
return aot_autograd(
File "/usr/local/lib/python3.8/dist-packages/torch/_dynamo/optimizations/training.py", line 78, in compiler_fn
cg = aot_module_simplified(gm, example_inputs, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/_functorch/aot_autograd.py", line 2353, in aot_module_simplified
compiled_fn = create_aot_dispatcher_function(
File "/usr/local/lib/python3.8/dist-packages/torch/_dynamo/utils.py", line 90, in time_wrapper
r = func(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/_functorch/aot_autograd.py", line 2050, in create_aot_dispatcher_function
compiled_fn = compiler_fn(flat_fn, fake_flat_tensor_args, aot_config)
File "/usr/local/lib/python3.8/dist-packages/torch/_functorch/aot_autograd.py", line 1305, in aot_wrapper_dedupe
return compiler_fn(flat_fn, leaf_flat_args, aot_config)
File "/usr/local/lib/python3.8/dist-packages/torch/_functorch/aot_autograd.py", line 955, in aot_dispatch_base
compiled_fw = aot_config.fw_compiler(fw_module, flat_args)
File "/usr/local/lib/python3.8/dist-packages/torch/_dynamo/utils.py", line 90, in time_wrapper
r = func(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/_inductor/compile_fx.py", line 373, in fw_compiler
return inner_compile(
File "/usr/local/lib/python3.8/dist-packages/torch/_dynamo/debug_utils.py", line 507, in debug_wrapper
compiled_fn = compiler_fn(gm, example_inputs, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/_inductor/debug.py", line 223, in inner
return fn(*args, **kwargs)
File "/usr/lib/python3.8/contextlib.py", line 75, in inner
return func(*args, **kwds)
File "/usr/local/lib/python3.8/dist-packages/torch/_inductor/compile_fx.py", line 140, in compile_fx_inner
compiled_fn = graph.compile_to_fn()
File "/usr/local/lib/python3.8/dist-packages/torch/_inductor/graph.py", line 538, in compile_to_fn
return self.compile_to_module().call
File "/usr/local/lib/python3.8/dist-packages/torch/_dynamo/utils.py", line 90, in time_wrapper
r = func(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/_inductor/graph.py", line 527, in compile_to_module
mod = PyCodeCache.load(code)
File "/usr/local/lib/python3.8/dist-packages/torch/_inductor/codecache.py", line 461, in load
exec(code, mod.__dict__, mod.__dict__)
File "/tmp/torchinductor_root/ih/cihuzmkrufm4dzdsf7l5l6b7nhtybr7fexjtnk72btsrlnrnbtew.py", line 6242, in <module>
async_compile.wait(globals())
File "/usr/local/lib/python3.8/dist-packages/torch/_inductor/codecache.py", line 656, in wait
scope[key] = result.result()
File "/usr/lib/python3.8/concurrent/futures/_base.py", line 444, in result
return self.__get_result()
File "/usr/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
raise self._exception
File "/usr/lib/python3.8/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/_inductor/codecache.py", line 633, in task
return CppCodeCache.load(source_code).kernel
File "/usr/local/lib/python3.8/dist-packages/torch/_inductor/codecache.py", line 438, in load
subprocess.check_output(cmd, stderr=subprocess.STDOUT)
File "/usr/lib/python3.8/subprocess.py", line 415, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "/usr/lib/python3.8/subprocess.py", line 493, in run
with Popen(*popenargs, **kwargs) as process:
File "/usr/lib/python3.8/subprocess.py", line 858, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "/usr/lib/python3.8/subprocess.py", line 1639, in _execute_child
self.pid = _posixsubprocess.fork_exec(
OSError: [Errno 12] Cannot allocate memory
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "compile.py", line 596, in <module>
embeddings = get_embeddings(model, code_segments)
File "compile.py", line 582, in get_embeddings
_,code_embedding = model(source_ids)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1482, in _call_impl
return forward_call(*args, **kwargs)
File "/unixcoder.py", line 83, in forward
token_embeddings = self.model(source_ids,attention_mask = mask.unsqueeze(1) * mask.unsqueeze(2))[0]
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1482, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/_dynamo/eval_frame.py", line 83, in forward
return self.dynamo_ctx(self._orig_mod.forward)(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/_dynamo/eval_frame.py", line 212, in _fn
return fn(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/_dynamo/eval_frame.py", line 333, in catch_errors
return callback(frame, cache_size, hooks)
File "/usr/local/lib/python3.8/dist-packages/torch/_dynamo/convert_frame.py", line 480, in _convert_frame
result = inner_convert(frame, cache_size, hooks)
File "/usr/local/lib/python3.8/dist-packages/torch/_dynamo/convert_frame.py", line 103, in _fn
return fn(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/_dynamo/utils.py", line 90, in time_wrapper
r = func(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/_dynamo/convert_frame.py", line 339, in _convert_frame_assert
return _compile(
File "/usr/local/lib/python3.8/dist-packages/torch/_dynamo/convert_frame.py", line 400, in _compile
out_code = transform_code_object(code, transform)
File "/usr/local/lib/python3.8/dist-packages/torch/_dynamo/bytecode_transformation.py", line 341, in transform_code_object
transformations(instructions, code_options)
File "/usr/local/lib/python3.8/dist-packages/torch/_dynamo/convert_frame.py", line 387, in transform
tracer.run()
File "/usr/local/lib/python3.8/dist-packages/torch/_dynamo/symbolic_convert.py", line 1684, in run
super().run()
File "/usr/local/lib/python3.8/dist-packages/torch/_dynamo/symbolic_convert.py", line 538, in run
and self.step()
File "/usr/local/lib/python3.8/dist-packages/torch/_dynamo/symbolic_convert.py", line 501, in step
getattr(self, inst.opname)(inst)
File "/usr/local/lib/python3.8/dist-packages/torch/_dynamo/symbolic_convert.py", line 1750, in RETURN_VALUE
self.output.compile_subgraph(self)
File "/usr/local/lib/python3.8/dist-packages/torch/_dynamo/output_graph.py", line 553, in compile_subgraph
self.compile_and_call_fx_graph(tx, pass2.graph_output_vars(), root)
File "/usr/local/lib/python3.8/dist-packages/torch/_dynamo/output_graph.py", line 600, in compile_and_call_fx_graph
compiled_fn = self.call_user_compiler(gm)
File "/usr/local/lib/python3.8/dist-packages/torch/_dynamo/output_graph.py", line 681, in call_user_compiler
raise BackendCompilerFailed(self.compiler_fn, e) from e
torch._dynamo.exc.BackendCompilerFailed: debug_wrapper raised OSError: [Errno 12] Cannot allocate memory
Set torch._dynamo.config.verbose=True for more information
You can suppress this exception and fall back to eager by setting:
torch._dynamo.config.suppress_errors = True
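The error message above suggests a fallback I could use in the meantime; a sketch of that, plus one assumption on my part: if the OSError comes from inductor spawning many parallel C++ compile jobs, lowering `torch._inductor.config.compile_threads` might keep peak memory under 8 GB.

```python
import torch._dynamo
import torch._inductor.config

# Suppress compiler errors and fall back to eager execution,
# as the error message itself suggests.
torch._dynamo.config.suppress_errors = True

# Assumption: the fork/exec OSError is caused by several concurrent g++
# subprocesses; a single compile worker should reduce peak memory usage
# at the cost of slower compilation.
torch._inductor.config.compile_threads = 1
```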
Minified repro
No response