ID3D12CommandAllocator Error for heavy computation pipeline #2285

@haixuanTao

Description

I have written a library called wonnx that transforms deep learning models into WGPU compute pipelines: https://github.com/haixuanTao/wonnx

The library runs an MNIST model on Windows DX12, but a larger model such as SqueezeNet fails to run on DX12.
Both MNIST and SqueezeNet work on Linux with Vulkan, both in GitHub Actions and locally with an NVIDIA card.

The error I get is:

ID3D12CommandAllocator::Reset: A command allocator 0x000001C4C43FE4C0:'Unnamed ID3D12CommandAllocator Object' is being reset before previous executions associated with the allocator have completed. [ EXECUTION ERROR #552: COMMAND_ALLOCATOR_SYNC]

I have narrowed the error down to this line:

device.poll(wgpu::Maintain::Wait);

From my research, I think this error is related to the high number of compute pipelines, as SqueezeNet is 10x larger than MNIST.

I have gotten this error on a Vagrant VM and on a GitHub Actions VM, so it might be caused by virtualisation.
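
As background on the rule the error message refers to: in D3D12, a command allocator may only be reset once the GPU has finished executing the command lists recorded from it, which is usually tracked with a fence. The following is a toy model of that rule in plain Rust. `ToyQueue`, `ToyAllocator`, and `ResetError` are hypothetical names used for illustration only. They are not wgpu or D3D12 APIs, and the sketch makes no claim about how wgpu actually tracks its allocators.

```rust
// Toy model of the D3D12 rule behind COMMAND_ALLOCATOR_SYNC.
// All types here are hypothetical; none of this is wgpu or D3D12 API.

/// Simulated GPU queue: `submitted` counts submissions handed to the GPU,
/// `completed` is the fence value the GPU has reached so far.
#[derive(Debug, Default)]
struct ToyQueue {
    submitted: u64,
    completed: u64,
}

/// Simulated command allocator that remembers the fence value of the
/// last submission recorded from it.
#[derive(Debug, Default)]
struct ToyAllocator {
    last_submission: Option<u64>,
}

#[derive(Debug, PartialEq)]
enum ResetError {
    /// Mirrors the situation reported as COMMAND_ALLOCATOR_SYNC.
    StillInFlight { needed: u64, completed: u64 },
}

impl ToyQueue {
    /// Submit work recorded from `alloc`; returns the fence value that
    /// is reached when that work finishes.
    fn submit(&mut self, alloc: &mut ToyAllocator) -> u64 {
        self.submitted += 1;
        alloc.last_submission = Some(self.submitted);
        self.submitted
    }

    /// Pretend the GPU finished everything up to `fence_value`
    /// (the analogue of blocking on a fence).
    fn wait_for(&mut self, fence_value: u64) {
        self.completed = self.completed.max(fence_value.min(self.submitted));
    }
}

impl ToyAllocator {
    /// Reset is only legal once the queue has passed the allocator's
    /// last submission.
    fn reset(&mut self, queue: &ToyQueue) -> Result<(), ResetError> {
        match self.last_submission {
            Some(needed) if needed > queue.completed => Err(ResetError::StillInFlight {
                needed,
                completed: queue.completed,
            }),
            _ => {
                self.last_submission = None;
                Ok(())
            }
        }
    }
}

fn main() {
    let mut queue = ToyQueue::default();
    let mut alloc = ToyAllocator::default();

    let fence_value = queue.submit(&mut alloc);
    // Resetting before the wait is the invalid case.
    println!("early reset: {:?}", alloc.reset(&queue));

    queue.wait_for(fence_value);
    // After the wait, the reset succeeds.
    println!("reset after wait: {:?}", alloc.reset(&queue));
}
```

In this model the early reset is rejected because the allocator's last submission has not completed yet, which is the condition the D3D12 debug layer reports in the message above.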

Repro steps
To reproduce the error, you can clone my repo:

SETX RUST_LOG debug
git clone https://github.com/haixuanTao/wonnx
cd wonnx
git checkout 71e25a47f5ed831fa96499b77084424188b2e35d
cargo run --example squeeze

You can also run the tests, which are expected to pass:

cargo test

You can also check my GitHub Actions run here: https://github.com/haixuanTao/wonnx/actions/runs/1569686479. It runs the tests on both Linux x86 and Windows x86.

Expected vs observed behavior
If this were an implementation problem in my library, I would expect both MNIST and SqueezeNet to fail on Windows. Instead, only SqueezeNet fails.

Extra materials

time: pre_run: 24.2054ms                                                                                                
[2021-12-12T18:56:15Z INFO  wgpu_core::device] Created buffer Valid((53, 2, Dx12)) with BufferDescriptor { label: Some("staging_squeezenet0_flatten0_reshape0"), size: 4000, usage: MAP_READ | COPY_DST, mapped_at_creation: false }            
time: run: 159.6157ms                                                                                                   
time: run: 200.1022ms                                                                                                   
[2021-12-12T18:56:20Z ERROR wgpu_hal::dx12::instance] ID3D12CommandAllocator::Reset: A command allocator 0x000002035A627380:'Unnamed ID3D12CommandAllocator Object' is being reset before previous executions associated with the allocator have completed. [ EXECUTION ERROR #552: COMMAND_ALLOCATOR_SYNC]                                                             
[2021-12-12T18:56:20Z WARN  wgpu_hal::dx12::instance] Process is terminating. Using simple reporting. Please call ReportLiveObjects() at runtime for standard reporting.                                                                        
[2021-12-12T18:56:20Z WARN  wgpu_hal::dx12::instance] Live Producer at 0x00000203498D5A98, Refcount: 330.               
[2021-12-12T18:56:20Z WARN  wgpu_hal::dx12::instance]   Live Object at 0x0000020349916220, Refcount: 0.                 
[2021-12-12T18:56:20Z WARN  wgpu_hal::dx12::instance]   Live Object at 0x0000020349F3B600, Refcount: 0.                 
[2021-12-12T18:56:20Z WARN  wgpu_hal::dx12::instance]   Live Object at 0x0000020349F787F0, Refcount: 0.                 
[2021-12-12T18:56:20Z WARN  wgpu_hal::dx12::instance]   Live Object at 0x0000020349F792F0, Refcount: 0.                 
[2021-12-12T18:56:20Z 
.......
wgpu_hal::dx12::instance]   Live Object at 0x000002034A048A00, Refcount: 0.                 

[2021-12-12T18:56:20Z WARN  wgpu_hal::dx12::instance] Live                         Object :      8
error: process didn't exit successfully: `target\debug\examples\squeeze.exe` (exit code: 1)

Platform
The vagrant VM I am using is the following: https://github.com/nbigaouette/windows_vagrant_rustv

UPDATE: I have now removed the test from my CI so that I can continue development.
