Hi, first of all, thanks for the great job. I spent a lot of time trying to optimise the recognition model with the C# OnnxRuntime version, and execution on GPU was still slow; with Paddle Inference it is insanely fast. I used your library and everything works perfectly on Windows, with unmanaged memory consumption around 3 GB per instance. However, I experienced a serious memory leak on Linux with the same code.
Environment:
Ubuntu 18.04
OpenCV 4.6.0 (OpenCvSharpExtern)
cuDNN 8.4
CUDA 11.6
Paddle Inference lib: https://paddle-inference-lib.bj.bcebos.com/2.3.2/cxx_c/Linux/GPU/x86-64_gcc8.2_avx_mkl_cuda11.6_cudnn8.4.0-trt8.4.0.6/paddle_inference_c.tgz
Everything runs inside Docker with the image nvidia/cuda:11.6.0-cudnn8-runtime-ubuntu18.04.
I had to spend some time getting Paddle Inference to work; it turned out I had to create soft links for libcudnn.so and libcublas.so in /usr/lib in order for it to run.
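For reference, this is roughly what the workaround looked like. The exact library paths and version suffixes are assumptions based on the default locations in the nvidia/cuda runtime image; adjust them to wherever cuDNN and cuBLAS actually live in your container.

```shell
# Paddle Inference dlopens the unversioned .so names, which the runtime
# image does not ship. Create soft links so they resolve.
# (Paths/versions below are assumptions; verify with `find / -name 'libcudnn.so*'`.)
ln -s /usr/lib/x86_64-linux-gnu/libcudnn.so.8 /usr/lib/libcudnn.so
ln -s /usr/local/cuda-11.6/targets/x86_64-linux/lib/libcublas.so.11 /usr/lib/libcublas.so

# Refresh the dynamic linker cache so the new links are picked up.
ldconfig
```

An alternative that avoids touching /usr/lib is to point LD_LIBRARY_PATH at the directories containing the versioned libraries, but the soft-link approach is what worked here.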
Now the case: everything works as fast as on Windows, but after hundreds of iterations it had consumed the entire 64 GB of memory plus 32 GB of swap, and crashed with an out-of-memory error.
Do you have an idea where the issue lies? I read the sources and checked the pinned objects, and it seems to me everything should be fine; all pinned objects are released.