I'm compiling the latest OpenCV with OpenVINO 2021.1 (the same issue occurs with at least 2020.4). When net.forward() is called for the first time with a new input size, it can take up to 40 s, whereas regular inference takes less than 0.2 s.
Subsequent forward passes with that input size are fast, since OpenVINO "caches" whatever it generated.
This only happens on Intel GPUs; everything is fine on CPUs.
My model is a CNN (think U-Net), and I can share it by email if needed.
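
For reference, here is a minimal sketch of how the first-call versus later-call timing can be measured. The helper names (`time_calls`, `make_fake_forward`) are hypothetical and not part of OpenCV or OpenVINO, and the stand-in "network" only simulates a one-time setup cost per input size with `time.sleep`. The commented lines indicate how a real `cv2.dnn` network might be plugged in, assuming an OpenCV build with the Inference Engine backend.

```python
import time


def time_calls(run_inference, input_sizes, repeats=3):
    """Call run_inference(size) `repeats` times per size and record each duration in seconds."""
    timings = {}
    for size in input_sizes:
        durations = []
        for _ in range(repeats):
            start = time.perf_counter()
            run_inference(size)
            durations.append(time.perf_counter() - start)
        timings[size] = durations
    return timings


def make_fake_forward(setup_cost=0.05, inference_cost=0.001):
    """Build a stand-in for the real network (an assumption for illustration only).

    The first call for each input size pays a simulated one-time setup cost;
    later calls with the same size only pay the simulated inference cost.
    """
    seen_sizes = set()

    def fake_forward(size):
        if size not in seen_sizes:
            time.sleep(setup_cost)  # simulated one-time cost for a new input size
            seen_sizes.add(size)
        time.sleep(inference_cost)  # simulated regular inference

    return fake_forward


# With a real network, run_inference could look roughly like this (untested sketch):
#   net.setPreferableBackend(cv2.dnn.DNN_BACKEND_INFERENCE_ENGINE)
#   net.setPreferableTarget(cv2.dnn.DNN_TARGET_OPENCL)  # Intel GPU
#   def run_inference(size):
#       net.setInput(cv2.dnn.blobFromImage(image, size=size))
#       net.forward()

if __name__ == "__main__":
    results = time_calls(make_fake_forward(), [(256, 256), (512, 512)])
    for size, durations in results.items():
        print(size, ["%.3f s" % d for d in durations])
```

Comparing the first entry of each list against the rest separates the one-time cost for a new input size from the steady-state inference time.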