Skip to content

G-API: oneVPL Add VPP IE preproc#21363

Closed
sivanov-work wants to merge 18 commits intoopencv:4.xfrom
sivanov-work:vpp_preproc
Closed

G-API: oneVPL Add VPP IE preproc#21363
sivanov-work wants to merge 18 commits intoopencv:4.xfrom
sivanov-work:vpp_preproc

Conversation

@sivanov-work
Copy link
Copy Markdown
Contributor

@sivanov-work sivanov-work commented Dec 29, 2021

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or other license that is incompatible with OpenCV
  • The PR is proposed to proper branch
  • There is reference to original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

Build Configuration

force_builders=XCustom,Custom Win,Custom Mac
build_gapi_standalone:Linux x64=ade-0.1.1f
build_gapi_standalone:Win64=ade-0.1.1f
build_gapi_standalone:Mac=ade-0.1.1f
build_gapi_standalone:Linux x64 Debug=ade-0.1.1f

Xbuild_image:Custom=centos:7
Xbuildworker:Custom=linux-1
build_gapi_standalone:Custom=ade-0.1.1f

build_image:Custom=ubuntu-openvino-2021.3.0:20.04
Xbuild_image:Custom Win=openvino-2021.2.0
build_image:Custom Mac=openvino-2021.2.0

test_modules:Custom=gapi,python2,python3,java
test_modules:Custom Win=gapi,python2,python3,java
test_modules:Custom Mac=gapi,python2,python3,java

buildworker:Custom=linux-1
test_opencl:Custom=OFF
test_bigdata:Custom=1
test_filter:Custom=*

build_image:Custom Win=gapi-onevpl-2021.6.0
buildworker:Custom Win=windows-3
build_contrib:Custom Win=OFF

@sivanov-work
Copy link
Copy Markdown
Contributor Author

@sivanov-work rebase on #21232 as soon as merged

@dmatveev dmatveev self-assigned this Jan 20, 2022
@dmatveev dmatveev added this to the 4.6.0 milestone Jan 20, 2022
@sivanov-work sivanov-work mentioned this pull request Jan 24, 2022
6 tasks
@sivanov-work sivanov-work marked this pull request as ready for review January 26, 2022 13:43
Copy link
Copy Markdown
Contributor

@mpashchenkov mpashchenkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll look again later.

out_keep_alive_frame);
*out_keep_alive_frame = frame;
}
} // oherwise it is not suitable frame, then check on other preproc backend or rely on IE plugin
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oherwise -> otherwise

"MFX_ERR_REALLOC_SURFACE is not processed");
break;
case MFX_WRN_IN_EXECUTION:
/*try {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for debugging purposes and should be removed

GAPI_LOG_DEBUG(nullptr, "VPP preprocessing for decoded frame will be used");

auto inputs = ctx.uu.net.getInputsInfo();
const auto &input_name = ctx.uu.params.input_names.at(0);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name can be sent from Infer::run() function. But it will only be used here.

And 0->i, but you know it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

incorrect in generic case:

I. InferROI:
1 is for blob, 0 - for input

 IE::Blob::Ptr this_blob =
                            extractBlob(*ctx, 1, cv::gapi::ie::TraitAs::IMAGE,
                                        slot_ptr);
                        setROIBlob(req,
                                   *(ctx->uu.params.input_names.begin()),
                                   this_blob, this_roi, *ctx);

II. The same problem in InferList
III. the more complex in InferList2

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in retrospective vision: if was needed to modify setBlob instead extractBlob
I'll try to make some changes in setBlob and revert extractBlob changes


GSource::Priv::Priv() :
mfx_handle(MFXLoad()),
// mfx_handle(MFXLoad()),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to do not forget uncomment it after VPL releases their bug fix

MFXDispReleaseImplDescription(mfx_handle, mfx_impl_description);
GAPI_LOG_INFO(nullptr, "Unload MFX handle: " << mfx_handle);
MFXUnload(mfx_handle);
//MFXUnload(mfx_handle);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

M B it will be uncommented in the future.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

truth

GAPI_Assert(ctx.inShape(i) == cv::GShape::GFRAME &&
"Remote blob is supported for MediaFrame only");
cv::MediaFrame frame = ctx.inFrame(i);
#ifdef HAVE_ONEVPL
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try remove defines

if(HAVE_GAPI_ONEVPL)
# NB: TARGET opencv_perf_gapi doesn't exist before `ocv_add_perf_tests`
if(TARGET opencv_perf_gapi)
target_include_directories(opencv_perf_gapi PRIVATE "${CMAKE_CURRENT_LIST_DIR}/src")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why it's needed?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

default include path doen't provide search for gapi/src/ path. but it require to turn on perf tests preprocengine with vpl source together

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you want to test from gapi/src ?
Honestly we do the same for opencv_test_gapi with HUGE comment, but it's better to avoid if it's possible

std::declval<std::tuple<resize_t>>()));
class OneVPLSourcePerf_PP_Test : public TestPerfParams<source_description_preproc_t> {};

PERF_TEST_P_(OneVPLSourcePerf_PP_Test, TestPerformance)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does it measure? resize+crop?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

crop - means extract roi from frame
why you asked?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just didn't get what this test actually do

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's performance test: collect absolute metrics for decode + preproc execution. It is used in comparison with perf test for decode only


cv::util::optional<pp_params> is_applicable(const cv::MediaFrame& in_frame) override;

pp_session initialize_preproc(const pp_params& params,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this preproc be unbind from OV API?

  • OV2.0 doesn't have InputInfo
  • It allows to get rid from linkage with OV in perf tests.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to implement custom wrap over InputInfo which would fully represent it functionality
But as you said that there OV2.0 is coming - it looks redundant work for me to spend some time to realize how this wrapper would be like.
Instead i would wait for OV2.0 outcome (when it officially released) and merge InputInfo and OV2.0 eventually together.
@TolyaTalamanov do you agree?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope that it will be unbind from IE specific classes, let's discuss this open offline with team

ocv_target_include_directories(opencv_perf_gapi SYSTEM PRIVATE ${OPENCL_INCLUDE_DIRS})
endif()

if(OPENCV_GAPI_INF_ENGINE)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely unnecessary to link perf tests with IE

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

preproc uses IE network info to set up its parameters.

@TolyaTalamanov Have you any ideas how to overcome that in simplest way?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably don't build sample at all. @smirnov-alexey do you remember how to properly handle it?

#include <opencv2/gapi/streaming/cap.hpp>

#ifdef HAVE_INF_ENGINE
#include <inference_engine.hpp>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's exactly needed from inference_engine.hpp ? I guess InputsInfo, then see comments below

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it is

if (out_keep_alive_frame != nullptr) {
GAPI_LOG_DEBUG(nullptr, "remember preprocessed frame to keep it busy from reuse, slot: " <<
out_keep_alive_frame);
*out_keep_alive_frame = frame;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm, but you assigned it to frame which is already located ctx, so it shouldn't be destroyed, should it?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please correct me if i didn't understand and and confusing you with my answer:
I allocate empty MediaFrame as a slot in ctx beforehand in req.execute because i must retain frame per-request in general way
Here, i reuse slot with emptyMediaFrame and put my new preprocessed frame in this slot by copying. It is fully allowable because MediaFrame contains shared pointer on its adapter which ensure guarantee that MediaAdapter would not be destroyed until last link in slot is alive

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point was, why do you need extra out_keep_alive_frame if you already assigned new pre-processed frame to context ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

context may serve multiple requests in general way: but it require to distinguish which request must destroy particular frame. Thus new additional pair of function were introduces to satisfy "minimal code changes" purposes

#ifdef HAVE_ONEVPL
#ifdef HAVE_DIRECTX
#ifdef HAVE_D3D11
if (ctx.uu.prepoc_engine_impl) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition of enabling preprocessing isn't clear again.

Preprocessing should be enabled in case:

  • User specified GPU OV plugin
  • Frame came from VPL Source (has VPL adapter)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually it is

User specified GPU OV plugin

see https://github.com/opencv/opencv/pull/21363/files#diff-437e5530bca341dae63e89a5b62027702b1b4ee3d954e99d490b24a6127a2350R253
for creation. VA_DEVICE -- means GPU plugin

and

Frame came from VPL Source (has VPL adapter)

Applicable condition prepoc_engine_impl->is_applicable(frame)

Copy link
Copy Markdown
Contributor

@TolyaTalamanov TolyaTalamanov Feb 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for creation. VA_DEVICE -- means GPU plugin

It should something more explicit

Applicable condition prepoc_engine_impl->is_applicable(frame)

But firstly we have to make sure that frame comes with appropriate media adapter

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should something more explicit

My aim in implementing such functionality was "minimal user query" and "maximum automatization"
If user had passed param VA_DEVICE already then it was clear purpose to use such device in InferRemoteCtx and prepare PP context

But firstly we have to make sure that frame comes with appropriate media adapter

is_applicable makes checking out on appropriate media adapter and decides: is that MediaFrame could be preprocessed by existing PP engine or not

std::size_t i,
cv::gapi::ie::TraitAs hint) {
cv::gapi::ie::TraitAs hint,
cv::MediaFrame* out_keep_alive_frame = nullptr) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why it's needed as separate function parameter? You can put frame into context

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can put frame into context

i need key which identifying request which i must hold frame. Because finally I got PostCallback which identifies request but not frame to deallocate

To simplify & improve logic we must refactore each part of existing extracBlob and it's neighborhoods. But current implementation carries out minimal changes approach and implement fast-fix solution

{"VA_DEVICE", accel_device_ptr} });

{"VA_DEVICE", accel_device_ptr},
/*TODO turn on auto vpp preproc*/ {"GAPI_DEVICE_CTX", accel_ctx_ptr} });
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

InferenceEngine::ParamMap shouldn't contain information not related with IE plugins such as GAPI_DEVICE_CTX

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree,
It just a prototyped solution. There were discussion to use these pointer as a part of cv::gapi::ie::Params and planned as this PR refactoring
@TolyaTalamanov Do you agree?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It must discussed offline with team

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previous discussion finished with:

cv::gapi::ie::Params add method cfg_device_contet(device_ptr, context_ptr) to hold on device & contrxt ptr, which should be used in giebackend as remote ctx & preproc engine creation

rctx = ie_core.CreateContext(params.device_id, *ctx_params);

// Guess additional parameters is enough for creating preproc engine
auto dev_param_it = ctx_params->find("VA_DEVICE");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's IE::ParamMap content, why it's handled in our code???

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IE::ParamMap is intended to pass specific device into OV to make a remote context
Because we need in specific device too - i REused it here too to avoid produce new redundant entities

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's discuss it offline as well

@TolyaTalamanov
Copy link
Copy Markdown
Contributor

Open questions

  • When VPL preprocessing should be enabled?
    AT: User set onevpl::GSource and specified GPU plugin for cv::gapi::ie::Params.
  • What is needed for instance preprocessing engine?
  • Which backends/platforms should it cover?
  • How it looks like for user?

acceleration_policy->init(mfx_vpp_session);
try {
// ask to allocate external memory pool
mfxFrameAllocRequest vppRequests[2];
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have lost thread. Why is 2?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is MFX/VPL API: MFXVideoVPP_QueryIOSurf returns array of 2 elements

it is possible to add naming variable...

throw std::runtime_error("Cannot execute MFXVideoVPP_QueryIOSurf");
}

// NB: Assing ID as upper limit descendant to distinguish specific VPP allocation
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assing -->Assign

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻

// The decoder detected a new sequence header in the bitstream.
// Video parameters may have changed.
// In external memory allocation case, might need to reallocate the output surface
/*GAPI_DbgAssert(false && "VPLLegacyDecodeEngine::process_error - "
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be uncommented?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(in decode_engine_legacy.cpp too)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't know how to process this WRN because it looks like doesn't affect on anything and happens spontaneously in the beginning of stream
I suppose we need to add logger here and make it falls through

mfxSession mfx_vpp_session {nullptr};
sts = MFXCloneSession(mfx_decode_session, &mfx_vpp_session);
ASSERT_EQ(MFX_ERR_NONE, sts);
*/
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it work?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be deleted

}
} while (MFX_ERR_NONE == sess.last_status && !my_sess.vpp_out_queue.empty());
return ExecutionStatus::Continue;
},
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This "Wait for ASYNC decode result" lambda is big and also common with VPLLegacyDecodeEngine - should it be shared?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idea is the same but it's different is particular places:
different queue, different surfaces and different on_frame_ready

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, OK..
I believe the differences can be handled by introducing some using redefines (like session_type you did in PreprocEngine) - and referring to the different types and methods via this->session_type and this->on_frame_ready.
So, as this can be LegacyDecodeEngine or PreprocEngine, the lambda will automatically switch between usings and methods.

I still think it is useful to share the code between classes, for better understanding and sometimes easier maintenance.
But maybe it's just not for now :)
Can this refactoring be planned for the future?

// The decoder detected a new sequence header in the bitstream.
// Video parameters may have changed.
// In external memory allocation case, might need to reallocate the output surface
/*GAPI_DbgAssert(false && "VPLLegacyDecodeEngine::process_error - "
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(in decode_engine_legacy.cpp too)

@sivanov-work
Copy link
Copy Markdown
Contributor Author

It was decided to separate the PR on two parts:
-PP core: #21618
-and further integration part

current PR would be closed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants