
Split up layer dispatch of Tensorflow importer into map of functions#20190

Merged
opencv-pushbot merged 1 commit into opencv:3.4 from rogday:tf_importer_ref on Jun 11, 2021
Conversation

@rogday
Member

@rogday rogday commented Jun 1, 2021

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or other license that is incompatible with OpenCV
  • The PR is proposed to proper branch
  • There is reference to original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@rogday rogday force-pushed the tf_importer_ref branch 3 times, most recently from 23a039a to c0c9dec Compare June 1, 2021 14:56
@asmorkalov asmorkalov requested review from alalek and dkurt June 2, 2021 11:02
@@ -34,6 +34,8 @@ CV__DNN_EXPERIMENTAL_NS_BEGIN

#if HAVE_PROTOBUF

#define CALL_MEMBER_FN(object,ptrToMember) ((object).*(ptrToMember))
Member

This macro is used once. Let's remove it and use the code directly.
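For illustration, invoking a pointer-to-member directly (a minimal sketch with hypothetical names, not the actual importer code):

```cpp
#include <string>

struct Importer {
    std::string parseConv() { return "conv"; }
};

// Direct invocation through a pointer-to-member; no CALL_MEMBER_FN macro needed.
inline std::string callDirect(Importer& imp) {
    std::string (Importer::*fn)() = &Importer::parseConv;
    return (imp.*fn)();  // same as CALL_MEMBER_FN(imp, fn)()
}
```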

static DispatchMap dispatch;
static DispatchMap initDispatch();

void convolution (tensorflow::GraphDef& net, tensorflow::NodeDef& layer, LayerParams& layerParams);
Member

tensorflow::GraphDef& net

It makes sense to remove this parameter and use the field from the class directly (through this) instead.

Member Author

In the ONNX importer there is a diagnostic run, and if it is enabled, another net is used instead, so I left the parameter here to be able to do the same.

Member

What is the problem with changing the field value for that case?

Member Author

protected:
Net& dstNet;
Net utilNet;

The Net field is a reference itself, so we could make a self-referential struct, but the reference would become dangling when the object is moved.
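To illustrate the concern, a toy sketch (not OpenCV code) of a self-referential struct: copying it copies the reference, which keeps pointing at the original object's member; a move would leave it dangling the same way once the source is destroyed.

```cpp
struct Holder {
    int value;
    int& ref;  // self-referential: bound to our own member
    Holder() : value(0), ref(value) {}
    // Copying copies the *reference*, which still refers to the source's member.
    Holder(const Holder& other) : value(other.value), ref(other.ref) {}
};
```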

DispatchMap;

static DispatchMap dispatch;
static DispatchMap initDispatch();
Member

Avoid global fields which require initialization. Allocate/initialize on-demand once.

static DispatchMap& getDispatchMap();

with

DispatchMap& TFImporter::getDispatchMap()
{
    static DispatchMap g_dispatchMap;
    if (g_dispatchMap.empty())
    {
        AutoLock ...;
        if (g_dispatchMap.empty())
        {
            ...
        }
    }
    return g_dispatchMap;
}

Comment on lines +599 to +598
std::string name = layer.name();
std::string type = layer.op();
int num_inputs = layer.input_size();
Member

BTW, many handlers have such "prologue" code.


void TFImporter::parseNode(const tensorflow::NodeDef& layer_)
{
tensorflow::NodeDef layer = layer_;
Member

Probably this "copy" is not needed anymore.

@vpisarev
Contributor

vpisarev commented Jun 8, 2021

@rogday, thank you for the patch.

one general question and one general request.

the question. Could you please list the reasons why this refactoring was done? I mean, we discussed it briefly during the core team meeting and we found some pros and cons of such an approach:

pros:

  • parsing each layer's info is more atomic
  • maybe parsers for new layers can be added externally (even though you explicitly use pointers to methods rather than functions).
  • the speed is a bit higher because an efficient O(1) or O(logN) dictionary lookup is used instead of a linear search across the supported layers. However, disk I/O is probably the main bottleneck that limits import speed.

cons:

  • there is more code
  • each and every method uses the same prologue, which can be avoided in the case of a single if-elseif cascade. More generally, if some layers are similar, they can share some parts, whereas in the new approach all this code needs to be duplicated. Together with the code duplication comes the related problem of keeping updates consistent: if we need to add or change some common preprocessing in layer parsing, we'd need to replicate those changes in every single method.

Maybe I missed something. If so, can you please share your thoughts?

the request. If we decide to keep this structure, I think the methods need to be renamed to include some common prefix, e.g. parse. That is, instead of "concat", the method should be named "parseConcat". This will help avoid possible name conflicts if the parser API ever needs to be extended with functionality whose name would clash with a layer name.

@vpisarev vpisarev self-requested a review June 8, 2021 13:39
@rogday
Member Author

rogday commented Jun 8, 2021

@vpisarev, thank you for the review.

This refactoring is done because I plan to implement a TF importer diagnostic similar to the ONNX one. In the latter, each layer is duplicated in a registered set (which is used for diagnostic purposes) and in the if-elseif cascade, which makes it easy to forget to add a layer everywhere. The code is also more structured this way and can be navigated more easily. The prologue can be inlined since it's just introducing constant references most of the time, and it is pretty short. If layers share common parts, those parts can be turned into functions themselves, or it can be done just like in the convolution function, which branches based on the type of the layer.
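The dispatch-map idea under discussion can be sketched roughly like this (hypothetical names; the real importer maps TensorFlow op types to TFImporter member functions):

```cpp
#include <map>
#include <string>

// Toy importer dispatching node parsing through a map of member-function
// pointers instead of an if-elseif cascade over the op type.
class TinyImporter {
public:
    typedef void (TinyImporter::*NodeParser)(const std::string& name);
    typedef std::map<std::string, NodeParser> DispatchMap;

    std::string lastParsed;

    bool parseNode(const std::string& type, const std::string& name) {
        const DispatchMap& d = getDispatch();
        DispatchMap::const_iterator it = d.find(type);
        if (it == d.end())
            return false;  // unsupported layer type: fall back or report
        (this->*(it->second))(name);
        return true;
    }

private:
    void parseConvolution(const std::string& name) { lastParsed = "conv:" + name; }
    void parseConcat(const std::string& name) { lastParsed = "concat:" + name; }

    static const DispatchMap& getDispatch() {
        // Thread-safe initialization in C++11; the C++98 3.4 branch
        // needed the extra care discussed later in this thread.
        static DispatchMap d = buildDispatch();
        return d;
    }
    static DispatchMap buildDispatch() {
        DispatchMap d;
        d["Conv2D"]   = &TinyImporter::parseConvolution;
        d["ConcatV2"] = &TinyImporter::parseConcat;
        return d;
    }
};
```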

@vpisarev
Contributor

vpisarev commented Jun 9, 2021

@rogday, thank you. That's a good enough explanation for me.

@vpisarev vpisarev self-requested a review June 9, 2021 07:53
Contributor

@vpisarev vpisarev left a comment

please, rename the methods, add "parse" or some other prefix

@rogday rogday force-pushed the tf_importer_ref branch 2 times, most recently from b1a1d40 to 0106a9c Compare June 9, 2021 13:51
@vpisarev vpisarev self-requested a review June 9, 2021 14:42
Contributor

@vpisarev vpisarev left a comment

👍

ReadTFNetParamsFromTextFileOrDie(config, &netTxt);
cv::AutoLock lock(getInitializationMutex());
if (instance == NULL)
instance = new Mutex();
Member Author

@rogday rogday Jun 9, 2021

Borrowed from dnn.cpp. I'm not sure this is 100% safe; see https://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf, page 8. Mutex has an int refcounter which is set in the constructor, and those writes might get reordered.

@vpisarev vpisarev self-requested a review June 10, 2021 14:07
@vpisarev vpisarev requested a review from alalek June 10, 2021 14:08
cv::AutoLock lock(getDispatchMapMutex());
if (dispatch == NULL)
{
dispatch = &getDispatchMapImpl();
Member

Why do we need both getDispatchMap/getDispatchMapImpl?

static bool initialized = false;
static DispatchMap dispatch;
if (!initialized)
{
    AutoLock lock(getInitializationMutex());
    if (initialized)
        return dispatch;
    dispatch["Conv2D"] = ...
    ...
    initialized = true;
}
return dispatch;

Member Author

I might be wrong, but I think the compiler is free to reorder dispatch[layerName] = func with initialized = true, hence the Impl function. Also, initialized should be atomic, and the default constructor of dispatch can be run by two threads at once. I think my solution is not perfect either, since the dispatch pointer isn't atomic.

Member

@alalek alalek Jun 10, 2021

the default constructor of dispatch can be run by two threads at once

C++11 doesn't have that problem. But unfortunately this patch targets 3.4 with C++98. So yes, this is a problem.

We can use this (I will clean up the C++98 code during the merge into the master branch):

{
    static bool initialized = false;
#ifdef CV_CXX11
    static DispatchMap dispatch;
#else
    static DispatchMap* pDispatch = NULL;
#endif
    if (!initialized)
    {
        AutoLock lock(getInitializationMutex());
#ifndef CV_CXX11
        static DispatchMap dispatch;
        pDispatch = &dispatch;
#endif
        if (initialized)
            return dispatch;
        dispatch["Conv2D"] = ...
        ...
        initialized = true;
    }
#ifdef CV_CXX11
    return dispatch;
#else
    return *pDispatch;
#endif
}

Member Author

There is no guarantee that pDispatch is not NULL when initialized is true, because memory may be partially out of sync across threads. I'd like to force pDispatch to be updated when initialized is read, but that is only possible via the release-acquire semantics of atomic loads and stores. It means that initialized has to be an equivalent of atomic<bool>, which we don't have. Another way is to lock the mutex prior to reading initialized.
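For reference, the release-acquire publication described here is expressible with C++11 atomics; a sketch with stand-in names, not applicable to the C++98 3.4 branch:

```cpp
#include <atomic>
#include <map>
#include <mutex>
#include <string>

typedef std::map<std::string, int> DispatchMap;  // stand-in for the real map type

const DispatchMap& getDispatchMap() {
    static std::atomic<DispatchMap*> ptr(nullptr);
    static std::mutex mtx;
    // Acquire load: a non-null pointer guarantees we also observe the
    // fully constructed map published by the releasing store below.
    DispatchMap* p = ptr.load(std::memory_order_acquire);
    if (!p) {
        std::lock_guard<std::mutex> lock(mtx);
        p = ptr.load(std::memory_order_relaxed);  // re-check under the lock
        if (!p) {
            p = new DispatchMap();   // intentionally leaked singleton
            (*p)["Conv2D"]   = 1;    // hypothetical entries
            (*p)["ConcatV2"] = 2;
            ptr.store(p, std::memory_order_release);  // publish
        }
    }
    return *p;
}
```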

Member

OK, what about simplifying the C++11 code path?

 const TFImporter::DispatchMap& TFImporter::getDispatchMap()
 {
+#ifdef CV_CXX11
+    static DispatchMap& dispatch = getDispatchMapImpl();
+    return dispatch;
+#else
     static const TFImporter::DispatchMap* volatile dispatch = NULL;
     if (dispatch == NULL)
     {
         cv::AutoLock lock(getDispatchMapMutex());
         if (dispatch == NULL)
         {
             dispatch = &getDispatchMapImpl();
         }
     }
     return *dispatch;
+#endif
 }

@@ -510,2051 +510,2301 @@ class TFImporter

private:
void addPermuteLayer(const int* order, const std::string& permName, Pin& inpId);

typedef std::map<std::string, void (TFImporter::*)(tensorflow::GraphDef&, const tensorflow::NodeDef&, LayerParams&)>
DispatchMap;
Member

typedef void (TFImporter::*TFImporterNodeParser)(tensorflow::GraphDef&, const tensorflow::NodeDef&, LayerParams&);
typedef std::map<std::string, TFImporterNodeParser> DispatchMap;

};

TFImporter::TFImporter(Net& net, const char *model, const char *config)
: dstNet(net)
Mutex& getDispatchMapMutex()
Member

Please reuse Mutex& getInitializationMutex(); instead (defined in dnn_common.hpp, already included)

@rogday rogday force-pushed the tf_importer_ref branch from 0106a9c to 7ee1816 Compare June 11, 2021 10:22
@vpisarev
Contributor

@alalek, together with @rogday we briefly discussed this PR during the meeting. How about creating the map of layer names and parsing functions dynamically, each time a model is loaded? The overhead is tiny both in terms of memory (probably just a few KB) and time (I guess well below 1 ms). Such a solution would completely eliminate the need for a mutex and the worry about proper deallocation of the map at the end.
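The suggestion can be sketched as follows (hypothetical names): the map becomes a plain data member filled in the constructor, so each import owns its own copy and no mutex or static lifetime management is needed.

```cpp
#include <map>
#include <string>

class PerInstanceImporter {
public:
    typedef void (PerInstanceImporter::*NodeParser)();

    PerInstanceImporter() {
        // Built once per import; a few map insertions are negligible next to
        // model I/O, and no synchronization or static deallocation is needed.
        dispatch["Conv2D"]   = &PerInstanceImporter::parseConvolution;
        dispatch["ConcatV2"] = &PerInstanceImporter::parseConcat;
    }

    bool supports(const std::string& type) const {
        return dispatch.find(type) != dispatch.end();
    }

private:
    void parseConvolution() {}
    void parseConcat() {}
    std::map<std::string, NodeParser> dispatch;
};
```

Custom handlers could then be registered on a particular importer instance without affecting any other import, which matches the flexibility alalek mentions in his reply.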

@alalek
Member

alalek commented Jun 11, 2021

This makes sense. It would also help to configure possible custom handlers in a flexible way.

@opencv-pushbot opencv-pushbot merged commit c1adbe3 into opencv:3.4 Jun 11, 2021
@alalek alalek mentioned this pull request Jun 19, 2021
@rogday rogday deleted the tf_importer_ref branch October 7, 2021 13:15
@alalek alalek mentioned this pull request Oct 15, 2021