Optimizations for precision conversion operations in nGraph reference implementations#3974
Conversation
GlebKazantaev
left a comment
Transformations part looks good.
e1552e8 to
2e1c6b1
Compare
for (size_t i = 0; i < size; ++i) {
    dst_data[i] = convert_value<src_type, dst_type>(src_data[i]);
}
Why can't I always use the reference implementation?
I thought the reference implementation was supposed to support any precision.
The behavior of the two convert operations is slightly different: the nGraph convert operation does not check upper and lower bounds. In my cases, the int8->fp16 and fp16->fp32 conversions, I can use the nGraph implementation as-is, because the target type's value range is wider than the source type's.
Maybe we should align the conversion operations in nGraph and the transformations library.
@@ -0,0 +1,273 @@
//*****************************************************************************
// Copyright 2017-2020 Intel Corporation
Please fix this in the other files as well.
Data precision conversion operations take up a significant part of the load time: on average, the ConvertPrecision transformation pass takes 34% of load time for FP16 models. This fix significantly reduces the time of the int8->fp16 and fp16->fp32 conversions.