modules/objdetect/src/qrcode.cpp
{
    result_info += qr_code_data.payload[i];
}
if (qr_code_data.data_type == QUIRC_DATA_TYPE_BYTE && !checkUTF8(result_info)) {
The data type check should go before the first loop, on line 2801.
Do we really need checkUTF8? Which test cases fail without it?
The QR code from #23728 was created in Byte mode, but its byte sequence is not valid UTF-8 (the generator probably just stored a raw byte array of the Unicode string):
QR code content (qr_code_data.payload):
83, 80, 67, 13, 10, 48, 50, 48, 48, 13, 10, 49, 13, 10, 67, 72, 48, 52, 51, 48, 48, 48, 53, 50, 51, 48, 50, 50, 50, 52, 52, 57, 48, 49, 72, 13, 10, 83, 13, 10, 69, 109, 105, 108, 32, 70, 114, 101, 121, 32, 66, 101, 116, 114, 105, 101, 98, 115, 32, 65, 71, 13, 10, 66, 97, 104, 110, 104, 111, 102, 115, 116, 114, 97, 115, 115, 101, 32, 49, 55, 13, 10, 13, 10, 53, 55, 52, 53, 13, 10, 83, 97, 102, 101, 110, 119, 105, 108, 13, 10, 67, 72, 13, 10, 13, 10, 13, 10, 13, 10, 13, 10, 13, 10, 13, 10, 13, 10, 51, 50, 50, 56, 46, 53, 48, 13, 10, 67, 72, 70, 13, 10, 83, 13, 10, 83, 105, 120, 116, 32, 114, 101, 110, 116, 32, 97, 32, 67, 97, 114, 32, 65, 71, 32, 13, 10, 77, 252, 108, 108, 104, 101, 105, 109, 115, 116, 114, 97, 115, 115, 101, 32, 49, 57, 53, 13, 10, 13, 10, 52, 48, 53, 55, 13, 10, 66, 97, 115, 101, 108, 13, 10, 67, 72, 13, 10, 81, 82, 82, 13, 10, 50, 54, 55, 50, 55, 52, 48, 51, 53, 56, 49, 48, 49, 48, 52, 56, 51, 48, 48, 48, 57, 54, 51, 57, 52, 51, 48, 13, 10, 75, 100, 110, 114, 32, 57, 54, 51, 57, 52, 51, 44, 32, 48, 51, 53, 56, 49, 45, 48, 49, 48, 52, 56, 51, 48, 13, 10, 69, 80, 68,
Byte array of the text, encoded as ISO-8859-1:
text = u"""
SPC
0200
1
CH043000523022244901H
S
Emil Frey Betriebs AG
Bahnhofstrasse 17
5745
Safenwil
CH
3228.50
CHF
S
Sixt rent a Car AG
Müllheimstrasse 195
4057
Basel
CH
QRR
267274035810104830009639430
Kdnr 963943, 03581-0104830
EPD
"""
print([int(v) for v in bytearray(text.encode('ISO-8859-1'))])
[83, 80, 67, 10, 48, 50, 48, 48, 10, 49, 10, 67, 72, 48, 52, 51, 48, 48, 48, 53, 50, 51, 48, 50, 50, 50, 52, 52, 57, 48, 49, 72, 10, 83, 10, 69, 109, 105, 108, 32, 70, 114, 101, 121, 32, 66, 101, 116, 114, 105, 101, 98, 115, 32, 65, 71, 10, 66, 97, 104, 110, 104, 111, 102, 115, 116, 114, 97, 115, 115, 101, 32, 49, 55, 10, 10, 53, 55, 52, 53, 10, 83, 97, 102, 101, 110, 119, 105, 108, 10, 67, 72, 10, 10, 10, 10, 10, 10, 10, 10, 51, 50, 50, 56, 46, 53, 48, 10, 67, 72, 70, 10, 83, 10, 83, 105, 120, 116, 32, 114, 101, 110, 116, 32, 97, 32, 67, 97, 114, 32, 65, 71, 10, 77, 252, 108, 108, 104, 101, 105, 109, 115, 116, 114, 97, 115, 115, 101, 32, 49, 57, 53, 10, 10, 52, 48, 53, 55, 10, 66, 97, 115, 101, 108, 10, 67, 72, 10, 81, 82, 82, 10, 50, 54, 55, 50, 55, 52, 48, 51, 53, 56, 49, 48, 49, 48, 52, 56, 51, 48, 48, 48, 57, 54, 51, 57, 52, 51, 48, 10, 75, 100, 110, 114, 32, 57, 54, 51, 57, 52, 51, 44, 32, 48, 51, 53, 56, 49, 45, 48, 49, 48, 52, 56, 51, 48, 10, 69, 80, 68, 10]
However, the UTF-8 byte array is different: `ü` becomes the two bytes 195, 188 instead of the single byte 252:
print([int(v) for v in bytearray(text.encode('UTF-8'))])
[83, 80, 67, 10, 48, 50, 48, 48, 10, 49, 10, 67, 72, 48, 52, 51, 48, 48, 48, 53, 50, 51, 48, 50, 50, 50, 52, 52, 57, 48, 49, 72, 10, 83, 10, 69, 109, 105, 108, 32, 70, 114, 101, 121, 32, 66, 101, 116, 114, 105, 101, 98, 115, 32, 65, 71, 10, 66, 97, 104, 110, 104, 111, 102, 115, 116, 114, 97, 115, 115, 101, 32, 49, 55, 10, 10, 53, 55, 52, 53, 10, 83, 97, 102, 101, 110, 119, 105, 108, 10, 67, 72, 10, 10, 10, 10, 10, 10, 10, 10, 51, 50, 50, 56, 46, 53, 48, 10, 67, 72, 70, 10, 83, 10, 83, 105, 120, 116, 32, 114, 101, 110, 116, 32, 97, 32, 67, 97, 114, 32, 65, 71, 10, 77, 195, 188, 108, 108, 104, 101, 105, 109, 115, 116, 114, 97, 115, 115, 101, 32, 49, 57, 53, 10, 10, 52, 48, 53, 55, 10, 66, 97, 115, 101, 108, 10, 67, 72, 10, 81, 82, 82, 10, 50, 54, 55, 50, 55, 52, 48, 51, 53, 56, 49, 48, 49, 48, 52, 56, 51, 48, 48, 48, 57, 54, 51, 57, 52, 51, 48, 10, 75, 100, 110, 114, 32, 57, 54, 51, 57, 52, 51, 44, 32, 48, 51, 53, 56, 49, 45, 48, 49, 48, 52, 56, 51, 48, 10, 69, 80, 68, 10]
The ISO standard states that storing a raw byte array is generally fine, but the choice of encoding is left to the user. The alternative is to create the QR code in ECI mode, which records the encoding standard in the symbol, but it seems that not all generators offer it:
In closed-system national or application-specific implementations of QR Code, an alternative 8-bit character set, for example as defined in an appropriate part of ISO/IEC 8859, may be specified for Byte mode. When an alternative character set is specified, however, the parties intending to read the QR Code 2005 symbols require to be notified of the applicable character set in the application specification or by bilateral agreement.
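To make the mismatch concrete, here is a minimal Python sketch of the fallback discussed in this thread: keep the payload if it is already valid UTF-8, otherwise treat each byte as an ISO-8859-1 code point and re-encode it. The helper names `is_valid_utf8` and `payload_to_utf8` are hypothetical and are not the C++ code in the patch; the ISO-8859-1 fallback is an assumption taken from the byte arrays above.

```python
def is_valid_utf8(payload: bytes) -> bool:
    """Return True if the payload decodes cleanly as UTF-8."""
    try:
        payload.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def payload_to_utf8(payload: bytes) -> bytes:
    """Hypothetical fallback: assume ISO-8859-1 when the bytes are not UTF-8."""
    if is_valid_utf8(payload):
        return payload
    # Every byte value maps to exactly one code point in ISO-8859-1,
    # so this decode step cannot fail.
    return payload.decode("iso-8859-1").encode("utf-8")


# "Muell" with u-umlaut as stored in the QR code: byte 252 is the umlaut.
raw = bytes([77, 252, 108, 108])
print(list(payload_to_utf8(raw)))  # [77, 195, 188, 108, 108]
```

Note that this heuristic can misfire: some ISO-8859-1 byte sequences happen to be valid UTF-8 as well, which is one reason explicit encoding metadata is preferable.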
According to our docstring, OpenCV should return the result in UTF-8 format:
@opencv-alalek, perhaps I misunderstood the question. Do you mean: can we apply the encoding right in the loop, without the checkUTF8 method?
Without checkUTF8, the following tests fail:
[ FAILED ] Objdetect_QRCode.regression/24, where GetParam() = "russian.jpg"
[ FAILED ] Objdetect_QRCode.regression/25, where GetParam() = "kanji.jpg"
[ FAILED ] Objdetect_QRCode_Multi.regression/6, where GetParam() = ("4_qrcodes.png", "contours_based")
[ FAILED ] Objdetect_QRCode_Multi.regression/7, where GetParam() = ("4_qrcodes.png", "aruco_based")
[ FAILED ] Objdetect_QRCode_Multi.regression/8, where GetParam() = ("5_qrcodes.png", "contours_based")
[ FAILED ] Objdetect_QRCode_Multi.regression/12, where GetParam() = ("7_qrcodes.png", "contours_based")
[ FAILED ] Objdetect_QRCode_Multi.regression/13, where GetParam() = ("7_qrcodes.png", "aruco_based")
Code for proper handling of the data type and ECI information is completely missing.
Details: https://en.wikipedia.org/wiki/Extended_Channel_Interpretation
Unfortunately, it sometimes requires code-page maps.
P.S. Kanji is not handled properly (UTF-8 conversion is still required).
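As a rough illustration of what ECI-aware handling could look like, the Python sketch below maps a few ECI assignment numbers to codecs and decodes the payload accordingly. The table `ECI_TO_CODEC` and the function `decode_with_eci` are hypothetical names, only a handful of assignment numbers are listed, and this is not how the OpenCV C++ code is structured.

```python
# A few ECI assignment numbers mapped to Python codec names (not exhaustive).
ECI_TO_CODEC = {
    3: "iso-8859-1",
    20: "shift_jis",
    25: "utf-16-be",
    26: "utf-8",
}


def decode_with_eci(payload: bytes, eci: int) -> str:
    """Decode a QR payload using the codec selected by its ECI number."""
    codec = ECI_TO_CODEC.get(eci)
    if codec is None:
        raise ValueError(f"unsupported ECI assignment number: {eci}")
    return payload.decode(codec)


# The same ISO-8859-1 bytes as in the #23728 payload, now decoded explicitly.
text = decode_with_eci(bytes([77, 252, 108, 108]), 3)
print(list(text.encode("utf-8")))  # [77, 195, 188, 108, 108]
```

Python ships the code-page tables with its codecs; a C++ implementation would need to carry its own maps, which is the cost mentioned above.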
Agreed, I also wanted to take a look at the Kanji test later.
opencv-alalek
left a comment
The API should be extended to return metadata (ECI) for decoded streams.
    }
    result_info.assign((const char*)qr_code_data.payload, qr_code_data.payload_len);
} else if (qr_code_data.eci == 25/*ECI_UTF_16BE*/) {
    CV_LOG_INFO(NULL, "QR: UTF-16BE ECI is not supported");
I propose making it CV_LOG_WARNING. INFO is not printed in regular builds.
We should not spam the log with that message. The QR detector is usually called for each frame.
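One way to reconcile both concerns is to warn only the first time the condition is hit. The Python sketch below shows that pattern in isolation; `warn_once` and the key string are hypothetical, and it only illustrates the idea, not OpenCV's logging macros.

```python
import logging

logger = logging.getLogger("qr_sketch")
_already_warned = set()


def warn_once(key: str, message: str) -> bool:
    """Log `message` at WARNING level only the first time `key` is seen."""
    if key in _already_warned:
        return False
    _already_warned.add(key)
    logger.warning(message)
    return True


# Simulate the detector being called on three consecutive frames:
# the warning is emitted for the first frame only.
for _frame in range(3):
    warn_once("eci-utf16be", "QR: UTF-16BE ECI is not supported")
```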
Encode QR code data to UTF-8 opencv#24350

### Pull Request Readiness Checklist

**Merge with extra**: opencv/opencv_extra#1105

resolves opencv#23728

This is the first PR in a series. Here we just return raw Unicode. Later I will try to expand the QR code decoding methods to use the ECI assignment number and return a string with the proper encoding, not only UTF-8 or raw Unicode.

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
Consider QRCode ECI encoding #24426

### Pull Request Readiness Checklist

related: #24350 (review)

1. Add `getEncoding` method to obtain ECI number
2. Add `detectAndDecodeBytes`, `decodeBytes`, `decodeBytesMulti`, `detectAndDecodeBytesMulti` methods in Python (return `bytes`) and Java (return `byte[]`)
3. Allow Python bytes to std::string conversion in general and add `encode(byte[] encoded_info, Mat qrcode)` in Java

Python example with Kanji encoding:
```python
img = cv.imread("test.png")
detect = cv.QRCodeDetector()
data, points, straight_qrcode = detect.detectAndDecodeBytes(img)
print(data)
print(detect.getEncoding(), cv.QRCodeEncoder_ECI_SHIFT_JIS)
print(data.decode("shift-jis"))
```
```
b'\x82\xb1\x82\xf1\x82\xc9\x82\xbf\x82\xcd\x90\xa2\x8aE'
20 20
こんにちは世界
```
source: https://github.com/opencv/opencv/blob/ba4d6c859d21536f84e0328c16f4cc3e96bf3065/modules/objdetect/test/test_qrcode_encode.cpp#L332

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake