𝗗𝗮𝘆-𝟰𝟴𝟲 𝗖𝗼𝗺𝗽𝘂𝘁𝗲𝗿 𝗩𝗶𝘀𝗶𝗼𝗻 𝗟𝗲𝗮𝗿𝗻𝗶𝗻𝗴

C3-STISR: Scene Text Image Super-resolution with Triple Clues, by Shanghai Key Lab of Intelligent Information Processing, China

Follow me for similar posts: Ashish Patel
-------------------------------------------------------------------
𝗜𝗻𝘁𝗲𝗿𝗲𝘀𝘁𝗶𝗻𝗴 𝗙𝗮𝗰𝘁𝘀 :
🔸 This paper was published at #IJCAI 2022.
-------------------------------------------------------------------
𝗜𝗠𝗣𝗢𝗥𝗧𝗔𝗡𝗖𝗘
👉 Scene text image super-resolution (STISR) is regarded as an important pre-processing task for recognizing text in low-resolution scene text images.
👉 Most recent approaches use the recognizer's feedback as a clue to guide super-resolution.
👉 However, using the recognizer's feedback directly has two problems:
👉 1) Compatibility. The feedback is a probability distribution, so there is an obvious modal gap between it and STISR, which is a pixel-level task.
👉 2) Inaccuracy. The feedback usually contains wrong information, which misleads the main task and degrades super-resolution performance.
👉 The paper presents C3-STISR, a method that jointly exploits the recognizer's feedback, visual information, and linguistical information as clues to guide super-resolution.
👉 The visual clue comes from images of the text predicted by the recognizer, which is informative and more compatible with the STISR task.
👉 The linguistical clue is generated by a pre-trained character-level language model, which is able to correct the predicted text.
👉 The authors design extraction and fusion mechanisms for the triple cross-modal clues to generate comprehensive and unified guidance for super-resolution.
👉 Extensive experiments on TextZoom show that C3-STISR outperforms the SOTA methods in fidelity and recognition performance.

#computervision #artificialintelligence #deeplearning #flexible
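To make the clue idea a bit more concrete, here is a toy Python sketch of how a recognizer's probability output could be corrected by a character-level language prior and then turned into a pixel-level image. Everything in it is an assumption made for illustration: the four-letter alphabet, the 3x3 glyphs, the bigram scores, and the function names `decode`, `linguistic_clue`, and `visual_clue` are hypothetical and do not come from the paper. The learned extraction and fusion modules of C3-STISR are not reproduced here.

```python
# Toy illustration only: the alphabet, glyphs, and bigram scores below are
# made-up assumptions, not values or components from the C3-STISR paper.

ALPHABET = "acot"

# Hypothetical 3x3 binary glyphs standing in for a real text renderer.
GLYPHS = {
    "a": [[0, 1, 0], [1, 1, 1], [1, 0, 1]],
    "c": [[1, 1, 1], [1, 0, 0], [1, 1, 1]],
    "o": [[1, 1, 1], [1, 0, 1], [1, 1, 1]],
    "t": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
}

# Hypothetical character-level bigram scores; unseen pairs get a small default.
BIGRAM = {("c", "a"): 0.9, ("a", "t"): 0.9, ("c", "o"): 0.5, ("o", "t"): 0.3}


def decode(probs):
    """Greedy decoding: pick the most likely character at each position."""
    return "".join(ALPHABET[max(range(len(p)), key=p.__getitem__)] for p in probs)


def linguistic_clue(probs, default=0.1):
    """Reweight each position by a bigram prior given the previous choice."""
    corrected = [list(probs[0])]
    prev = decode(probs[:1])
    for p in probs[1:]:
        weighted = [p_i * BIGRAM.get((prev, ch), default) for p_i, ch in zip(p, ALPHABET)]
        total = sum(weighted)
        corrected.append([w / total for w in weighted])
        prev = decode(corrected[-1:])
    return corrected


def visual_clue(text):
    """Render text as a binary image by placing glyphs side by side."""
    return [[px for ch in text for px in GLYPHS[ch][row]] for row in range(3)]


# Recognizer feedback for a blurry "cat": the second position wrongly favours "o".
recognizer_probs = [
    [0.05, 0.85, 0.05, 0.05],  # "c" is clearly ahead
    [0.40, 0.05, 0.50, 0.05],  # "o" is slightly ahead of "a"
    [0.05, 0.05, 0.05, 0.85],  # "t" is clearly ahead
]

raw_text = decode(recognizer_probs)                      # recognizer alone
fixed_text = decode(linguistic_clue(recognizer_probs))   # after the language prior
clue_image = visual_clue(fixed_text)                     # pixel-level clue

print(raw_text, fixed_text)
for row in clue_image:
    print(row)
```

In this toy setup the language prior flips the ambiguous second character, and the rendered image lives in pixel space like the super-resolution target, which is the compatibility argument made for the visual clue above.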
Wow Ashish. This seems significant for the future of improving OCR - right?