#ijcai #computervision #artificialintelligence #deeplearning #flexible

𝗗𝗮𝘆-𝟰𝟴𝟲 𝗖𝗼𝗺𝗽𝘂𝘁𝗲𝗿 𝗩𝗶𝘀𝗶𝗼𝗻 𝗟𝗲𝗮𝗿𝗻𝗶𝗻𝗴

C3-STISR: Scene Text Image Super-resolution with Triple Clues
by Shanghai Key Lab of Intelligent Information Processing, China

Follow me for similar posts: Ashish Patel
-------------------------------------------------------------------
𝗜𝗻𝘁𝗲𝗿𝗲𝘀𝘁𝗶𝗻𝗴 𝗙𝗮𝗰𝘁𝘀 :
🔸 This paper was published at #IJCAI 2022.
-------------------------------------------------------------------
𝗜𝗠𝗣𝗢𝗥𝗧𝗔𝗡𝗖𝗘
👉 Scene text image super-resolution (STISR) is regarded as an important pre-processing task for recognizing text in low-resolution scene text images.
👉 Most recent approaches use the recognizer's feedback as a clue to guide super-resolution.
👉 However, using the recognizer's output directly has two problems:
👉 1) Compatibility. It takes the form of a probability distribution, which has an obvious modal gap with STISR, a pixel-level task.
👉 2) Inaccuracy. It usually contains wrong information, which misleads the main task and degrades super-resolution performance.
👉 The paper presents a novel method, C3-STISR, that jointly exploits the recognizer's feedback, visual information, and linguistic information as clues to guide super-resolution.
👉 The visual clue comes from images of the texts predicted by the recognizer, which are informative and more compatible with the STISR task.
👉 The linguistic clue is generated by a pre-trained character-level language model, which is able to correct the predicted texts.
👉 The authors design extraction and fusion mechanisms for the triple cross-modal clues to generate comprehensive and unified guidance for super-resolution.
👉 Extensive experiments on TextZoom show that C3-STISR outperforms state-of-the-art (SOTA) methods in fidelity and recognition performance.
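To make the fusion idea concrete, here is a minimal sketch of combining three pixel-aligned clue maps into one guidance map with softmax weights. This is an illustrative assumption, not the paper's actual fusion module: the names `softmax` and `fuse_clues`, the per-clue scores, and the toy 1x2 maps are all hypothetical, and the post does not describe how C3-STISR extracts or fuses its clues.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_clues(clues, scores):
    """Weighted sum of equally shaped 2-D clue maps (lists of rows).

    Hypothetical helper: `clues` holds one map per clue (for example
    recognition, visual, linguistic) and `scores` holds one raw
    importance score per clue.
    """
    if len(clues) != len(scores):
        raise ValueError("need exactly one score per clue map")
    height, width = len(clues[0]), len(clues[0][0])
    for clue in clues:
        if len(clue) != height or any(len(row) != width for row in clue):
            raise ValueError("all clue maps must share one shape")
    weights = softmax(scores)
    return [
        [sum(w * clue[r][c] for w, clue in zip(weights, clues))
         for c in range(width)]
        for r in range(height)
    ]


if __name__ == "__main__":
    recognition = [[1.0, 2.0]]  # toy 1x2 maps, made-up values
    visual = [[3.0, 4.0]]
    linguistic = [[5.0, 6.0]]
    # Equal scores give equal weights, so the result is the plain average.
    print(fuse_clues([recognition, visual, linguistic], [0.0, 0.0, 0.0]))
```

In a real model the scores would be learned and the maps would be feature tensors; the sketch only shows that clues must first share a pixel-aligned shape before they can be merged, which is the compatibility problem the post points out.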

3 Comments

Thom Ives, Ph.D. 3y

Wow Ashish. This seems significant for the future of improving OCR - right?

