{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:07:43Z","timestamp":1760144863623,"version":"build-2065373602"},"reference-count":45,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2024,5,28]],"date-time":"2024-05-28T00:00:00Z","timestamp":1716854400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100003725","name":"National Research Foundation of Korea (NRF)","doi-asserted-by":"publisher","award":["2020R1I1A3060675"],"award-info":[{"award-number":["2020R1I1A3060675"]}],"id":[{"id":"10.13039\/501100003725","id-type":"DOI","asserted-by":"publisher"}]},{"name":"KOREATECH","award":["2020R1I1A3060675"],"award-info":[{"award-number":["2020R1I1A3060675"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>If we visit famous and iconic landmarks, we may want to take a photo of them. However, such sites are usually crowded, and taking photos with only landmarks without people could be challenging. This paper aims to automatically remove people in a picture and produce a natural image of the landmark alone. To this end, it presents Thanos, a system to generate authentic human-removed images in crowded places. It is designed to produce high-quality images with reasonable computation cost using short video clips of a few seconds. For this purpose, a multi-frame-based recovery region minimization method is proposed. The key idea is to aggregate information partially available from multiple image frames to minimize the area to be restored. The evaluation result presents that the proposed method outperforms alternatives; it shows lower Fr\u00e9chet Inception Distance (FID) scores with comparable processing latency. It is also shown that the images by Thanos achieve a lower FID score than those of existing applications; Thanos\u2019s score is 242.8, while those by Retouch-photos and Samsung object eraser are 249.4 and 271.2, respectively.<\/jats:p>","DOI":"10.3390\/s24113486","type":"journal-article","created":{"date-parts":[[2024,5,28]],"date-time":"2024-05-28T13:32:55Z","timestamp":1716903175000},"page":"3486","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Towards Generating Authentic Human-Removed Pictures in Crowded Places Using a Few-Second Video"],"prefix":"10.3390","volume":"24","author":[{"given":"Juhwan","family":"Lee","sequence":"first","affiliation":[{"name":"Lululab Inc., Seoul 06054, Republic of Korea"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-2762-4137","authenticated-orcid":false,"given":"Euihyeok","family":"Lee","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Graduate School, Korea University of Technology and Education, Cheonan 31253, Republic of Korea"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5278-2911","authenticated-orcid":false,"given":"Seungwoo","family":"Kang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Korea University of Technology and Education, Cheonan 31253, Republic of Korea"}]}],"member":"1968","published-online":{"date-parts":[[2024,5,28]]},"reference":[{"key":"ref_1","unstructured":"(2024, April 05). Retouch-Photos. Available online: https:\/\/play.google.com\/store\/apps\/details?id=royaln.Removeunwantedcontent."},{"key":"ref_2","unstructured":"(2024, April 05). Spectre Camera. Available online: https:\/\/spectre.cam\/."},{"key":"ref_3","unstructured":"(2024, April 05). Samsung Object Eraser. Available online: https:\/\/www.samsung.com\/latin_en\/support\/mobile-devices\/how-to-remove-unwanted-objects-from-photos-on-your-galaxy-phone\/."},{"key":"ref_4","unstructured":"Lee, J. (2021). Deep Learning Based Human Removal and Background Synthesis Application. [Master\u2019s Thesis, Korea University of Technology and Education]."},{"key":"ref_5","unstructured":"Pitaksarit, S. (2016). Diminished Reality Based on Texture Reprojection of Backgrounds, Segmented with Deep Learning. [Master\u2019s Thesis, Nara Institute of Science and Technology]."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"101693","DOI":"10.1016\/j.media.2020.101693","article-title":"Embracing imperfect datasets: A review of deep learning solutions for medical image segmentation","volume":"63","author":"Tajbakhsh","year":"2020","journal-title":"Med. Image Anal."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"102028","DOI":"10.1016\/j.displa.2021.102028","article-title":"Image inpainting based on deep learning: A review","volume":"69","author":"Qin","year":"2021","journal-title":"Displays"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"109046","DOI":"10.1016\/j.patcog.2022.109046","article-title":"Deep learning for image inpainting: A survey","volume":"134","author":"Xiang","year":"2023","journal-title":"Pattern Recognit."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"103147","DOI":"10.1016\/j.cviu.2020.103147","article-title":"A comprehensive review of past and present image inpainting methods","volume":"203","author":"Jam","year":"2021","journal-title":"Comput. Vis. Image Underst."},{"key":"ref_10","unstructured":"Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., and Hochreiter, S. (2017, January 4\u20139). Gans trained by a two time-scale update rule converge to a local nash equilibrium. Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_11","unstructured":"Shetty, R.R., Fritz, M., and Schiele, B. (, January 3\u20138). Adversarial scene editing: Automatic object removal from weak supervision. Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montr\u00e9al, QC, Canada."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Dhamo, H., Farshad, A., Laina, I., Navab, N., Hager, G.D., Tombari, F., and Rupprecht, C. (2020, January 14\u201319). Semantic image manipulation using scene graphs. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Virtual.","DOI":"10.1109\/CVPR42600.2020.00526"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"44276","DOI":"10.1109\/ACCESS.2020.2977386","article-title":"A novel GAN-based network for unmasking of masked face","volume":"8","author":"Din","year":"2020","journal-title":"IEEE Access"},{"key":"ref_14","unstructured":"Hosen, M., and Islam, M. (2022). HiMFR: A Hybrid Masked Face Recognition Through Face Inpainting. arXiv."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Sola, S., and Gera, D. (2023, January 17\u201324). Unmasking Your Expression: Expression-Conditioned GAN for Masked Face Inpainting. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada.","DOI":"10.1109\/CVPRW59228.2023.00628"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1145\/3065386","article-title":"Imagenet classification with deep convolutional neural networks","volume":"60","author":"Krizhevsky","year":"2017","journal-title":"Commun. ACM"},{"key":"ref_17","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 23\u201328). Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2017, January 22\u201329). Mask r-cnn. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Liu, Z., Luo, P., Wang, X., and Tang, X. (2015, January 7\u201313). Deep learning face attributes in the wild. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.425"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1452","DOI":"10.1109\/TPAMI.2017.2723009","article-title":"Places: A 10 million image database for scene recognition","volume":"40","author":"Zhou","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_22","first-page":"12077","article-title":"SegFormer: Simple and efficient design for semantic segmentation with transformers","volume":"34","author":"Xie","year":"2021","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Cheng, B., Misra, I., Schwing, A., Kirillov, A., and Girdhar, R. (2022, January 18\u201324). Masked-attention mask transformer for universal image segmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00135"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Li, F., Zhang, H., Xu, H., Liu, S., Zhang, L., Ni, L., and Shum, H. (2023, January 17\u201324). Mask dino: Towards a unified transformer-based framework for object detection and segmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.00297"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Jain, J., Li, J., Chiu, M., Hassani, A., Orlov, N., and Shi, H. (2023, January 17\u201324). Oneformer: One transformer to rule universal image segmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.00292"},{"key":"ref_26","first-page":"6030","article-title":"MedSegDiff-V2: Diffusion-Based Medical Image Segmentation with Transformer","volume":"38","author":"Wu","year":"2024","journal-title":"Proc. AAAI Conf. Artif. Intell."},{"key":"ref_27","first-page":"819","article-title":"Learning content-enhanced mask transformer for domain generalized urban-scene segmentation","volume":"38","author":"Bi","year":"2024","journal-title":"AAAI Conf. Artif. Intell."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1080\/10867651.2004.10487596","article-title":"An image inpainting technique based on the fast marching method","volume":"9","author":"Telea","year":"2004","journal-title":"J. Graph. Tools"},{"key":"ref_29","unstructured":"Bertalmio, M., Bertozzi, A.L., and Sapiro, G. (2001, January 8\u201314). Navier-stokes, fluid dynamics, and image and video inpainting. Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Kauai, HI, USA."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1145\/3422622","article-title":"Generative adversarial networks","volume":"63","author":"Goodfellow","year":"2020","journal-title":"Commun. ACM"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., and Huang, T.S. (2018, January 18\u201323). Generative image inpainting with contextual attention. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00577"},{"key":"ref_32","unstructured":"Nazeri, K., Ng, E., Joseph, T., Qureshi, F.Z., and Ebrahimi, M. (2019). Edgeconnect: Generative image inpainting with adversarial edge learning. arXiv."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Zheng, H., Lin, Z., Lu, J., Cohen, S., Shechtman, E., Barnes, C., Zhang, J., Xu, N., Amirghodsi, S., and Luo, J. (2022). CM-GAN: Image Inpainting with Cascaded Modulation GAN and Object-Aware Training. arXiv.","DOI":"10.1007\/978-3-031-19787-1_16"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Li, W., Lin, Z., Zhou, K., Qi, L., Wang, Y., and Jia, J. (2022, January 18\u201324). Mat: Mask-aware transformer for large hole image inpainting. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01049"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Shamsolmoali, P., Zareapoor, M., and Granger, E. (2023, January 1\u20136). TransInpaint: Transformer-based Image Inpainting with Context Adaptation. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Paris, France.","DOI":"10.1109\/ICCVW60793.2023.00092"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Ko, K., and Kim, C. (2023, January 2\u20133). Continuously masked transformer for image inpainting. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Paris, France.","DOI":"10.1109\/ICCV51070.2023.01211"},{"key":"ref_37","first-page":"6021","article-title":"SyFormer: Structure-Guided Synergism Transformer for Large-Portion Image Inpainting","volume":"38","author":"Wu","year":"2024","journal-title":"Proc. AAAI Conf. Artif. Intell."},{"key":"ref_38","first-page":"6180","article-title":"WaveFormer: Wavelet Transformer for Noise-Robust Video Inpainting","volume":"38","author":"Wu","year":"2024","journal-title":"Proc. AAAI Conf. Artif. Intell."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Rublee, E., Rabaud, V., Konolige, K., and Bradski, G. (2011, January 6\u201313). ORB: An efficient alternative to SIFT or SURF. Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Spain.","DOI":"10.1109\/ICCV.2011.6126544"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Bay, H., Tuytelaars, T., and Van Gool, L. (2006, January 7\u201313). Surf: Speeded up robust features. Proceedings of the European Conference on Computer Vision, Graz, Austria.","DOI":"10.1007\/11744023_32"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","article-title":"Distinctive image features from scale-invariant keypoints","volume":"60","author":"Lowe","year":"2004","journal-title":"Int. J. Comput. Vis."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Adjabi, I., Ouahabi, A., Benzaoui, A., and Taleb-Ahmed, A. (2020). Past, present, and future of face recognition: A review. Electronics, 9.","DOI":"10.20944\/preprints202007.0479.v1"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"252","DOI":"10.1109\/TNNLS.2020.2978501","article-title":"Pepsi++: Fast and lightweight network for image inpainting","volume":"32","author":"Shin","year":"2020","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"1815","DOI":"10.1007\/s13042-023-01999-z","article-title":"GCAM: Lightweight image inpainting via group convolution and attention mechanism","volume":"15","author":"Chen","year":"2024","journal-title":"Int. J. Mach. Learn. Cybern."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Drolia, U., Guo, K., Tan, J., Gandhi, R., and Narasimhan, P. (2017, January 5\u20138). Cachier: Edge-caching for recognition applications. Proceedings of the 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS), Atlanta, GA, USA.","DOI":"10.1109\/ICDCS.2017.94"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/11\/3486\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T14:49:42Z","timestamp":1760107782000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/11\/3486"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,5,28]]},"references-count":45,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2024,6]]}},"alternative-id":["s24113486"],"URL":"https:\/\/doi.org\/10.3390\/s24113486","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2024,5,28]]}}}