{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,22]],"date-time":"2026-04-22T13:58:58Z","timestamp":1776866338250,"version":"3.51.2"},"reference-count":32,"publisher":"Institution of Engineering and Technology (IET)","issue":"7","license":[{"start":{"date-parts":[[2023,8,29]],"date-time":"2023-08-29T00:00:00Z","timestamp":1693267200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"content-domain":{"domain":["ietresearch.onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["IET Computer Vision"],"published-print":{"date-parts":[[2023,10]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Multiple events in a long untrimmed video possess the characteristics of similarity and continuity. These characteristics can be considered as a kind of topic semantic information, which probably behaves as same sports, similar scenes, same objects etc. Inspired by this, a novel latent topic\u2010aware network (LTNet) is proposed in this article. The LTNet explores potential themes within videos and generates more continuous captions. Firstly, a global visual topic finder is employed to detect the similarity among events and obtain latent topic\u2010level features. Secondly, a latent topic\u2010oriented relation learner is designed to further enhance the topic\u2010level representations by capturing the relationship between each event and the video themes. Benefiting from the finder and the learner, the caption generator is capable of predicting more accurate and coherent descriptions. The effectiveness of our proposed method is demonstrated on ActivityNet Captions and YouCook2 datasets, where LTNet shows a relative performance of over 3.03% and 0.50% in CIDEr score respectively.<\/jats:p>","DOI":"10.1049\/cvi2.12195","type":"journal-article","created":{"date-parts":[[2023,8,29]],"date-time":"2023-08-29T03:04:53Z","timestamp":1693278293000},"page":"795-803","update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["A latent topic\u2010aware network for dense video captioning"],"prefix":"10.1049","volume":"17","author":[{"given":"Tao","family":"Xu","sequence":"first","affiliation":[{"name":"The College of Computer Science and Technology Civil Aviation University of China  Tianjin China"},{"name":"Key Laboratory of Intelligent Airport Theory and Systems CAAC  Tianjin China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6626-4509","authenticated-orcid":false,"given":"Yuanyuan","family":"Cui","sequence":"additional","affiliation":[{"name":"The College of Computer Science and Technology Civil Aviation University of China  Tianjin China"},{"name":"Key Laboratory of Intelligent Airport Theory and Systems CAAC  Tianjin China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9540-2093","authenticated-orcid":false,"given":"Xinyu","family":"He","sequence":"additional","affiliation":[{"name":"The College of Computer Science and Technology Civil Aviation University of China  Tianjin China"},{"name":"Key Laboratory of Intelligent Airport Theory and Systems CAAC  Tianjin China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9108-0357","authenticated-orcid":false,"given":"Caihua","family":"Liu","sequence":"additional","affiliation":[{"name":"The College of Computer Science and Technology Civil Aviation University of China  Tianjin China"},{"name":"Key Laboratory of Intelligent Airport Theory and Systems CAAC  Tianjin China"}]}],"member":"265","published-online":{"date-parts":[[2023,8,29]]},"reference":[{"key":"e_1_2_10_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.83"},{"key":"e_1_2_10_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00782"},{"key":"e_1_2_10_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00675"},{"key":"e_1_2_10_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00751"},{"key":"e_1_2_10_6_1","first-page":"6847","volume-title":"IEEE International Conference on Computer Vision (ICCV)","author":"Wang T.","year":"2021"},{"key":"e_1_2_10_7_1","first-page":"8739","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Zhou L.","year":"2018"},{"key":"e_1_2_10_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICIP.2018.8451740"},{"key":"e_1_2_10_9_1","first-page":"958","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Vladimir I.","year":"2020"},{"key":"e_1_2_10_10_1","first-page":"151","volume-title":"IEEE Winter Conference on Applications of Computer Vision (WACV)","author":"Vladimir I.","year":"2019"},{"key":"e_1_2_10_11_1","first-page":"2004","volume-title":"Association for Computational Linguistics (ACL)","author":"Ji L.","year":"2021"},{"key":"e_1_2_10_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/tcsvt.2020.3014606"},{"key":"e_1_2_10_13_1","unstructured":"Wang T. Zheng H. Yu M.:Dense\u2010captioning Events in Videos: Sysu Submission to Activitynet Challenge 2020(2020). arXiv preprint arXiv:2006.11693"},{"key":"e_1_2_10_14_1","first-page":"1799","volume-title":"IEEE Transactions on Multimedia (TMM)","author":"Zhang Z.","year":"2020"},{"key":"e_1_2_10_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00900"},{"key":"e_1_2_10_16_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/K19-1039"},{"key":"e_1_2_10_17_1","first-page":"6382","volume-title":"Dense Procedure Captioning in Narrated Instructional Videos","author":"Shi B.","year":"2019"},{"key":"e_1_2_10_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/WACV.2019.00048"},{"key":"e_1_2_10_19_1","unstructured":"Zhu W. et\u00a0al.:End\u2010to\u2010end Dense Video Captioning as Sequence Generation(2022). arXiv preprint arXiv:2204.08121"},{"key":"e_1_2_10_20_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2019.107075"},{"key":"e_1_2_10_21_1","volume-title":"IEEE Transactions on Neural Networks and Learning Systems (TNNLS)","author":"Shao Z.","year":"2022"},{"key":"e_1_2_10_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.510"},{"key":"e_1_2_10_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/tpami.2018.2868668"},{"key":"e_1_2_10_24_1","volume-title":"Conference on Neural Information Processing Systems (NeurIPS)","author":"Vaswani A.","year":"2017"},{"key":"e_1_2_10_25_1","volume-title":"International Conference on Learning Representations (ICLR)","author":"Zhu X.","year":"2021"},{"key":"e_1_2_10_26_1","first-page":"213","volume-title":"European Conference on Computer Vision (ECCV)","author":"Carion N.","year":"2020"},{"key":"e_1_2_10_27_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v32i1.12342"},{"key":"e_1_2_10_28_1","first-page":"65","volume-title":"Meteor: An Automatic Metric for Mt Evaluation with Improved Correlation with Human Judgments","author":"Banerjee S.","year":"2005"},{"key":"e_1_2_10_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00854"},{"key":"e_1_2_10_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7299087"},{"key":"e_1_2_10_31_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58539-6_31"},{"key":"e_1_2_10_32_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i07.6881"},{"key":"e_1_2_10_33_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01252-6_29"}],"container-title":["IET Computer Vision"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/ietresearch.onlinelibrary.wiley.com\/doi\/pdf\/10.1049\/cvi2.12195","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,27]],"date-time":"2025-10-27T10:48:20Z","timestamp":1761562100000},"score":1,"resource":{"primary":{"URL":"https:\/\/ietresearch.onlinelibrary.wiley.com\/doi\/10.1049\/cvi2.12195"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,29]]},"references-count":32,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2023,10]]}},"alternative-id":["10.1049\/cvi2.12195"],"URL":"https:\/\/doi.org\/10.1049\/cvi2.12195","archive":["Portico"],"relation":{},"ISSN":["1751-9632","1751-9640"],"issn-type":[{"value":"1751-9632","type":"print"},{"value":"1751-9640","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,8,29]]},"assertion":[{"value":"2022-10-19","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-03-24","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-08-29","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}