{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,28]],"date-time":"2026-02-28T02:51:56Z","timestamp":1772247116452,"version":"3.50.1"},"reference-count":52,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2023,2,9]],"date-time":"2023-02-09T00:00:00Z","timestamp":1675900800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"the German Federal Ministry of Research and Education (BMBF)","award":["CoHMed\/IntelliMed 13FH5I05IA"],"award-info":[{"award-number":["CoHMed\/IntelliMed 13FH5I05IA"]}]},{"name":"the German Federal Ministry of Research and Education (BMBF)","award":["CoHMed\/PersonaMed B- 3FH5I09IA"],"award-info":[{"award-number":["CoHMed\/PersonaMed B- 3FH5I09IA"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Adapting intelligent context-aware systems (CAS) to future operating rooms (OR) aims to improve situational awareness and provide surgical decision support systems to medical teams. CAS analyzes data streams from available devices during surgery and communicates real-time knowledge to clinicians. Indeed, recent advances in computer vision and machine learning, particularly deep learning, paved the way for extensive research to develop CAS. In this work, a deep learning approach for analyzing laparoscopic videos for surgical phase recognition, tool classification, and weakly-supervised tool localization in laparoscopic videos was proposed. The ResNet-50 convolutional neural network (CNN) architecture was adapted by adding attention modules and fusing features from multiple stages to generate better-focused, generalized, and well-representative features. Then, a multi-map convolutional layer followed by tool-wise and spatial pooling operations was utilized to perform tool localization and generate tool presence confidences. 
Finally, the long short-term memory (LSTM) network was employed to model temporal information and perform tool classification and phase recognition. The proposed approach was evaluated on the Cholec80 dataset. The experimental results (i.e., 88.5% and 89.0% mean precision and recall for phase recognition, respectively, 95.6% mean average precision for tool presence detection, and a 70.1% F1-score for tool localization) demonstrated the ability of the model to learn discriminative features for all tasks. The performances revealed the importance of integrating attention modules and multi-stage feature fusion for more robust and precise detection of surgical phases and tools.<\/jats:p>","DOI":"10.3390\/s23041958","type":"journal-article","created":{"date-parts":[[2023,2,10]],"date-time":"2023-02-10T02:09:59Z","timestamp":1675994999000},"page":"1958","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":15,"title":["Laparoscopic Video Analysis Using Temporal, Attention, and Multi-Feature Fusion Based-Approaches"],"prefix":"10.3390","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0209-3389","authenticated-orcid":false,"given":"Nour Aldeen","family":"Jalal","sequence":"first","affiliation":[{"name":"Institute of Technical Medicine (ITeM), Furtwangen University, 78054 Villingen-Schwenningen, Germany"},{"name":"Innovation Center Computer Assisted Surgery (ICCAS), University of Leipzig, 04103 Leipzig, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7436-0338","authenticated-orcid":false,"given":"Tamer Abdulbaki","family":"Alshirbaji","sequence":"additional","affiliation":[{"name":"Institute of Technical Medicine (ITeM), Furtwangen University, 78054 Villingen-Schwenningen, Germany"},{"name":"Innovation Center Computer Assisted Surgery (ICCAS), University of Leipzig, 04103 Leipzig, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1661-2573","authenticated-orcid":false,"given":"Paul 
David","family":"Docherty","sequence":"additional","affiliation":[{"name":"Institute of Technical Medicine (ITeM), Furtwangen University, 78054 Villingen-Schwenningen, Germany"},{"name":"Department of Mechanical Engineering, University of Canterbury, Christchurch 8041, New Zealand"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1492-1121","authenticated-orcid":false,"given":"Herag","family":"Arabian","sequence":"additional","affiliation":[{"name":"Institute of Technical Medicine (ITeM), Furtwangen University, 78054 Villingen-Schwenningen, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3059-8894","authenticated-orcid":false,"given":"Bernhard","family":"Laufer","sequence":"additional","affiliation":[{"name":"Institute of Technical Medicine (ITeM), Furtwangen University, 78054 Villingen-Schwenningen, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6059-2704","authenticated-orcid":false,"given":"Sabine","family":"Krueger-Ziolek","sequence":"additional","affiliation":[{"name":"Institute of Technical Medicine (ITeM), Furtwangen University, 78054 Villingen-Schwenningen, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6999-5024","authenticated-orcid":false,"given":"Thomas","family":"Neumuth","sequence":"additional","affiliation":[{"name":"Innovation Center Computer Assisted Surgery (ICCAS), University of Leipzig, 04103 Leipzig, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4709-3817","authenticated-orcid":false,"given":"Knut","family":"Moeller","sequence":"additional","affiliation":[{"name":"Institute of Technical Medicine (ITeM), Furtwangen University, 78054 Villingen-Schwenningen, Germany"},{"name":"Department of Mechanical Engineering, University of Canterbury, Christchurch 8041, New Zealand"},{"name":"Department of Microsystems Engineering, University of Freiburg, 79110 Freiburg, 
Germany"}]}],"member":"1968","published-online":{"date-parts":[[2023,2,9]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"691","DOI":"10.1038\/s41551-017-0132-7","article-title":"Surgical Data Science for Next-Generation Interventions","volume":"1","author":"Vedula","year":"2017","journal-title":"Nat. Biomed. Eng."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"102306","DOI":"10.1016\/j.media.2021.102306","article-title":"Surgical Data Science\u2013from Concepts toward Clinical Translation","volume":"76","author":"Eisenmann","year":"2022","journal-title":"Med. Image Anal."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"450","DOI":"10.1159\/000511351","article-title":"Artificial intelligence-assisted surgery: Potential and challenges","volume":"36","author":"Bodenstedt","year":"2020","journal-title":"Visc. Med."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"104240","DOI":"10.1016\/j.jbi.2022.104240","article-title":"Ontology-based surgical workflow recognition and prediction","volume":"136","author":"Neumann","year":"2022","journal-title":"J. Biomed. Inform."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"500","DOI":"10.1515\/cdbme-2021-2127","article-title":"Changes of Physiological parameters of the patient during laparoscopic gynaecology","volume":"7","author":"Jalal","year":"2021","journal-title":"Curr. Dir. Biomed. Eng."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Jalal, N.A., Alshirbaji, T.A., Laufer, B., Docherty, P.D., Russo, S.G., Neumuth, T., and M\u00f6ller, K. (2021, January 1\u20135). Effects of Intra-Abdominal Pressure on Lung Mechanics during Laparoscopic Gynaecology. 
Proceedings of the 2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Virtual, Mexico.","DOI":"10.1109\/EMBC46164.2021.9630753"},{"key":"ref_7","first-page":"123","article-title":"Surgical Process Modeling","volume":"2","author":"Neumuth","year":"2017","journal-title":"Innov. Surg. Sci."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"495","DOI":"10.1007\/s11548-013-0940-5","article-title":"Surgical Process Modelling: A Review","volume":"9","author":"Lalys","year":"2014","journal-title":"Int. J. CARS"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1089","DOI":"10.1007\/s11548-019-01966-6","article-title":"Prediction of laparoscopic procedure duration using unlabeled, multimodal sensor data","volume":"14","author":"Bodenstedt","year":"2019","journal-title":"Int. J. Comput. Assist. Radiol. Surg."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1081","DOI":"10.1007\/s11548-016-1371-x","article-title":"Automatic Data-Driven Real-Time Segmentation and Recognition of Surgical Workflow","volume":"11","author":"Dergachyova","year":"2016","journal-title":"Int. J. Comput. Assist. Radiol. Surg."},{"key":"ref_11","unstructured":"Lalys, F., Riffaud, L., Morandi, X., and Jannin, P. (2010). Medical Computer Vision. Recognition Techniques and Applications in Medical Imaging, Proceedings of the International MICCAI Workshop, MCV 2010, Beijing, China, 20 September 2010, Revised Selected Papers 1, Springer."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1515\/cdbme-2019-0010","article-title":"Surface emg-based surgical instrument classification for dynamic activity recognition in surgical workflows","volume":"5","author":"Bieck","year":"2019","journal-title":"Curr. Dir. Biomed. Eng."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Blum, T., Padoy, N., Feu\u00dfner, H., and Navab, N. (2008, January 6\u201310). 
Modeling and online recognition of surgical phases using hidden markov models. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, New York, NY, USA.","DOI":"10.1007\/978-3-540-85990-1_75"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"198","DOI":"10.3109\/13645706.2013.878363","article-title":"Sensor-Based Surgical Activity Recognition in Unconstrained Environments","volume":"23","author":"Meixensberger","year":"2014","journal-title":"Minim. Invasive Ther. Allied Technol."},{"key":"ref_15","first-page":"689","article-title":"RFID-based surgical instrument detection using Hidden Markov models","volume":"57","author":"Neumuth","year":"2012","journal-title":"Biomed. Eng. Tech."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1201","DOI":"10.1007\/s11548-016-1409-0","article-title":"System events: Readily accessible features for surgical phase detection","volume":"11","author":"Malpani","year":"2016","journal-title":"Int. J. Comput. Assist. Radiol. Surg."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"684","DOI":"10.1097\/SLA.0000000000004425","article-title":"Machine Learning for Surgical Phase Recognition: A Systematic Review","volume":"273","author":"Garrow","year":"2021","journal-title":"Ann. Surg."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"2603","DOI":"10.1109\/TMI.2015.2450831","article-title":"Detecting surgical tools by modelling local appearance and global shape","volume":"34","author":"Bouget","year":"2015","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_19","unstructured":"Bodenstedt, S., Ohnemus, A., Katic, D., Wekerle, A.L., Wagner, M., Kenngott, H., M\u00fcller-Stich, B., Dillmann, R., and Speidel, S. (2018). Real-time image-based instrument classification for laparoscopic surgery. 
arXiv."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1080\/13645706.2019.1584116","article-title":"Machine and Deep Learning for Workflow Recognition during Surgery","volume":"28","author":"Padoy","year":"2019","journal-title":"Minim. Invasive Ther. Allied Technol."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Durand, T., Mordan, T., Thome, N., and Cord, M. (2017, January 21\u201326). Wildcat: Weakly supervised learning of deep convnets for image classification, pointwise localization and segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.631"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"86","DOI":"10.1109\/TMI.2016.2593957","article-title":"EndoNet: A deep architecture for recognition tasks on laparoscopic videos","volume":"36","author":"Twinanda","year":"2017","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_24","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"415","DOI":"10.1515\/cdbme-2018-0099","article-title":"Evaluating convolutional neural network and hidden markov model for recognising surgical phases in sigmoid resection","volume":"4","author":"Jalal","year":"2018","journal-title":"Curr. Dir. Biomed. Eng."},{"key":"ref_26","unstructured":"Twinanda, A.P. (2017). Vision-Based Approaches for Surgical Activity Recognition Using Laparoscopic and RBGD Videos. [Ph.D. 
Thesis, Strasbourg University]."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"101572","DOI":"10.1016\/j.media.2019.101572","article-title":"Multi-Task Recurrent Convolutional Network with Correlation Loss for Surgical Video Analysis","volume":"59","author":"Jin","year":"2020","journal-title":"Med. Image Anal."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"405","DOI":"10.1515\/cdbme-2019-0102","article-title":"Predicting surgical phases using CNN-NARX neural network","volume":"5","author":"Jalal","year":"2019","journal-title":"Curr. Dir. Biomed. Eng."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1114","DOI":"10.1109\/TMI.2017.2787657","article-title":"SV-RCNet: Workflow recognition from surgical videos using recurrent convolutional network","volume":"37","author":"Jin","year":"2017","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Czempiel, T., Paschali, M., Keicher, M., Simson, W., Feussner, H., Kim, S.T., and Navab, N. (2020, January 4\u20138). Tecno: Surgical phase recognition with multi-stage temporal convolutional networks. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Lima, Peru.","DOI":"10.1007\/978-3-030-59716-0_33"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"He, K., Gan, C., Li, Z., Rekik, I., Yin, Z., Ji, W., Gao, Y., Wang, Q., Zhang, J., and Shen, D. (2022). Transformers in medical image analysis: A review. arXiv.","DOI":"10.1016\/j.imed.2022.07.002"},{"key":"ref_32","unstructured":"Czempiel, T., Paschali, M., Ostler, D., Kim, S.T., Busam, B., and Navab, N. (October, January 27). Opera: Attention-regularized transformers for surgical phase recognition. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Strasbourg, France."},{"key":"ref_33","unstructured":"Gao, X., Jin, Y., Long, Y., Dou, Q., and Heng, P.A. 
(October, January 27). Trans-svnet: Accurate phase recognition from surgical videos via hybrid embedding aggregation transformer. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Strasbourg, France."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"407","DOI":"10.1515\/cdbme-2018-0097","article-title":"Surgical tool classification in laparoscopic videos using convolutional neural network","volume":"4","author":"Jalal","year":"2018","journal-title":"Curr. Dir. Biomed. Eng."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"102801","DOI":"10.1016\/j.bspc.2021.102801","article-title":"A Deep Learning Spatial-Temporal Framework for Detecting Surgical Tools in Laparoscopic Videos","volume":"68","author":"Jalal","year":"2021","journal-title":"Biomed. Signal Process. Control"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"20200002","DOI":"10.1515\/cdbme-2020-0002","article-title":"A Convolutional Neural Network with a Two-Stage LSTM Model for Tool Presence Detection in Laparoscopic Videos","volume":"6","author":"Alshirbaji","year":"2020","journal-title":"Curr. Dir. Biomed. Eng."},{"key":"ref_37","unstructured":"Jalal, N.A., Abdulbaki Alshirbaji, T., Docherty, P.D., Neumuth, T., and M\u00f6ller, K. (December, January 29). Surgical Tool Detection in Laparoscopic Videos by Modeling Temporal Dependencies Between Adjacent Frames. Proceedings of the European Medical and Biological Engineering Conference, Portoro\u017e, Slovenia."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Wang, S., Xu, Z., Yan, C., and Huang, J. (2019, January 2\u20137). Graph convolutional nets for tool presence detection in surgical videos. 
Proceedings of the International Conference on Information Processing in Medical Imaging, Hong Kong, China.","DOI":"10.1007\/978-3-030-20351-1_36"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"1059","DOI":"10.1007\/s11548-019-01958-6","article-title":"Weakly supervised convolutional LSTM approach for tool tracking in laparoscopic videos","volume":"14","author":"Nwoye","year":"2019","journal-title":"Int. J. Comput. Assist. Radiol. Surg."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Pfeiffer, M., Funke, I., Robu, M.R., Bodenstedt, S., Strenger, L., Engelhardt, S., Ro\u00df, T., Clarkson, M.J., Gurusamy, K., and Davidson, B.R. (2019, January 13\u201317). Generating large labeled data sets for laparoscopic image processing tasks using unpaired image-to-image translation. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Shenzhen, China.","DOI":"10.1007\/978-3-030-32254-0_14"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"594","DOI":"10.1016\/j.jss.2022.11.008","article-title":"Generating Rare Surgical Events Using CycleGAN: Addressing Lack of Data for Artificial Intelligence Event Recognition","volume":"283","author":"Mohamadipanah","year":"2023","journal-title":"J. Surg. Res."},{"key":"ref_42","unstructured":"Vardazaryan, A., Mutter, D., Marescaux, J., and Padoy, N. (2018). Intravascular Imaging and Computer Assisted Stenting and Large-Scale Annotation of Biomedical Data and Expert Label Synthesis, Springer."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"228853","DOI":"10.1109\/ACCESS.2020.3046258","article-title":"Real-Time Surgical Tool Detection in Minimally Invasive Surgery Based on Attention-Guided Convolutional Neural Network","volume":"8","author":"Shi","year":"2020","journal-title":"IEEE Access"},{"key":"ref_44","unstructured":"Hu, X., Yu, L., Chen, H., Qin, J., and Heng, P.A. (2017). 
Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, Springer."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"548","DOI":"10.1515\/cdbme-2022-1140","article-title":"Analysing Attention Convolutional Neural Network for Surgical Tool Localisation: A Feasibility Study","volume":"8","author":"Jalal","year":"2022","journal-title":"Curr. Dir. Biomed. Eng."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Abdulbaki Alshirbaji, T., Jalal, N.A., Docherty, P.D., Neumuth, T., and M\u00f6ller, K. (2022). Robustness of Convolutional Neural Networks for Surgical Tool Classification in Laparoscopic Videos from Multiple Sources and of Multiple Types: A Systematic Evaluation. Electronics, 11.","DOI":"10.3390\/electronics11182849"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Hu, J., Shen, L., and Sun, G. (2018, January 18\u201322). Squeeze-and-excitation networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"676","DOI":"10.1515\/cdbme-2022-1172","article-title":"Attention Networks for Improving Surgical Tool Classification in Laparoscopic Videos","volume":"8","author":"Arabian","year":"2022","journal-title":"Curr. Dir. Biomed. Eng."},{"key":"ref_49","unstructured":"Yim, J., Ju, J., Jung, H., and Kim, J. (2015). Robot Intelligence Technology and Applications 3, Springer."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Alshirbaji, T.A., Jalal, N.A., Docherty, P.D., Neumuth, P., and M\u00f6ller, K. (2022, January 11\u201315). Improving the Generalisability of Deep CNNs by Combining Multi-stage Features for Surgical Tool Classification. 
Proceedings of the 2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Glasgow, Scotland, UK.","DOI":"10.1109\/EMBC48229.2022.9870883"},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Smith, L.N. (2017, January 24\u201331). Cyclical learning rates for training neural networks. Proceedings of the 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), Santa Rosa, CA, USA.","DOI":"10.1109\/WACV.2017.58"},{"key":"ref_52","doi-asserted-by":"crossref","first-page":"334","DOI":"10.1016\/j.ifacol.2021.10.278","article-title":"A Deep Learning Framework for Recognising Surgical Phases in Laparoscopic Videos","volume":"54","author":"Jalal","year":"2021","journal-title":"IFAC-PapersOnLine"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/4\/1958\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T18:29:09Z","timestamp":1760120949000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/4\/1958"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,2,9]]},"references-count":52,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2023,2]]}},"alternative-id":["s23041958"],"URL":"https:\/\/doi.org\/10.3390\/s23041958","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,2,9]]}}}