{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,18]],"date-time":"2026-04-18T01:15:44Z","timestamp":1776474944514,"version":"3.51.2"},"reference-count":46,"publisher":"Wiley","issue":"2","license":[{"start":{"date-parts":[[2026,1,29]],"date-time":"2026-01-29T00:00:00Z","timestamp":1769644800000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"},{"start":{"date-parts":[[2026,1,29]],"date-time":"2026-01-29T00:00:00Z","timestamp":1769644800000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/doi.wiley.com\/10.1002\/tdm_license_1.1"}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Int J Imaging Syst Tech"],"published-print":{"date-parts":[[2026,3]]},"abstract":"<jats:title>ABSTRACT<\/jats:title>\n<jats:p>In recent years, transformer\u2010based methods have achieved remarkable progress in medical image segmentation due to their superior ability to capture long\u2010range dependencies. However, these methods typically suffer from two major limitations. First, their computational complexity scales quadratically with the input sequence length. Second, the feed\u2010forward network (FFN) modules in vanilla Transformers typically rely on fully connected layers, which limits the model's ability to capture local contextual information and multiscale features critical for precise semantic segmentation. To address these issues, we propose an efficient medical image segmentation network, named TCSAFormer. The proposed TCSAFormer adopts two key ideas. First, it incorporates a Compressed Attention (CA) module, which combines token compression and pixel\u2010level sparse attention to dynamically focus on the most relevant key\u2010value pairs for each query. This is achieved by pruning globally irrelevant tokens and merging redundant ones, significantly reducing computational complexity while enhancing the model's ability to capture relationships between tokens. Second, it introduces a Dual\u2010Branch Feed\u2010Forward Network (DBFFN) module as a replacement for the standard FFN to capture local contextual features and multiscale information, thereby strengthening the model's feature representation capability. We conduct extensive experiments on four publicly available medical image segmentation datasets: ISIC\u20102018, CVC\u2010ClinicDB, Synapse, and Abdomen MRI, to evaluate the segmentation performance of TCSAFormer. Experimental results demonstrate that TCSAFormer achieves superior performance compared to existing state\u2010of\u2010the\u2010art (SOTA) methods, while maintaining lower computational overhead, thus achieving an optimal trade\u2010off between efficiency and accuracy. The code is available on GitHub.<\/jats:p>","DOI":"10.1002\/ima.70302","type":"journal-article","created":{"date-parts":[[2026,1,29]],"date-time":"2026-01-29T17:51:26Z","timestamp":1769709086000},"update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["<scp>TCSAFormer<\/scp>: Efficient Vision Transformer With Token Compression and Sparse Attention for Medical Image Segmentation"],"prefix":"10.1002","volume":"36","author":[{"ORCID":"https:\/\/orcid.org\/0009-0008-6706-5817","authenticated-orcid":false,"given":"Zunhui","family":"Xia","sequence":"first","affiliation":[{"name":"College of Computer Science and Engineering, Chongqing University of Technology, Chongqing, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-7958-3976","authenticated-orcid":false,"given":"Hongxing","family":"Li","sequence":"additional","affiliation":[{"name":"College of Computer Science and Engineering, Chongqing University of Technology, Chongqing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4754-813X","authenticated-orcid":false,"given":"Libin","family":"Lan","sequence":"additional","affiliation":[{"name":"College of Computer Science and Engineering, Chongqing University of Technology, Chongqing, China"}]}],"member":"311","published-online":{"date-parts":[[2026,1,29]]},"reference":[{"key":"e_1_2_9_2_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.media.2023.103000"},{"key":"e_1_2_9_3_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2023.110987"},{"key":"e_1_2_9_4_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.compbiomed.2023.106626"},{"key":"e_1_2_9_5_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"e_1_2_9_6_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-00889-5_1"},{"key":"e_1_2_9_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP40776.2020.9053405"},{"key":"e_1_2_9_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMI.2020.2983721"},{"key":"e_1_2_9_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMI.2019.2903562"},{"key":"e_1_2_9_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.660"},{"key":"e_1_2_9_11_1","first-page":"2102.04306","article-title":"TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation","author":"Chen J.","year":"2021","journal-title":"arXiv"},{"key":"e_1_2_9_12_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-87193-2_2"},{"key":"e_1_2_9_13_1","first-page":"205","volume-title":"European Conference on Computer Vision","author":"Cao H.","year":"2022"},{"key":"e_1_2_9_14_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-87193-2_4"},{"key":"e_1_2_9_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/WACV56688.2023.00614"},{"key":"e_1_2_9_16_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-981-96-2064-7_18"},{"key":"e_1_2_9_17_1","first-page":"2401.00722","article-title":"BRAU\u2010Net++: U\u2010Shaped Hybrid CNN\u2010Transformer Network for Medical Image Segmentation","author":"Lan L.","year":"2024","journal-title":"arXiv"},{"key":"e_1_2_9_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00995"},{"key":"e_1_2_9_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMI.2022.3230943"},{"key":"e_1_2_9_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIM.2022.3178991"},{"key":"e_1_2_9_21_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.bspc.2024.107062"},{"key":"e_1_2_9_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/JBHI.2021.3138024"},{"key":"e_1_2_9_23_1","volume-title":"Token Merging: Your ViT but Faster","author":"Bolya D.","year":"2023"},{"key":"e_1_2_9_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52733.2024.01493"},{"key":"e_1_2_9_25_1","first-page":"13937","article-title":"DynamicViT: Efficient Vision Transformers With Dynamic Token Sparsification","volume":"34","author":"Rao Y.","year":"2021","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_9_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01185"},{"key":"e_1_2_9_27_1","first-page":"285","volume-title":"KVT: k\u2010NN Attention for Boosting Vision Transformers, in: European Conference on Computer Vision","author":"Wang P.","year":"2022"},{"key":"e_1_2_9_28_1","first-page":"1912.11637","article-title":"Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection","author":"Zhao G.","year":"2019","journal-title":"arXiv"},{"key":"e_1_2_9_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01199"},{"key":"e_1_2_9_30_1","first-page":"30772","article-title":"Accelerating Transformers With Spectrum\u2010Preserving Token Merging","volume":"37","author":"Tran C.","year":"2024","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_9_31_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41592-020-01008-z"},{"key":"e_1_2_9_32_1","first-page":"2401.04722","article-title":"U\u2010Mamba: Enhancing Long\u2010Range Dependency for Biomedical Image Segmentation","author":"Ma J.","year":"2024","journal-title":"arXiv"},{"key":"e_1_2_9_33_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v39i5.32491"},{"key":"e_1_2_9_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICME55011.2023.00391"},{"key":"e_1_2_9_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP43922.2022.9746172"},{"key":"e_1_2_9_36_1","first-page":"12077","article-title":"SegFormer: Simple and Efficient Design for Semantic Segmentation With Transformers","volume":"34","author":"Xie E.","year":"2021","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_9_37_1","doi-asserted-by":"publisher","DOI":"10.1007\/s41095-022-0274-8"},{"key":"e_1_2_9_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/TAI.2023.3326795"},{"key":"e_1_2_9_39_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2024.112050"},{"key":"e_1_2_9_40_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.compmedimag.2015.02.007"},{"key":"e_1_2_9_41_1","first-page":"1902.03368","article-title":"Skin Lesion Analysis Toward Melanoma Detection 2018: A Challenge Hosted by the International Skin Imaging Collaboration (ISIC)","author":"Codella N.","year":"2019","journal-title":"arXiv"},{"key":"e_1_2_9_42_1","doi-asserted-by":"publisher","DOI":"10.1038\/sdata.2018.161"},{"key":"e_1_2_9_43_1","unstructured":"B. Landman, Z. Xu, J. Igelsias, M. Styner, T. Langerak, and A. Klein, MICCAI Multi\u2010Atlas Labeling Beyond the Cranial Vault\u2014Workshop and Challenge (2015)."},{"key":"e_1_2_9_44_1","first-page":"36722","article-title":"AMOS: A Large\u2010Scale Abdominal Multi\u2010Organ Benchmark for Versatile Medical Image Segmentation","volume":"35","author":"Ji Y.","year":"2022","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_9_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.74"},{"key":"e_1_2_9_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"e_1_2_9_47_1","first-page":"2010.11929","article-title":"An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale","author":"Dosovitskiy A.","year":"2020","journal-title":"arXiv"}],"container-title":["International Journal of Imaging Systems and Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/ima.70302","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/full-xml\/10.1002\/ima.70302","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/ima.70302","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,26]],"date-time":"2026-03-26T17:11:59Z","timestamp":1774545119000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/ima.70302"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,1,29]]},"references-count":46,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2026,3]]}},"alternative-id":["10.1002\/ima.70302"],"URL":"https:\/\/doi.org\/10.1002\/ima.70302","archive":["Portico"],"relation":{},"ISSN":["0899-9457","1098-1098"],"issn-type":[{"value":"0899-9457","type":"print"},{"value":"1098-1098","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,1,29]]},"assertion":[{"value":"2025-05-14","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2026-01-20","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2026-01-29","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}],"article-number":"e70302"}}