{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T21:06:45Z","timestamp":1760044005935,"version":"3.41.0"},"reference-count":59,"publisher":"Association for Computing Machinery (ACM)","issue":"OOPSLA2","license":[{"start":{"date-parts":[[2022,10,31]],"date-time":"2022-10-31T00:00:00Z","timestamp":1667174400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Program. Lang."],"published-print":{"date-parts":[[2022,10,31]]},"abstract":"<jats:p>Many applications, from social network graph analytics to control flow analysis, compute on sparse data that evolves over the course of program execution.  \nSuch data can be represented as dynamic sparse tensors and efficiently stored in formats (data layouts) that utilize pointer-based data structures like block linked lists, binary search trees, B-trees, and C-trees among others.  \nThese specialized formats support fast in-place modification and are thus better suited than traditional, array-based data structures like CSR for storing dynamic sparse tensors.  \nHowever, different dynamic sparse tensor formats have distinct benefits and drawbacks, and performing different computations on tensors that are stored in different formats can require vastly dissimilar code that are not straightforward to correctly implement and optimize.<\/jats:p>\n          <jats:p>This paper shows how a compiler can generate efficient code to compute tensor algebra operations on dynamic sparse tensors that may be stored in a wide range of disparate formats.  \nWe propose a language for precisely specifying recursive, pointer-based data structures, and we show how this language can express many different dynamic data structures, including all the ones named above as well as many more.  \nWe then describe how, given high-level specifications of such dynamic data structures, a compiler can emit code to efficiently access and compute on dynamic sparse tensors that are stored in the aforementioned data structures.<\/jats:p>\n          <jats:p>We evaluate our technique and find it generates efficient dynamic sparse tensor algebra kernels that have performance comparable to, if not better than, state-of-the-art libraries and frameworks such as PAM, Aspen, STINGER, and Terrace.  \nAt the same time, our technique supports a wider range of tensor algebra operations---such as those that simultaneously compute with static and dynamic sparse tensors---than Aspen, STINGER, and Terrace, while also achieving significantly better performance than PAM for those same operations.<\/jats:p>","DOI":"10.1145\/3563338","type":"journal-article","created":{"date-parts":[[2022,10,31]],"date-time":"2022-10-31T20:23:35Z","timestamp":1667247815000},"page":"1408-1437","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["Compilation of dynamic sparse tensor algebra"],"prefix":"10.1145","volume":"6","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5048-7131","authenticated-orcid":false,"given":"Stephen","family":"Chou","sequence":"first","affiliation":[{"name":"Massachusetts Institute of Technology, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7231-7643","authenticated-orcid":false,"given":"Saman","family":"Amarasinghe","sequence":"additional","affiliation":[{"name":"Massachusetts Institute of Technology, USA"}]}],"member":"320","published-online":{"date-parts":[[2022,10,31]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.parco.2016.10.001"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/1863543.1863581"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS47924.2020.00081"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2017.76"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPEC.2012.6408676"},{"volume-title":"Efficient Sparse Matrix-Vector Multiplication on CUDA","author":"Bell Nathan","key":"e_1_2_1_6_1","unstructured":"Nathan Bell and Michael Garland . 2008. Efficient Sparse Matrix-Vector Multiplication on CUDA . NVIDIA Corporation . Nathan Bell and Michael Garland. 2008. Efficient Sparse Matrix-Vector Multiplication on CUDA. NVIDIA Corporation."},{"volume-title":"Compiler Support for Sparse Matrix Computations. Ph. D. Dissertation","author":"Bik Aart JC","key":"e_1_2_1_7_1","unstructured":"Aart JC Bik . 1996. Compiler Support for Sparse Matrix Computations. Ph. D. Dissertation . Leiden University . Aart JC Bik. 1996. Compiler Support for Sparse Matrix Computations. Ph. D. Dissertation. Leiden University."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3544559"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/165939.166023"},{"volume-title":"Languages and Compilers for Parallel Computing","author":"Bik Aart JC","key":"e_1_2_1_10_1","unstructured":"Aart JC Bik and Harry AG Wijshoff . 1994. On automatic data structure selection and code generation for sparse computations . In Languages and Compilers for Parallel Computing . Springer , 57\u201375. Aart JC Bik and Harry AG Wijshoff. 1994. On automatic data structure selection and code generation for sparse computations. In Languages and Compilers for Parallel Computing. Springer, 57\u201375."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPEC.2018.8547541"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/3276493"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/3385412.3385963"},{"key":"e_1_2_1_14_1","volume-title":"Introduction to Algorithms","author":"Cormen Thomas H.","year":"2033","unstructured":"Thomas H. Cormen , Charles E. Leiserson , Ronald L. Rivest , and Clifford Stein . 2009. Introduction to Algorithms , Third Edition (3 rd ed.). The MIT Press . isbn:026 2033 844 Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. 2009. Introduction to Algorithms, Third Edition (3rd ed.). The MIT Press. isbn:0262033844","edition":"3"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/2049662.2049663"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3314221.3314598"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPEC.2012.6408680"},{"key":"e_1_2_1_18_1","unstructured":"Ga\u00ebl Guennebaud and Beno\u00eet Jacob. 2010. Eigen v3. http:\/\/eigen.tuxfamily.org. \t\t\t\t  Ga\u00ebl Guennebaud and Beno\u00eet Jacob. 2010. Eigen v3. http:\/\/eigen.tuxfamily.org."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/1993498.1993504"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3485505"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3293883.3295712"},{"key":"e_1_2_1_22_1","unstructured":"Intel. 2020. Intel oneAPI Math Kernel Library Developer Reference. https:\/\/www.intel.com\/content\/dam\/develop\/external\/us\/en\/documents\/onemkl-developerreference-c.pdf \t\t\t\t  Intel. 2020. Intel oneAPI Math Kernel Library Developer Reference. https:\/\/www.intel.com\/content\/dam\/develop\/external\/us\/en\/documents\/onemkl-developerreference-c.pdf"},{"volume-title":"Dynamic Sparse-Matrix Allocation on GPUs","author":"King James","key":"e_1_2_1_23_1","unstructured":"James King , Thomas Gilray , Robert M. Kirby , and Matthew Might . 2016. Dynamic Sparse-Matrix Allocation on GPUs . In High Performance Computing, Julian M. Kunkel, Pavan Balaji, and Jack Dongarra (Eds.). Springer International Publishing , Cham . 61\u201380. isbn:978-3-319-41321-1 James King, Thomas Gilray, Robert M. Kirby, and Matthew Might. 2016. Dynamic Sparse-Matrix Allocation on GPUs. In High Performance Computing, Julian M. Kunkel, Pavan Balaji, and Jack Dongarra (Eds.). Springer International Publishing, Cham. 61\u201380. isbn:978-3-319-41321-1"},{"key":"e_1_2_1_24_1","unstructured":"Fredrik Kjolstad Peter Ahrens Shoaib Kamil and Saman Amarasinghe. 2019. Tensor Algebra Compilation with Workspaces. 180\u2013192. isbn:978-1-7281-1436-1 http:\/\/dl.acm.org\/citation.cfm?id=3314872.3314894 \t\t\t\t  Fredrik Kjolstad Peter Ahrens Shoaib Kamil and Saman Amarasinghe. 2019. Tensor Algebra Compilation with Workspaces. 180\u2013192. isbn:978-1-7281-1436-1 http:\/\/dl.acm.org\/citation.cfm?id=3314872.3314894"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/3133901"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/2509136.2509555"},{"volume-title":"Relational Algebraic Techniques for the Synthesis of Sparse Matrix Programs. Ph. D. Dissertation","author":"Kotlyar Vladimir","key":"e_1_2_1_27_1","unstructured":"Vladimir Kotlyar . 1999. Relational Algebraic Techniques for the Synthesis of Sparse Matrix Programs. Ph. D. Dissertation . Cornell University . Vladimir Kotlyar. 1999. Relational Algebraic Techniques for the Synthesis of Sparse Matrix Programs. Ph. D. Dissertation. Cornell University."},{"volume-title":"Euro-Par\u201997 Parallel Processing","author":"Kotlyar Vladimir","key":"e_1_2_1_28_1","unstructured":"Vladimir Kotlyar , Keshav Pingali , and Paul Stodghill . 1997. A relational approach to the compilation of sparse matrix programs . In Euro-Par\u201997 Parallel Processing . Springer , 318\u2013327. Vladimir Kotlyar, Keshav Pingali, and Paul Stodghill. 1997. A relational approach to the compilation of sparse matrix programs. In Euro-Par\u201997 Parallel Processing. Springer, 318\u2013327."},{"volume-title":"GraphOne: A Data Store for Real-time Analytics on Evolving Graphs. In 17th USENIX Conference on File and Storage Technologies (FAST 19)","author":"Kumar Pradeep","key":"e_1_2_1_29_1","unstructured":"Pradeep Kumar and H. Howie Huang . 2019 . GraphOne: A Data Store for Real-time Analytics on Evolving Graphs. In 17th USENIX Conference on File and Storage Technologies (FAST 19) . USENIX Association, Boston, MA. 249\u2013263. isbn:978-1-939133-09-0 https:\/\/www.usenix.org\/conference\/fast19\/presentation\/kumar Pradeep Kumar and H. Howie Huang. 2019. GraphOne: A Data Store for Real-time Analytics on Evolving Graphs. In 17th USENIX Conference on File and Storage Technologies (FAST 19). USENIX Association, Boston, MA. 249\u2013263. isbn:978-1-939133-09-0 https:\/\/www.usenix.org\/conference\/fast19\/presentation\/kumar"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2637365.2517225"},{"volume-title":"Proceedings of the 12th International Conference on Very Large Data Bases (VLDB \u201986)","author":"Tobin","key":"e_1_2_1_31_1","unstructured":"Tobin J. Lehman and Michael J. Carey. 1986. A Study of Index Structures for Main Memory Database Management Systems . In Proceedings of the 12th International Conference on Very Large Data Bases (VLDB \u201986) . Morgan Kaufmann Publishers Inc., San Francisco, CA, USA. 294\u2013303. isbn:0934613184 Tobin J. Lehman and Michael J. Carey. 1986. A Study of Index Structures for Main Memory Database Management Systems. In Proceedings of the 12th International Conference on Very Large Data Bases (VLDB \u201986). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA. 294\u2013303. isbn:0934613184"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2018.00022"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/2751205.2751209"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2015.7113298"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPEC.2013.6670338"},{"volume-title":"Automatically Tuning Sparse Matrix-Vector Multiplication for GPU Architectures","author":"Monakov Alexander","key":"e_1_2_1_36_1","unstructured":"Alexander Monakov , Anton Lokhmotov , and Arutyun Avetisyan . 2010. Automatically Tuning Sparse Matrix-Vector Multiplication for GPU Architectures . In High Performance Embedded Architectures and Compilers, Yale N. Patt, Pierfrancesco Foglia, Evelyn Duesterwald, Paolo Faraboschi, and Xavier Martorell (Eds.). Springer Berlin Heidelberg , Berlin, Heidelberg . 111\u2013125. isbn:978-3-642-11515-8 Alexander Monakov, Anton Lokhmotov, and Arutyun Avetisyan. 2010. Automatically Tuning Sparse Matrix-Vector Multiplication for GPU Architectures. In High Performance Embedded Architectures and Compilers, Yale N. Patt, Pierfrancesco Foglia, Evelyn Duesterwald, Paolo Faraboschi, and Xavier Martorell (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg. 111\u2013125. isbn:978-3-642-11515-8"},{"volume-title":"Proceedings of the 7th International World Wide Web Conference","author":"Page L.","key":"e_1_2_1_37_1","unstructured":"L. Page , S. Brin , R. Motwani , and T. Winograd . 1998. The PageRank citation ranking: Bringing order to the Web . In Proceedings of the 7th International World Wide Web Conference . Brisbane, Australia. 161\u2013172. citeseer.nj.nec.com\/page98pagerank.html L. Page, S. Brin, R. Motwani, and T. Winograd. 1998. The PageRank citation ranking: Bringing order to the Web. In Proceedings of the 7th International World Wide Web Conference. Brisbane, Australia. 161\u2013172. citeseer.nj.nec.com\/page98pagerank.html"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/3448016.3457313"},{"key":"e_1_2_1_39_1","volume-title":"Hai Li, Yiran Chen, and Pradeep Dubey.","author":"Park Jongsoo","year":"2016","unstructured":"Jongsoo Park , Sheng Li , Wei Wen , Ping Tak Peter Tang , Hai Li, Yiran Chen, and Pradeep Dubey. 2016 . Faster CNNs with Direct Sparse Convolutions and Guided Pruning . arxiv:1608.01409. Jongsoo Park, Sheng Li, Wei Wen, Ping Tak Peter Tang, Hai Li, Yiran Chen, and Pradeep Dubey. 2016. Faster CNNs with Direct Sparse Convolutions and Guided Pruning. arxiv:1608.01409."},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/2908080.2908093"},{"key":"e_1_2_1_41_1","volume-title":"SIPR: A new framework for generating efficient code for sparse matrix computations. In Languages and Compilers for Parallel Computing","author":"Pugh William","year":"1999","unstructured":"William Pugh and Tatiana Shpeisman . 1999 . SIPR: A new framework for generating efficient code for sparse matrix computations. In Languages and Compilers for Parallel Computing . Springer , 213\u2013229. William Pugh and Tatiana Shpeisman. 1999. SIPR: A new framework for generating efficient code for sparse matrix computations. In Languages and Compilers for Parallel Computing. Springer, 213\u2013229."},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/3133889"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/3037697.3037745"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/2480361.2371407"},{"key":"e_1_2_1_45_1","doi-asserted-by":"crossref","unstructured":"Yousef Saad. 2003. Iterative methods for sparse linear systems. SIAM. \t\t\t\t  Yousef Saad. 2003. Iterative methods for sparse linear systems. SIAM.","DOI":"10.1137\/1.9780898718003"},{"volume-title":"EvoGraph: On-the-Fly Efficient Mining of Evolving Graphs on GPU","author":"Sengupta Dipanjan","key":"e_1_2_1_46_1","unstructured":"Dipanjan Sengupta and Shuaiwen Leon Song . 2017. EvoGraph: On-the-Fly Efficient Mining of Evolving Graphs on GPU . In High Performance Computing, Julian M. Kunkel, Rio Yokota, Pavan Balaji, and David Keyes (Eds.). Springer International Publishing , Cham . 97\u2013119. isbn:978-3-319-58667-0 Dipanjan Sengupta and Shuaiwen Leon Song. 2017. EvoGraph: On-the-Fly Efficient Mining of Evolving Graphs on GPU. In High Performance Computing, Julian M. Kunkel, Rio Yokota, Pavan Balaji, and David Keyes (Eds.). Springer International Publishing, Cham. 97\u2013119. isbn:978-3-319-58667-0"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.14778\/3151113.3151122"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/2517327.2442530"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/2025113.2025153"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/2833179.2833183"},{"key":"e_1_2_1_51_1","first-page":"1530","volume-title":"SPLATT: Efficient and Parallel Sparse Tensor-Matrix Multiplication. In 2015 IEEE International Parallel and Distributed Processing Symposium (IPDPS). 61\u201370","author":"Smith Shaden","year":"2015","unstructured":"Shaden Smith , Niranjay Ravindran , Nicholas Sidiropoulos , and George Karypis . 2015 . SPLATT: Efficient and Parallel Sparse Tensor-Matrix Multiplication. In 2015 IEEE International Parallel and Distributed Processing Symposium (IPDPS). 61\u201370 . issn: 1530 - 2075 Shaden Smith, Niranjay Ravindran, Nicholas Sidiropoulos, and George Karypis. 2015. SPLATT: Efficient and Parallel Sparse Tensor-Matrix Multiplication. In 2015 IEEE International Parallel and Distributed Processing Symposium (IPDPS). 61\u201370. issn:1530-2075"},{"volume-title":"A Relational Approach to the Automatic Generation of Sequential Sparse Matrix Codes. Ph. D. Dissertation","author":"Stodghill Paul","key":"e_1_2_1_52_1","unstructured":"Paul Stodghill . 1997. A Relational Approach to the Automatic Generation of Sequential Sparse Matrix Codes. Ph. D. Dissertation . Cornell University . Paul Stodghill. 1997. A Relational Approach to the Automatic Generation of Sequential Sparse Matrix Codes. Ph. D. Dissertation. Cornell University."},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/3200691.3178509"},{"key":"#cr-split#-e_1_2_1_54_1.1","unstructured":"Ruiqin Tian Luanzheng Guo Jiajia Li Bin Ren and Gokcen Kestor. 2021. A High-Performance Sparse Tensor Algebra Compiler in Multi-Level IR. https:\/\/doi.org\/10.48550\/ARXIV.2102.05187 10.48550\/ARXIV.2102.05187"},{"key":"#cr-split#-e_1_2_1_54_1.2","doi-asserted-by":"crossref","unstructured":"Ruiqin Tian Luanzheng Guo Jiajia Li Bin Ren and Gokcen Kestor. 2021. A High-Performance Sparse Tensor Algebra Compiler in Multi-Level IR. https:\/\/doi.org\/10.48550\/ARXIV.2102.05187","DOI":"10.1109\/LLVMHPC54804.2021.00009"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/2737924.2738003"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPEC.2017.8091058"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/3168818"},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1109\/BigData.2017.8257937"}],"container-title":["Proceedings of the ACM on Programming Languages"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3563338","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3563338","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3563338","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:38:10Z","timestamp":1750178290000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3563338"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,31]]},"references-count":59,"journal-issue":{"issue":"OOPSLA2","published-print":{"date-parts":[[2022,10,31]]}},"alternative-id":["10.1145\/3563338"],"URL":"https:\/\/doi.org\/10.1145\/3563338","relation":{},"ISSN":["2475-1421"],"issn-type":[{"type":"electronic","value":"2475-1421"}],"subject":[],"published":{"date-parts":[[2022,10,31]]},"assertion":[{"value":"2022-10-31","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}