{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,30]],"date-time":"2026-01-30T08:26:26Z","timestamp":1769761586404,"version":"3.49.0"},"reference-count":31,"publisher":"Wiley","issue":"12","license":[{"start":{"date-parts":[[2022,8,24]],"date-time":"2022-08-24T00:00:00Z","timestamp":1661299200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Softw Pract Exp"],"published-print":{"date-parts":[[2022,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Recent works showed that implementations of quicksort using vector CPU instructions can outperform the non\u2010vectorized algorithms in widespread use. However, these implementations are typically single\u2010threaded, implemented for a particular instruction set, and restricted to a small set of key types. We lift these three restrictions: our proposed<jats:italic>vqsort<\/jats:italic>algorithm integrates into the state\u2010of\u2010the\u2010art parallel sorter , with a geometric mean speedup of 1.59. The same implementation works on seven instruction sets (including SVE and RISC\u2010V V) across four platforms. It also supports floating\u2010point and 16\u2013128 bit integer keys. To the best of our knowledge, this is the fastest sort for large arrays of non\u2010tuple keys on CPUs, up to 20 times as fast as the sorting algorithms implemented in standard libraries. This article focuses on the practical engineering aspects enabling the speed and portability, which we have not yet seen demonstrated for a quicksort implementation. Furthermore, we introduce compact and transpose\u2010free sorting networks for in\u2010register sorting of small arrays, and a vector\u2010friendly pivot sampling strategy that is robust against adversarial input.<\/jats:p>","DOI":"10.1002\/spe.3142","type":"journal-article","created":{"date-parts":[[2022,8,24]],"date-time":"2022-08-24T07:45:30Z","timestamp":1661327130000},"page":"2684-2699","update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Vectorized and performance\u2010portable quicksort"],"prefix":"10.1002","volume":"52","author":[{"given":"Jan","family":"Wassenberg","sequence":"first","affiliation":[{"name":"Brain Team Google Research Zurich Switzerland"}]},{"given":"Mark","family":"Blacher","sequence":"additional","affiliation":[{"name":"Theoretical Computer Science Friedrich Schiller University Jena Germany"}]},{"given":"Joachim","family":"Giesen","sequence":"additional","affiliation":[{"name":"Theoretical Computer Science Friedrich Schiller University Jena Germany"}]},{"given":"Peter","family":"Sanders","sequence":"additional","affiliation":[{"name":"Institute of Theoretical Informatics, Algorithm Engineering Karlsruhe Institute of Technology Karlsruhe Germany"}]}],"member":"311","published-online":{"date-parts":[[2022,8,24]]},"reference":[{"key":"e_1_2_10_2_1","doi-asserted-by":"crossref","unstructured":"InoueH.How SIMD width affects energy efficiency: a case study on sorting. Proceedings of the IEEE Symposium in Low\u2010Power and High\u2010Speed Chips (COOL CHIPS);2016:1\u20103; IEEE.","DOI":"10.1109\/CoolChips.2016.7503679"},{"key":"e_1_2_10_3_1","unstructured":"Valve Corporation.Steam hardware and software survey; 2021.https:\/\/store.steampowered.com\/hwsurvey\/Steam\u2010Hardware\u2010Software\u2010Survey\u2010Welcome\u2010to\u2010Steam"},{"key":"e_1_2_10_4_1","unstructured":"Personal communication ex\u2010ARM compiler engineer."},{"key":"e_1_2_10_5_1","doi-asserted-by":"publisher","DOI":"10.14778\/1454159.1454171"},{"key":"e_1_2_10_6_1","doi-asserted-by":"crossref","unstructured":"YinZ ZhangT M\u00fcllerA et al.Efficient parallel sort on AVX\u2010512\u2010based multi\u2010core and many\u2010core architectures. Proceedings of the IEEE International Conference on High Performance Computing and Communications; IEEE International Conference on Smart City; IEEE International Conference on Data Science and Systems (HPCC\/SmartCity\/DSS);2019:168\u2010176.","DOI":"10.1109\/HPCC\/SmartCity\/DSS.2019.00038"},{"key":"e_1_2_10_7_1","unstructured":"BlacherM GiesenJ KuehneL.Fast and robust vectorized in\u2010place sorting of primitive types. Proceedings of the Symposium on Experimental Algorithms (SEA);2021:3:1\u20103:16."},{"key":"e_1_2_10_8_1","doi-asserted-by":"crossref","unstructured":"ZaghaM BlellochGE.Radix sort for vector multiprocessors. Proceedings of the International Conference for High Performance Computing Networking Storage and Analysis (SC);1991:712\u2010721.","DOI":"10.1145\/125826.126164"},{"key":"e_1_2_10_9_1","doi-asserted-by":"crossref","unstructured":"WassenbergJ SandersP.Engineering a multi\u2010core radix sort. Proceedings of the International Conference on Parallel Processing (Euro\u2010Par);2011:160\u2010169.","DOI":"10.1007\/978-3-642-23397-5_16"},{"key":"e_1_2_10_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.1978.231484"},{"key":"e_1_2_10_11_1","doi-asserted-by":"publisher","DOI":"10.1016\/0167-8191(90)90074-J"},{"issue":"1","key":"e_1_2_10_12_1","first-page":"83","article-title":"Fast quicksort implementation using AVX instructions","volume":"59","author":"Gueron S","year":"2016","journal-title":"Comput J"},{"key":"e_1_2_10_13_1","doi-asserted-by":"crossref","unstructured":"BramasB.A novel hybrid quicksort algorithm vectorized using AVX\u2010512 on intel skylake. CoRR 2017; abs\/1704.08579.","DOI":"10.14569\/IJACSA.2017.081044"},{"key":"e_1_2_10_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/3505286"},{"key":"e_1_2_10_15_1","unstructured":"Google LLC.vqsort implementation; 2022.https:\/\/github.com\/google\/highway\/tree\/master\/hwy\/contrib\/sort"},{"key":"e_1_2_10_16_1","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1097-024X(19990410)29:4<341::AID-SPE237>3.0.CO;2-R"},{"key":"e_1_2_10_17_1","unstructured":"HartmannG.A numerical analysis of Quicksort: how many cases are bad cases? CoRR 2015; abs\/1507.04220."},{"key":"e_1_2_10_18_1","unstructured":"https:\/\/github.com\/numpy\/numpy\/issues\/16313."},{"key":"e_1_2_10_19_1","unstructured":"O'NeillM.Efficiently generating a number in a range; 2018.https:\/\/www.pcg\u2010random.org\/posts\/bounded\u2010rands.html"},{"key":"e_1_2_10_20_1","unstructured":"https:\/\/bugs.llvm.org\/show_bug.cgi?id=20837"},{"key":"e_1_2_10_21_1","doi-asserted-by":"crossref","unstructured":"DaoudAM Abdel\u2010jaberH AbabnehJ.Efficient non\u2010quadratic quick sort (NQQuickSort). Proceedings of the Conference on Digital Enterprise and Information Systems \u2010 International (DEIS);2011:667\u2010675.","DOI":"10.1007\/978-3-642-22603-8_58"},{"key":"e_1_2_10_22_1","unstructured":"WassenbergJ ObrykR AlakuijalaJ MogenetE.Randen \u2010 fast backtracking\u2010resistant random generator with AES+Feistel+Reverie. CoRR 2018; abs\/1810.02227."},{"key":"e_1_2_10_23_1","unstructured":"AlexandrescuA.Fast deterministic selection. Proceedings of the Symposium on Experimental Algorithms;2017:24:1\u201024:19."},{"key":"e_1_2_10_24_1","doi-asserted-by":"publisher","DOI":"10.1002\/spe.2922"},{"key":"e_1_2_10_25_1","unstructured":"Advanced Micro Devices Inc.AMD64 architecture programmer's manual; 2021.https:\/\/www.amd.com\/system\/files\/TechDocs\/40332.pdf"},{"key":"e_1_2_10_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2018.2789903"},{"key":"e_1_2_10_27_1","unstructured":"BatcherKE.Sorting networks and their applications. Proceedings of the Spring Joint Computer Conference;1968:307\u2010314."},{"key":"e_1_2_10_28_1","volume-title":"The Art of Computer Programming. Volume 3. Sorting and Searching","author":"Knuth DE","year":"1998"},{"key":"e_1_2_10_29_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcss.2016.04.004"},{"key":"e_1_2_10_30_1","doi-asserted-by":"crossref","unstructured":"LimayeA AdegbijaT.A workload characterization of the SPEC CPU2017 benchmark suite. Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS);2018:149\u2010158.","DOI":"10.1109\/ISPASS.2018.00028"},{"key":"e_1_2_10_31_1","doi-asserted-by":"crossref","unstructured":"KushagraS L\u00f3pez\u2010OrtizA QiaoA MunroJI.Multi\u2010pivot quicksort: theory and experiments. Proceedings of the Workshop on Algorithm Engineering and Experiments (ALENEX);2014:47\u201060.","DOI":"10.1137\/1.9781611973198.6"},{"key":"e_1_2_10_32_1","unstructured":"damageboy. This goes to eleven.https:\/\/bits.houmus.org\/2020\u201001\u201028\/this\u2010goes\u2010to\u2010eleven\u2010pt1"}],"container-title":["Software: Practice and Experience"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/spe.3142","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/full-xml\/10.1002\/spe.3142","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/spe.3142","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,2]],"date-time":"2024-10-02T12:19:37Z","timestamp":1727871577000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/spe.3142"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,8,24]]},"references-count":31,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2022,12]]}},"alternative-id":["10.1002\/spe.3142"],"URL":"https:\/\/doi.org\/10.1002\/spe.3142","archive":["Portico"],"relation":{},"ISSN":["0038-0644","1097-024X"],"issn-type":[{"value":"0038-0644","type":"print"},{"value":"1097-024X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,8,24]]},"assertion":[{"value":"2022-05-12","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-07-31","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-08-24","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}