{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,25]],"date-time":"2026-04-25T08:43:43Z","timestamp":1777106623585,"version":"3.51.4"},"reference-count":100,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2017,8,31]],"date-time":"2017-08-31T00:00:00Z","timestamp":1504137600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100008077","name":"EMC","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100008077","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100006785","name":"Google","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100006785","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100016822","name":"Seagate","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100016822","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100005801","name":"Facebook","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100005801","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100003816","name":"Huawei Technologies","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100003816","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100016299","name":"NetApp","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100016299","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Veritas"},{"DOI":"10.13039\/100006112","name":"Microsoft","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100006112","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["CNS-1419199, CNS-1421033, CNS-1319405, CNS- 1218405"],"award-info":[{"award-number":["CNS-1419199, CNS-1421033, CNS-1319405, CNS- 
1218405"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000015","name":"U.S. Department of Energy","doi-asserted-by":"publisher","award":["DE-SC0014935"],"award-info":[{"award-number":["DE-SC0014935"]}],"id":[{"id":"10.13039\/100000015","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100004358","name":"Samsung","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100004358","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100016682","name":"VMWare","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100016682","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Storage"],"published-print":{"date-parts":[[2017,8,31]]},"abstract":"<jats:p>We analyze how modern distributed storage systems behave in the presence of file-system faults such as data corruption and read and write errors. We characterize eight popular distributed storage systems and uncover numerous problems related to file-system fault tolerance. We find that modern distributed systems do not consistently use redundancy to recover from file-system faults: a single file-system fault can cause catastrophic outcomes such as data loss, corruption, and unavailability. We also find that the above outcomes arise due to fundamental problems in file-system fault handling that are common across many systems. 
Our results have implications for the design of next-generation fault-tolerant distributed and cloud storage systems.<\/jats:p>","DOI":"10.1145\/3125497","type":"journal-article","created":{"date-parts":[[2017,9,29]],"date-time":"2017-09-29T12:44:38Z","timestamp":1506689078000},"page":"1-33","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":23,"title":["Redundancy Does Not Imply Fault Tolerance"],"prefix":"10.1145","volume":"13","author":[{"given":"Aishwarya","family":"Ganesan","sequence":"first","affiliation":[{"name":"University of Wisconsin\u2014Madison, Madison, WI"}]},{"given":"Ramnatthan","family":"Alagappan","sequence":"additional","affiliation":[{"name":"University of Wisconsin\u2014Madison, Madison, WI"}]},{"given":"Andrea C.","family":"Arpaci-Dusseau","sequence":"additional","affiliation":[{"name":"University of Wisconsin\u2014Madison, Madison, WI"}]},{"given":"Remzi H.","family":"Arpaci-Dusseau","sequence":"additional","affiliation":[{"name":"University of Wisconsin\u2014Madison, Madison, WI"}]}],"member":"320","published-online":{"date-parts":[[2017,9,28]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"Cords Tool and Results. 2017. Retrieved from http:\/\/research.cs.wisc.edu\/adsl\/Software\/cords\/.  Cords Tool and Results. 2017. Retrieved from http:\/\/research.cs.wisc.edu\/adsl\/Software\/cords\/."},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the 15th USENIX Conference on Hot Topics in Operating Systems (HOTOS\u201915)","author":"Alagappan Ramnatthan","unstructured":"Ramnatthan Alagappan , Vijay Chidambaram , Thanumalayan Sankaranarayana Pillai , Aws Albarghouthi , Andrea C. Arpaci-Dusseau , and Remzi H . Arpaci-Dusseau. 2015. Beyond storage APIs: Provable semantics for storage stacks . In Proceedings of the 15th USENIX Conference on Hot Topics in Operating Systems (HOTOS\u201915) . 
Ramnatthan Alagappan, Vijay Chidambaram, Thanumalayan Sankaranarayana Pillai, Aws Albarghouthi, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. 2015. Beyond storage APIs: Provable semantics for storage stacks. In Proceedings of the 15th USENIX Conference on Hot Topics in Operating Systems (HOTOS\u201915)."},{"key":"e_1_2_1_3_1","volume-title":"Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI\u201916)","author":"Alagappan Ramnatthan","unstructured":"Ramnatthan Alagappan , Aishwarya Ganesan , Yuvraj Patel , Thanumalayan Sankaranarayana Pillai , Andrea C. Arpaci-Dusseau , and Remzi H . Arpaci-Dusseau. 2016. Correlated crash vulnerabilities . In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI\u201916) . Ramnatthan Alagappan, Aishwarya Ganesan, Yuvraj Patel, Thanumalayan Sankaranarayana Pillai, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. 2016. Correlated crash vulnerabilities. In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI\u201916)."},{"key":"e_1_2_1_4_1","unstructured":"Apache. Cassandra. Retrieved from http:\/\/cassandra.apache.org\/.  Apache. Cassandra. Retrieved from http:\/\/cassandra.apache.org\/."},{"key":"e_1_2_1_5_1","unstructured":"Apache. Kakfa. Retrieved from http:\/\/kafka.apache.org\/.  Apache. Kakfa. Retrieved from http:\/\/kafka.apache.org\/."},{"key":"e_1_2_1_6_1","unstructured":"Apache. ZooKeeper. Retrieved from https:\/\/zookeeper.apache.org\/.  Apache. ZooKeeper. Retrieved from https:\/\/zookeeper.apache.org\/."},{"key":"e_1_2_1_7_1","volume-title":"Arpaci-Dusseau","author":"Arpaci-Dusseau Remzi H.","year":"2015","unstructured":"Remzi H. Arpaci-Dusseau and Andrea C . Arpaci-Dusseau . 2015 . Operating Systems : Three Easy Pieces (0.91 ed.). Arpaci-Dusseau Books . Remzi H. Arpaci-Dusseau and Andrea C. Arpaci-Dusseau. 2015. Operating Systems: Three Easy Pieces (0.91 ed.). 
Arpaci-Dusseau Books."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/1416944.1416947"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1254882.1254917"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/DSN.2008.4630121"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/12.54853"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/502034.502042"},{"key":"e_1_2_1_14_1","unstructured":"CockroachDB. CockroachDB. Retrieved from https:\/\/www.cockroachlabs.com\/.  CockroachDB. CockroachDB. Retrieved from https:\/\/www.cockroachlabs.com\/."},{"key":"e_1_2_1_15_1","unstructured":"CockroachDB. Disk corruptions and read\/write error handling in CockroachDB. Retrieved from https:\/\/forum.cockroachlabs.com\/t\/disk-corruptions-and-read-write-error-handling-in-cockroachdb\/258.  CockroachDB. Disk corruptions and read\/write error handling in CockroachDB. Retrieved from https:\/\/forum.cockroachlabs.com\/t\/disk-corruptions-and-read-write-error-handling-in-cockroachdb\/258."},{"key":"e_1_2_1_16_1","unstructured":"CockroachDB. Resiliency to disk corruption and storage errors. Retrieved from https:\/\/github.com\/cockroachdb\/cockroach\/issues\/7882.  CockroachDB. Resiliency to disk corruption and storage errors. Retrieved from https:\/\/github.com\/cockroachdb\/cockroach\/issues\/7882."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.5555\/2342821.2342862"},{"key":"e_1_2_1_18_1","unstructured":"Data Center Knowledge. Ma.gnolia data is gone for good. Retrieved from http:\/\/www.datacenterknowledge.com\/archives\/2009\/02\/19\/magnolia-data-is-gone-for-good\/.  Data Center Knowledge. Ma.gnolia data is gone for good. Retrieved from http:\/\/www.datacenterknowledge.com\/archives\/2009\/02\/19\/magnolia-data-is-gone-for-good\/."},{"key":"e_1_2_1_19_1","unstructured":"Datastax. Netflix Cassandra Use Case. 
Retrieved from http:\/\/www.datastax.com\/resources\/casestudies\/netflix.  Datastax. Netflix Cassandra Use Case. Retrieved from http:\/\/www.datastax.com\/resources\/casestudies\/netflix."},{"key":"e_1_2_1_20_1","unstructured":"DataStax. Read Repair: Repair during Read Path. Retrieved from http:\/\/docs.datastax.com\/en\/cassandra\/3.0\/cassandra\/operations\/opsRepairNodesReadRepair.html.  DataStax. Read Repair: Repair during Read Path. Retrieved from http:\/\/docs.datastax.com\/en\/cassandra\/3.0\/cassandra\/operations\/opsRepairNodesReadRepair.html."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDS.1996.540200"},{"key":"e_1_2_1_22_1","unstructured":"Jeff Dean. Building Large-Scale Internet Services. Retrieved from http:\/\/static.googleusercontent.com\/media\/research.google.com\/en\/\/people\/jeff\/SOCC2010-keynote-slides.pdf.  Jeff Dean. Building Large-Scale Internet Services. Retrieved from http:\/\/static.googleusercontent.com\/media\/research.google.com\/en\/\/people\/jeff\/SOCC2010-keynote-slides.pdf."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/1294261.1294281"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/1516046.1516059"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.2172\/1081941"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/2675113"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.5555\/2208461.2208468"},{"key":"e_1_2_1_28_1","unstructured":"FUSE. Linux FUSE (Filesystem in Userspace) interface. Retrieved from https:\/\/github.com\/libfuse\/libfuse.  FUSE. Linux FUSE (Filesystem in Userspace) interface. Retrieved from https:\/\/github.com\/libfuse\/libfuse."},{"key":"e_1_2_1_29_1","volume-title":"Proceedings of the 15th USENIX Symposium on File and Storage Technologies (FAST\u201917)","author":"Ganesan Aishwarya","unstructured":"Aishwarya Ganesan , Ramnatthan Alagappan , Andrea C. Arpaci-Dusseau , and Remzi H . Arpaci-Dusseau. 
2017. Redundancy does not imply fault tolerance: Analysis of distributed storage reactions to single errors and corruptions . In Proceedings of the 15th USENIX Symposium on File and Storage Technologies (FAST\u201917) . 149--166. Aishwarya Ganesan, Ramnatthan Alagappan, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. 2017. Redundancy does not imply fault tolerance: Analysis of distributed storage reactions to single errors and corruptions. In Proceedings of the 15th USENIX Symposium on File and Storage Technologies (FAST\u201917). 149--166."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/945445.945450"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/2018436.2018477"},{"key":"e_1_2_1_33_1","volume-title":"Proceedings of the International Conference on Dependable Systems and Networks (DSN\u201903)","author":"Gu Weining","year":"2003","unstructured":"Weining Gu , Z. Kalbarczyk , Ravishankar K. Iyer , and Zhenyu Yang . 2003 . Characterization of linux kernel behavior under errors . In Proceedings of the International Conference on Dependable Systems and Networks (DSN\u201903) . Weining Gu, Z. Kalbarczyk, Ravishankar K. Iyer, and Zhenyu Yang. 2003. Characterization of linux kernel behavior under errors. In Proceedings of the International Conference on Dependable Systems and Networks (DSN\u201903)."},{"key":"e_1_2_1_34_1","volume-title":"Proceedings of the ACM Symposium on Cloud Computing (SOCC\u201914)","author":"Gunawi Haryadi S.","unstructured":"Haryadi S. Gunawi , Mingzhe Hao , Tanakorn Leesatapornwongsa , Tiratat Patana-anake, Thanh Do , Jeffry Adityatama , Kurnia J. Eliazar , Agung Laksono , Jeffrey F. Lukman , Vincentius Martin , and Anang D. Satria . 2014. What bugs live in the cloud? A study of 3000+ issues in cloud systems . In Proceedings of the ACM Symposium on Cloud Computing (SOCC\u201914) . Haryadi S. 
Gunawi, Mingzhe Hao, Tanakorn Leesatapornwongsa, Tiratat Patana-anake, Thanh Do, Jeffry Adityatama, Kurnia J. Eliazar, Agung Laksono, Jeffrey F. Lukman, Vincentius Martin, and Anang D. Satria. 2014. What bugs live in the cloud? A study of 3000+ issues in cloud systems. In Proceedings of the ACM Symposium on Cloud Computing (SOCC\u201914)."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/2043556.2043582"},{"key":"e_1_2_1_36_1","volume-title":"Proceedings of the 21st Annual Large Installation System Administration Conference (LISA\u201907)","author":"James","unstructured":"James R. Hamilton and others. 2007. On designing and deploying internet-scale services . In Proceedings of the 21st Annual Large Installation System Administration Conference (LISA\u201907) . James R. Hamilton and others. 2007. On designing and deploying internet-scale services. In Proceedings of the 21st Annual Large Installation System Administration Conference (LISA\u201907)."},{"key":"e_1_2_1_37_1","volume-title":"Proceedings of the International Computer Performance and Dependability Symposium (IPDS\u201995)","author":"Han Seungjae","unstructured":"Seungjae Han , Kang G. Shin , and Harold A. Rosenberg . 1995. DOCTOR: An integrated software fault injection environment for distributed real-time systems . In Proceedings of the International Computer Performance and Dependability Symposium (IPDS\u201995) . Seungjae Han, Kang G. Shin, and Harold A. Rosenberg. 1995. DOCTOR: An integrated software fault injection environment for distributed real-time systems. In Proceedings of the International Computer Performance and Dependability Symposium (IPDS\u201995)."},{"key":"e_1_2_1_38_1","unstructured":"James Myers. Data Integrity in Solid State Drives. Retrieved from http:\/\/intel.ly\/2cF0dTT.  James Myers. Data Integrity in Solid State Drives. Retrieved from http:\/\/intel.ly\/2cF0dTT."},{"key":"e_1_2_1_39_1","unstructured":"Jerome Verstrynge. Timestamps in Cassandra. 
Retrieved from http:\/\/docs.oracle.com\/cd\/B12037_01\/server.101\/b10726\/apphard.htm.  Jerome Verstrynge. Timestamps in Cassandra. Retrieved from http:\/\/docs.oracle.com\/cd\/B12037_01\/server.101\/b10726\/apphard.htm."},{"key":"e_1_2_1_40_1","unstructured":"Kafka. Data corruption or EIO leads to data loss. https:\/\/issues.apache.org\/jira\/browse\/KAFKA-4009.  Kafka. Data corruption or EIO leads to data loss. https:\/\/issues.apache.org\/jira\/browse\/KAFKA-4009."},{"key":"e_1_2_1_41_1","volume-title":"Proceedings of the 3rd USENIX Symposium on File and Storage Technologies (FAST\u201904)","author":"Keeton Kimberley","year":"2004","unstructured":"Kimberley Keeton , Cipriano Santos , Dirk Beyer , Jeffrey Chase , and John Wilkes . 2004 . Designing for disasters . In Proceedings of the 3rd USENIX Symposium on File and Storage Technologies (FAST\u201904) . Kimberley Keeton, Cipriano Santos, Dirk Beyer, Jeffrey Chase, and John Wilkes. 2004. Designing for disasters. In Proceedings of the 3rd USENIX Symposium on File and Storage Technologies (FAST\u201904)."},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/378993.379239"},{"key":"e_1_2_1_43_1","unstructured":"Kyle Kingsbury. Jepsen. Retrieved from http:\/\/jepsen.io\/.  Kyle Kingsbury. Jepsen. Retrieved from http:\/\/jepsen.io\/."},{"key":"e_1_2_1_44_1","volume-title":"Proceedings of the 11th Symposium on Operating Systems Design and Implementation (OSDI\u201914)","author":"Leesatapornwongsa Tanakorn","unstructured":"Tanakorn Leesatapornwongsa , Mingzhe Hao , Pallavi Joshi , Jeffrey F. Lukman , and Haryadi S. Gunawi . 2014. SAMC: Semantic-aware model checking for fast discovery of deep bugs in cloud systems . In Proceedings of the 11th Symposium on Operating Systems Design and Implementation (OSDI\u201914) . Tanakorn Leesatapornwongsa, Mingzhe Hao, Pallavi Joshi, Jeffrey F. Lukman, and Haryadi S. Gunawi. 2014. SAMC: Semantic-aware model checking for fast discovery of deep bugs in cloud systems. 
In Proceedings of the 11th Symposium on Operating Systems Design and Implementation (OSDI\u201914)."},{"key":"e_1_2_1_45_1","volume-title":"Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI\u201916)","author":"Liu Shengyun","year":"2016","unstructured":"Shengyun Liu , Paolo Viotti , Christian Cachin , Vivien Qu\u00e9ma , and Marko Vukolic . 2016 . XFT: Practical fault tolerance beyond crashes . In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI\u201916) . Shengyun Liu, Paolo Viotti, Christian Cachin, Vivien Qu\u00e9ma, and Marko Vukolic. 2016. XFT: Practical fault tolerance beyond crashes. In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI\u201916)."},{"key":"e_1_2_1_46_1","unstructured":"LogCabin. LogCabin. Retrieved from https:\/\/github.com\/logcabin\/logcabin.  LogCabin. LogCabin. Retrieved from https:\/\/github.com\/logcabin\/logcabin."},{"key":"e_1_2_1_47_1","unstructured":"LogCabin. Reaction to disk errors and corruptions. Retrieved from https:\/\/groups.google.com\/forum\/#&excl;topic\/logcabin-dev\/wqNcdj0IHe4.  LogCabin. Reaction to disk errors and corruptions. Retrieved from https:\/\/groups.google.com\/forum\/#&excl;topic\/logcabin-dev\/wqNcdj0IHe4."},{"key":"e_1_2_1_48_1","unstructured":"Mark Adler. Adler32 Collisions. Retrieved from http:\/\/stackoverflow.com\/questions\/13455067\/horrific-collisions-of-adler32-hash.  Mark Adler. Adler32 Collisions. Retrieved from http:\/\/stackoverflow.com\/questions\/13455067\/horrific-collisions-of-adler32-hash."},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/2745844.2745848"},{"key":"e_1_2_1_50_1","volume-title":"Proceedings of the International Conference on Dependable Systems and Networks (DSN\u201908)","author":"Mi Ningfang","unstructured":"Ningfang Mi , A. Riska , E. Smirni , and E. Riedel . 2008. 
Enhancing data availability in disk drives through background activities . In Proceedings of the International Conference on Dependable Systems and Networks (DSN\u201908) , Anchorage, Alaska. Ningfang Mi, A. Riska, E. Smirni, and E. Riedel. 2008. Enhancing data availability in disk drives through background activities. In Proceedings of the International Conference on Dependable Systems and Networks (DSN\u201908), Anchorage, Alaska."},{"key":"e_1_2_1_51_1","unstructured":"Michael Rubin. Google moves from ext2 to ext4. Retrieved from http:\/\/lists.openwall.net\/linux-ext4\/2010\/01\/04\/8.  Michael Rubin. Google moves from ext2 to ext4. Retrieved from http:\/\/lists.openwall.net\/linux-ext4\/2010\/01\/04\/8."},{"key":"e_1_2_1_52_1","unstructured":"MongoDB. MongoDB. Retrieved from https:\/\/www.mongodb.org\/.  MongoDB. MongoDB. Retrieved from https:\/\/www.mongodb.org\/."},{"key":"e_1_2_1_53_1","unstructured":"MongoDB. MongoDB at eBay. Retrieved from https:\/\/www.mongodb.com\/presentations\/mongodb-ebay.  MongoDB. MongoDB at eBay. Retrieved from https:\/\/www.mongodb.com\/presentations\/mongodb-ebay."},{"key":"e_1_2_1_54_1","unstructured":"MongoDB. MongoDB WiredTiger. Retrieved from https:\/\/docs.mongodb.org\/manual\/core\/wiredtiger\/.  MongoDB. MongoDB WiredTiger. Retrieved from https:\/\/docs.mongodb.org\/manual\/core\/wiredtiger\/."},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/2928275.2928278"},{"key":"e_1_2_1_56_1","unstructured":"Netflix. Cassandra at Netflix. Retrieved from http:\/\/techblog.netflix.com\/2011\/11\/benchmarking-cassandra-scalability-on.html.  Netflix. Cassandra at Netflix. Retrieved from http:\/\/techblog.netflix.com\/2011\/11\/benchmarking-cassandra-scalability-on.html."},{"key":"e_1_2_1_57_1","unstructured":"Oracle. Fusion-IO Data Integrity. Retrieved from https:\/\/blogs.oracle.com\/linux\/entry\/fusion_io_showcases_data_integrity.  Oracle. Fusion-IO Data Integrity. 
Retrieved from https:\/\/blogs.oracle.com\/linux\/entry\/fusion_io_showcases_data_integrity."},{"key":"e_1_2_1_58_1","unstructured":"Oracle. Preventing Data Corruptions with HARD. Retrieved from http:\/\/docs.oracle.com\/cd\/B12037_01\/server.101\/b10726\/apphard.htm.  Oracle. Preventing Data Corruptions with HARD. Retrieved from http:\/\/docs.oracle.com\/cd\/B12037_01\/server.101\/b10726\/apphard.htm."},{"key":"e_1_2_1_59_1","article-title":"The log-structured merge-tree (LSM-tree)","volume":"33","author":"Neil Patrick","year":"1996","unstructured":"Patrick O Neil , Edward Cheng , Dieter Gawlick , and Elizabeth O Neil . 1996 . The log-structured merge-tree (LSM-tree) . Acta Inform. 33 , 4 (1996). Patrick ONeil, Edward Cheng, Dieter Gawlick, and Elizabeth ONeil. 1996. The log-structured merge-tree (LSM-tree). Acta Inform. 33, 4 (1996).","journal-title":"Acta Inform."},{"key":"e_1_2_1_60_1","volume-title":"Data integrity. CERN\/IT","author":"Panzer-Steindel Bernd","year":"2007","unstructured":"Bernd Panzer-Steindel . 2007. Data integrity. CERN\/IT ( 2007 ). Bernd Panzer-Steindel. 2007. Data integrity. CERN\/IT (2007)."},{"key":"e_1_2_1_61_1","volume-title":"Proceedings of the 11th Symposium on Operating Systems Design and Implementation (OSDI\u201914)","author":"Pillai Thanumalayan Sankaranarayana","unstructured":"Thanumalayan Sankaranarayana Pillai , Vijay Chidambaram , Ramnatthan Alagappan , Samer Al-Kiswany , Andrea C. Arpaci-Dusseau , and Remzi H . Arpaci-Dusseau. 2014. All file systems are not created equal: On the complexity of crafting crash-consistent applications . In Proceedings of the 11th Symposium on Operating Systems Design and Implementation (OSDI\u201914) . Thanumalayan Sankaranarayana Pillai, Vijay Chidambaram, Ramnatthan Alagappan, Samer Al-Kiswany, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. 2014. All file systems are not created equal: On the complexity of crafting crash-consistent applications. 
In Proceedings of the 11th Symposium on Operating Systems Design and Implementation (OSDI\u201914)."},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1109\/DSN.2005.65"},{"key":"e_1_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.1145\/1095810.1095830"},{"key":"e_1_2_1_64_1","unstructured":"Rahul Bhartia. MongoDB on AWS Guidelines and Best Practices. Retrieved from http:\/\/media.amazonwebservices.com\/AWS_NoSQL_MongoDB.pdf.  Rahul Bhartia. MongoDB on AWS Guidelines and Best Practices. Retrieved from http:\/\/media.amazonwebservices.com\/AWS_NoSQL_MongoDB.pdf."},{"key":"e_1_2_1_65_1","unstructured":"Redis. Instagram Architecture. Retrieved from http:\/\/highscalability.com\/blog\/2012\/4\/9\/the-instagram-architecture-facebook-bought-for-a-cool-billio.html.  Redis. Instagram Architecture. Retrieved from http:\/\/highscalability.com\/blog\/2012\/4\/9\/the-instagram-architecture-facebook-bought-for-a-cool-billio.html."},{"key":"e_1_2_1_66_1","unstructured":"Redis. Redis. Retrieved from http:\/\/redis.io\/.  Redis. Redis. Retrieved from http:\/\/redis.io\/."},{"key":"e_1_2_1_67_1","unstructured":"Redis. Redis at Flickr. Retrieved from http:\/\/code.flickr.net\/2014\/07\/31\/redis-sentinel-at-flickr\/.  Redis. Redis at Flickr. Retrieved from http:\/\/code.flickr.net\/2014\/07\/31\/redis-sentinel-at-flickr\/."},{"key":"e_1_2_1_68_1","unstructured":"Redis. Silent data corruption in Redis. Retrieved from https:\/\/github.com\/antirez\/redis\/issues\/3730.  Redis. Silent data corruption in Redis. Retrieved from https:\/\/github.com\/antirez\/redis\/issues\/3730."},{"key":"e_1_2_1_69_1","unstructured":"RethinkDB. Integrity of read results. Retrieved from https:\/\/github.com\/rethinkdb\/rethinkdb\/issues\/5925.  RethinkDB. Integrity of read results. Retrieved from https:\/\/github.com\/rethinkdb\/rethinkdb\/issues\/5925."},{"key":"e_1_2_1_70_1","unstructured":"RethinkDB. RethinkDB. Retrieved from https:\/\/www.rethinkdb.com\/.  RethinkDB. RethinkDB. 
Retrieved from https:\/\/www.rethinkdb.com\/."},{"key":"e_1_2_1_71_1","unstructured":"RethinkDB. RethinkDB Data Storage. Retrieved from https:\/\/www.rethinkdb.com\/docs\/architecture\/#data-storage.  RethinkDB. RethinkDB Data Storage. Retrieved from https:\/\/www.rethinkdb.com\/docs\/architecture\/#data-storage."},{"key":"e_1_2_1_72_1","unstructured":"RethinkDB. RethinkDB Doc Issues. Retrieved from https:\/\/github.com\/rethinkdb\/docs\/issues\/1167.  RethinkDB. RethinkDB Doc Issues. Retrieved from https:\/\/github.com\/rethinkdb\/docs\/issues\/1167."},{"key":"e_1_2_1_73_1","unstructured":"RethinkDB. RethinkDB Faq. Retrieved from https:\/\/www.rethinkdb.com\/faq\/.  RethinkDB. RethinkDB Faq. Retrieved from https:\/\/www.rethinkdb.com\/faq\/."},{"key":"e_1_2_1_74_1","unstructured":"RethinkDB. Silent data loss on metablock corruptions. Retrieved from https:\/\/github.com\/rethinkdb\/rethinkdb\/issues\/6034.  RethinkDB. Silent data loss on metablock corruptions. Retrieved from https:\/\/github.com\/rethinkdb\/rethinkdb\/issues\/6034."},{"key":"e_1_2_1_75_1","volume-title":"Introducing CloudLab: Scientific infrastructure for advancing cloud architectures and applications. USENIX ;login: 39, 6","author":"Ricci Robert","year":"2014","unstructured":"Robert Ricci , Eric Eide , and CloudLab Team . 2014. Introducing CloudLab: Scientific infrastructure for advancing cloud architectures and applications. USENIX ;login: 39, 6 ( 2014 ). Robert Ricci, Eric Eide, and CloudLab Team. 2014. Introducing CloudLab: Scientific infrastructure for advancing cloud architectures and applications. USENIX ;login: 39, 6 (2014)."},{"key":"e_1_2_1_76_1","unstructured":"Robert Harris. Data corruption is worse than you know. Retrieved from http:\/\/www.zdnet.com\/article\/data-corruption-is-worse-than-you-know\/.  Robert Harris. Data corruption is worse than you know. 
Retrieved from http:\/\/www.zdnet.com\/article\/data-corruption-is-worse-than-you-know\/."},{"key":"e_1_2_1_77_1","unstructured":"Ron Kuris. Cassandra From tarball to production. Retrieved from http:\/\/www.slideshare.net\/planetcassandra\/cassandra-from-tarball-to-production-2.  Ron Kuris. Cassandra From tarball to production. Retrieved from http:\/\/www.slideshare.net\/planetcassandra\/cassandra-from-tarball-to-production-2."},{"key":"e_1_2_1_78_1","doi-asserted-by":"publisher","DOI":"10.1145\/146941.146943"},{"key":"e_1_2_1_79_1","doi-asserted-by":"publisher","DOI":"10.1145\/357401.357402"},{"key":"e_1_2_1_80_1","doi-asserted-by":"publisher","DOI":"10.1145\/1837915.1837917"},{"key":"e_1_2_1_81_1","volume-title":"Proceedings of the 5th USENIX Symposium on File and Storage Technologies (FAST\u201907)","author":"Schroeder Bianca","unstructured":"Bianca Schroeder and Garth A. Gibson . 2007. Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you? In Proceedings of the 5th USENIX Symposium on File and Storage Technologies (FAST\u201907) . Bianca Schroeder and Garth A. Gibson. 2007. Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you? In Proceedings of the 5th USENIX Symposium on File and Storage Technologies (FAST\u201907)."},{"key":"e_1_2_1_82_1","volume-title":"Proceedings of the 14th USENIX Conference on File and Storage Technologies (FAST\u201916)","author":"Schroeder Bianca","year":"2016","unstructured":"Bianca Schroeder , Raghav Lagisetty , and Arif Merchant . 2016 . Flash reliability in production: The expected and the unexpected . In Proceedings of the 14th USENIX Conference on File and Storage Technologies (FAST\u201916) . Bianca Schroeder, Raghav Lagisetty, and Arif Merchant. 2016. Flash reliability in production: The expected and the unexpected. 
In Proceedings of the 14th USENIX Conference on File and Storage Technologies (FAST\u201916)."},{"key":"e_1_2_1_83_1","doi-asserted-by":"publisher","DOI":"10.1109\/FTCS.1993.627311"},{"key":"e_1_2_1_84_1","doi-asserted-by":"publisher","DOI":"10.1145\/1103780.1103784"},{"key":"e_1_2_1_85_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1019175717085"},{"key":"e_1_2_1_86_1","doi-asserted-by":"publisher","DOI":"10.1145\/2694344.2694348"},{"key":"e_1_2_1_87_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDS.2000.839467"},{"key":"e_1_2_1_88_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2010.5447821"},{"key":"e_1_2_1_89_1","doi-asserted-by":"publisher","DOI":"10.1145\/945445.945466"},{"key":"e_1_2_1_90_1","doi-asserted-by":"publisher","DOI":"10.1007\/BFb0024305"},{"key":"e_1_2_1_91_1","unstructured":"Twitter. Kafka at Twitter. Retrieved from https:\/\/blog.twitter.com\/2015\/handling-five-billion-sessions-a-day-in-real-time.  Twitter. Kafka at Twitter. Retrieved from https:\/\/blog.twitter.com\/2015\/handling-five-billion-sessions-a-day-in-real-time."},{"key":"e_1_2_1_92_1","unstructured":"Uber. The Uber Engineering Tech Stack Part I: The Foundation. Retrieved from https:\/\/eng.uber.com\/tech-stack-part-one\/.  Uber. The Uber Engineering Tech Stack Part I: The Foundation. Retrieved from https:\/\/eng.uber.com\/tech-stack-part-one\/."},{"key":"e_1_2_1_93_1","volume-title":"The Uber Engineering Tech Stack","unstructured":"Uber. The Uber Engineering Tech Stack , Part II: The Edge And Beyond. Retrieved from https:\/\/eng.uber.com\/tech-stack-part-two\/. Uber. The Uber Engineering Tech Stack, Part II: The Edge And Beyond. Retrieved from https:\/\/eng.uber.com\/tech-stack-part-two\/."},{"key":"e_1_2_1_94_1","unstructured":"Voldemort. Project Voldemort. http:\/\/www.project-voldemort.com\/voldemort\/.  Voldemort. Project Voldemort. 
http:\/\/www.project-voldemort.com\/voldemort\/."},{"key":"e_1_2_1_95_1","volume-title":"Proceedings of the 10th Symposium on Networked Systems Design and Implementation (NSDI\u201913)","author":"Wang Yang","year":"2013","unstructured":"Yang Wang , Manos Kapritsos , Zuocheng Ren , Prince Mahajan , Jeevitha Kirubanandam , Lorenzo Alvisi , and Mike Dahlin . 2013 . Robustness in the salus scalable block store . In Proceedings of the 10th Symposium on Networked Systems Design and Implementation (NSDI\u201913) . Yang Wang, Manos Kapritsos, Zuocheng Ren, Prince Mahajan, Jeevitha Kirubanandam, Lorenzo Alvisi, and Mike Dahlin. 2013. Robustness in the salus scalable block store. In Proceedings of the 10th Symposium on Networked Systems Design and Implementation (NSDI\u201913)."},{"key":"e_1_2_1_96_1","volume-title":"Proceedings of the 6th Symposium on Networked Systems Design and Implementation (NSDI\u201909)","author":"Yang Junfeng","year":"2009","unstructured":"Junfeng Yang , Tisheng Chen , Ming Wu , Zhilei Xu , Xuezheng Liu , Haoxiang Lin , Mao Yang , Fan Long , Lintao Zhang , and Lidong Zhou . 2009 . MODIST: Transparent model checking of unmodified distributed systems . In Proceedings of the 6th Symposium on Networked Systems Design and Implementation (NSDI\u201909) . Junfeng Yang, Tisheng Chen, Ming Wu, Zhilei Xu, Xuezheng Liu, Haoxiang Lin, Mao Yang, Fan Long, Lintao Zhang, and Lidong Zhou. 2009. MODIST: Transparent model checking of unmodified distributed systems. In Proceedings of the 6th Symposium on Networked Systems Design and Implementation (NSDI\u201909)."},{"key":"e_1_2_1_97_1","volume-title":"Proceedings of the 11th Symposium on Operating Systems Design and Implementation (OSDI\u201914)","author":"Yuan Ding","year":"2014","unstructured":"Ding Yuan , Yu Luo , Xin Zhuang , Guilherme Renna Rodrigues , Xu Zhao , Yongle Zhang , Pranay U. Jain , and Michael Stumm . 2014 . 
Simple testing can prevent most critical failures: An analysis of production failures in distributed data-intensive systems. In Proceedings of the 11th Symposium on Operating Systems Design and Implementation (OSDI\u201914)."},{"key":"e_1_2_1_98_1","volume-title":"Proceedings of the 12th USENIX Symposium on File and Storage Technologies (FAST\u201914)","author":"Zhang Yupu","year":"2014","unstructured":"Yupu Zhang, Chris Dragga, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau. 2014. ViewBox: Integrating local file systems with cloud storage services. In Proceedings of the 12th USENIX Symposium on File and Storage Technologies (FAST\u201914)."},{"key":"e_1_2_1_99_1","volume-title":"Proceedings of the 8th USENIX Symposium on File and Storage Technologies (FAST\u201910)","author":"Zhang Yupu","unstructured":"Yupu Zhang, Abhishek Rajimwale, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. 2010. End-to-end data integrity for file systems: A ZFS case study. In Proceedings of the 8th USENIX Symposium on File and Storage Technologies (FAST\u201910)."},{"key":"e_1_2_1_100_1","unstructured":"ZooKeeper. 
Cluster unavailable on space and write errors. Retrieved from https:\/\/issues.apache.org\/jira\/browse\/ZOOKEEPER-2495."},{"key":"e_1_2_1_101_1","unstructured":"ZooKeeper. Crash on detecting a corruption. Retrieved from http:\/\/mail-archives.apache.org\/mod_mbox\/zookeeper-dev\/201701.mbox\/browser."},{"key":"e_1_2_1_102_1","unstructured":"ZooKeeper. Zookeeper service becomes unavailable when leader fails to write transaction log. Retrieved from https:\/\/issues.apache.org\/jira\/browse\/ZOOKEEPER-2247."}],"container-title":["ACM Transactions on Storage"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3125497","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3125497","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3125497","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T02:11:23Z","timestamp":1750212683000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3125497"}},"subtitle":["Analysis of Distributed Storage Reactions to File-System 
Faults"],"short-title":[],"issued":{"date-parts":[[2017,8,31]]},"references-count":100,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2017,8,31]]}},"alternative-id":["10.1145\/3125497"],"URL":"https:\/\/doi.org\/10.1145\/3125497","relation":{},"ISSN":["1553-3077","1553-3093"],"issn-type":[{"value":"1553-3077","type":"print"},{"value":"1553-3093","type":"electronic"}],"subject":[],"published":{"date-parts":[[2017,8,31]]},"assertion":[{"value":"2017-06-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2017-07-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2017-09-28","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}