{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,21]],"date-time":"2026-04-21T15:35:03Z","timestamp":1776785703563,"version":"3.51.2"},"reference-count":49,"publisher":"Association for Computing Machinery (ACM)","issue":"FSE","funder":[{"name":"Natural Sciences and Engineering Research Council of Canada","award":["#RGPIN-2021-0390"],"award-info":[{"award-number":["#RGPIN-2021-0390"]}]},{"name":"Fonds de recherche du Qu\u00e9bec \u2013 Nature et technologies","award":["#32686"],"award-info":[{"award-number":["#32686"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Softw. Eng."],"published-print":{"date-parts":[[2025,6,19]]},"abstract":"<jats:p>\n            Software logs, generated during the runtime of software systems, are essential for various development and analysis activities, such as anomaly detection and failure diagnosis. However, the presence of sensitive information in these logs poses significant privacy concerns, particularly regarding\n            <jats:italic toggle=\"yes\">Personally Identifiable Information (PII)<\/jats:italic>\n            and quasi-identifiers that could lead to re-identification risks. While general data privacy has been extensively studied, the specific domain of privacy in software logs remains underexplored, with inconsistent definitions of sensitivity and a lack of standardized guidelines for anonymization. To mitigate this gap, this study offers a comprehensive analysis of privacy in software logs from multiple perspectives. We start by performing an analysis of 25 publicly available log datasets to identify potentially sensitive attributes. Based on the result of this step, we focus on three perspectives: privacy regulations, research literature, and industry practices. 
We first analyze key data privacy regulations, such as the\n            <jats:italic toggle=\"yes\">General Data Protection Regulation (GDPR)<\/jats:italic>\n            and the\n            <jats:italic toggle=\"yes\">California Consumer Privacy Act (CCPA)<\/jats:italic>\n            , to understand the legal requirements concerning sensitive information in logs. Second, we conduct a systematic literature review to identify common privacy attributes and practices in log anonymization, revealing gaps in existing approaches. Finally, we survey 45 industry professionals to capture practical insights on log anonymization practices. Our findings shed light on various perspectives of log privacy and reveal industry challenges, such as technical and efficiency issues while highlighting the need for standardized guidelines. By combining insights from regulatory, academic, and industry perspectives, our study aims to provide a clearer framework for identifying and protecting sensitive information in software logs.\n          <\/jats:p>","DOI":"10.1145\/3715779","type":"journal-article","created":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T15:15:34Z","timestamp":1750346134000},"page":"1317-1338","source":"Crossref","is-referenced-by-count":2,"title":["Protecting Privacy in Software Logs: What Should Be Anonymized?"],"prefix":"10.1145","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9361-2369","authenticated-orcid":false,"given":"Roozbeh","family":"Aghili","sequence":"first","affiliation":[{"name":"Polytechnique Montr\u00e9al, Montr\u00e9al, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5441-6763","authenticated-orcid":false,"given":"Heng","family":"Li","sequence":"additional","affiliation":[{"name":"Polytechnique Montr\u00e9al, Montr\u00e9al, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5704-4173","authenticated-orcid":false,"given":"Foutse","family":"Khomh","sequence":"additional","affiliation":[{"name":"Polytechnique Montr\u00e9al, 
Montr\u00e9al, Canada"}]}],"member":"320","published-online":{"date-parts":[[2025,6,19]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10664-023-10382-z"},{"key":"e_1_2_1_2_1","volume-title":"2019 IEEE International Conference on Big Data (Big Data). IEEE, 1739\u20131748","author":"Agrawal Amey","year":"2019","unstructured":"Amey Agrawal, Abhishek Dixit, Namrata A Shettar, Darshil Kapadia, Vikram Agrawal, Rajat Gupta, and Rohit Karlupia. 2019. Delog: A high-performance privacy preserving log filtering framework. In 2019 IEEE International Conference on Big Data (Big Data). IEEE, 1739\u20131748."},{"key":"e_1_2_1_3_1","volume-title":"A Face Is Exposed for AOL Searcher No. 4417749. https:\/\/www.nytimes.com\/2006\/08\/09\/technology\/09aol.html Accessed on","author":"Barbaro Tom","year":"2024","unstructured":"Tom Barbaro, Michael; Zeller Jr. 2006. A Face Is Exposed for AOL Searcher No. 4417749. https:\/\/www.nytimes.com\/2006\/08\/09\/technology\/09aol.html Accessed on September 11, 2024."},{"key":"e_1_2_1_4_1","volume-title":"Proceedings of the 38th International Conference on Software Engineering Companion. 92\u2013101","author":"Barik Titus","year":"2016","unstructured":"Titus Barik, Robert DeLine, Steven Drucker, and Danyel Fisher. 2016. The bones of the system: A case study of logging and telemetry at microsoft. In Proceedings of the 38th International Conference on Software Engineering Companion. 92\u2013101."},{"key":"e_1_2_1_5_1","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1007\/s10664-024-10452-w","article-title":"A literature review and existing challenges on software logging practices: From the creation to the analysis of software logs","volume":"29","author":"Batoun Mohamed Amine","year":"2024","unstructured":"Mohamed Amine Batoun, Mohammed Sayagh, Roozbeh Aghili, Ali Ouni, and Heng Li. 2024. 
A literature review and existing challenges on software logging practices: From the creation to the analysis of software logs. Empirical Software Engineering 29, 4 (2024), 103.","journal-title":"Empirical Software Engineering"},{"key":"e_1_2_1_6_1","volume-title":"Artificial intelligence for it operations (aiops) workshop white paper. arXiv preprint arXiv:2101.06054","author":"Bogatinovski Jasmin","year":"2021","unstructured":"Jasmin Bogatinovski, Sasho Nedelkoski, Alexander Acker, Florian Schmidt, Thorsten Wittkopp, Soeren Becker, Jorge Cardoso, and Odej Kao. 2021. Artificial intelligence for it operations (aiops) workshop white paper. arXiv preprint arXiv:2101.06054 (2021)."},{"key":"e_1_2_1_7_1","unstructured":"T\u00f8nnes Brekne and Andr\u00e9 \u00c5rnes. 2005. Circumventing IP-address pseudonymization.. In Communications and Computer Networks. 43\u201348."},{"key":"e_1_2_1_8_1","volume-title":"International Workshop on Privacy Enhancing Technologies. Springer, 179\u2013196","author":"Brekne T\u00f8nnes","year":"2005","unstructured":"T\u00f8nnes Brekne, Andr\u00e9 \u00c5rnes, and Arne \u00d8sleb\u00f8. 2005. Anonymization of ip traffic monitoring data: Attacks on two prefix-preserving anonymization schemes and some proposed remedies. In International Workshop on Privacy Enhancing Technologies. Springer, 179\u2013196."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1672308.1672310"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2012.67"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/1409220.1409222"},{"key":"e_1_2_1_12_1","volume-title":"https:\/\/www.iso.org\/standard\/27001 Accessed on","author":"International Organization for Standardization. 2022. ISO\/IEC 27001:2022.","year":"2024","unstructured":"International Organization for Standardization. 2022. ISO\/IEC 27001:2022. 
https:\/\/www.iso.org\/standard\/27001 Accessed on September 11, 2024."},{"key":"e_1_2_1_13_1","volume-title":"2007 Third International Conference on Security and Privacy in Communications Networks and the Workshops-SecureComm","author":"Foukarakis Michalis","year":"2007","unstructured":"Michalis Foukarakis, Demetres Antoniades, Spiros Antonatos, and Evangelos P Markatos. 2007. Flexible and high-performance anonymization of NetFlow records using anontool. In 2007 Third International Conference on Security and Privacy in Communications Networks and the Workshops-SecureComm 2007. IEEE, 33\u201338."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/1519144.1519147"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jss.2012.06.025"},{"key":"e_1_2_1_16_1","volume-title":"https:\/\/gdprhub.eu\/index.php?title=CJEU_-_C-582\/14_-_Patrick_Breyer Accessed on","author":"Patrick Breyer CJEU","year":"2024","unstructured":"GDPRhub. 2023. CJEU - C-582\/14 - Patrick Breyer. https:\/\/gdprhub.eu\/index.php?title=CJEU_-_C-582\/14_-_Patrick_Breyer Accessed on September 11, 2024."},{"key":"e_1_2_1_17_1","first-page":"4369","article-title":"PD-PAn","volume":"12","author":"Gu Xiaodan","year":"2023","unstructured":"Xiaodan Gu and Kai Dong. 2023. PD-PAn: Prefix-and Distribution-Preserving Internet of Things Traffic Anonymization. Electronics 12, 20 (2023), 4369.","journal-title":"Prefix-and Distribution-Preserving Internet of Things Traffic Anonymization. Electronics"},{"key":"e_1_2_1_18_1","volume-title":"2020 IEEE Symposium on Computers and Communications (ISCC). IEEE, 1\u20137.","author":"Han Chunjing","year":"2020","unstructured":"Chunjing Han, Kunkun Sun, Haina Tang, Yulei Wu, and Xiaodan Zhang. 2020. AFT-Anon: A scaling method for online trace anonymization based on anonymous flow tables. In 2020 IEEE Symposium on Computers and Communications (ISCC). 
IEEE, 1\u20137."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICWS.2017.13"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1186\/s40537-016-0059-y"},{"key":"e_1_2_1_21_1","volume-title":"Proceedings of the 2009 ACM symposium on Applied Computing. 1286\u20131293","author":"King Justin","year":"2009","unstructured":"Justin King, Kiran Lakkaraju, and Adam Slagell. 2009. A taxonomy and adversarial model for attacks against network log anonymization. In Proceedings of the 2009 ACM symposium on Applied Computing. 1286\u20131293."},{"key":"e_1_2_1_22_1","volume-title":"Evidence-based software engineering and systematic reviews","author":"Kitchenham Barbara Ann","unstructured":"Barbara Ann Kitchenham, David Budgen, and Pearl Brereton. 2015. Evidence-based software engineering and systematic reviews. Vol. 4. CRC press."},{"key":"e_1_2_1_23_1","volume-title":"Guide to advanced empirical software engineering","author":"Kitchenham Barbara A","unstructured":"Barbara A Kitchenham and Shari L Pfleeger. 2008. Personal opinion surveys. In Guide to advanced empirical software engineering. Springer, 63\u201392."},{"key":"e_1_2_1_24_1","volume-title":"The Internet Traffic Archive. https:\/\/ita.ee.lbl.gov\/html\/traces.html Accessed on","author":"Lawrence Berkeley National Laboratory. 1998.","year":"2024","unstructured":"Lawrence Berkeley National Laboratory. 1998. The Internet Traffic Archive. https:\/\/ita.ee.lbl.gov\/html\/traces.html Accessed on September 11, 2024."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2020.2970422"},{"key":"e_1_2_1_26_1","volume-title":"TripleLP: Privacy-Preserving Log Parsing Based on Blockchain. In 2023 IEEE 14th International Symposium on Parallel Architectures, Algorithms and Programming (PAAP). IEEE, 1\u20136.","author":"Li Teng","year":"2023","unstructured":"Teng Li, Shengkai Zhang, Zexu Dang, Yongcai Xiao, and Zhuo Ma. 2023. 
TripleLP: Privacy-Preserving Log Parsing Based on Blockchain. In 2023 IEEE 14th International Symposium on Parallel Architectures, Algorithms and Programming (PAAP). IEEE, 1\u20136."},{"key":"e_1_2_1_27_1","volume-title":"International conference on telecommunication systems modeling and analysis","volume":"21","author":"Li Yifan","year":"2005","unstructured":"Yifan Li, Adam Slagell, Katherine Luo, and William Yurcik. 2005. Canine: A combined conversion and anonymization tool for processing netflows for security. In International conference on telecommunication systems modeling and analysis, Vol. 21. Citeseer."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2022.3175219"},{"key":"e_1_2_1_29_1","doi-asserted-by":"crossref","first-page":"109465","DOI":"10.1016\/j.compeleceng.2024.109465","article-title":"A configurable anonymisation approach for network flow data: Balancing utility and privacy","volume":"118","author":"Manocchio Liam Daly","year":"2024","unstructured":"Liam Daly Manocchio, Siamak Layeghy, David Gwynne, and Marius Portmann. 2024. A configurable anonymisation approach for network flow data: Balancing utility and privacy. Computers and Electrical Engineering 118 (2024), 109465.","journal-title":"Computers and Electrical Engineering"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/1851275.1851199"},{"key":"e_1_2_1_31_1","volume-title":"Kyoungwon Suh and Jim Kurose","author":"Michael Zink Yu Gu","year":"2008","unstructured":"Yu Gu Michael Zink, Kyoungwon Suh and Jim Kurose. 2008. YouTube Traces From the Campus Network. https:\/\/traces.cs.umass.edu\/index.php\/Network\/Network Accessed on September 11, 2024."},{"key":"e_1_2_1_32_1","volume-title":"https:\/\/ita.ee.lbl.gov\/html\/contrib\/tcpdpriv.html Accessed on","author":"Minshall Greg","year":"2024","unstructured":"Greg Minshall. 2005. TCPDPRIV. 
https:\/\/ita.ee.lbl.gov\/html\/contrib\/tcpdpriv.html Accessed on September 11, 2024."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/3243734.3243809"},{"key":"e_1_2_1_34_1","volume-title":"How to break anonymity of the netflix prize dataset. arXiv preprint cs\/0610105","author":"Narayanan Arvind","year":"2006","unstructured":"Arvind Narayanan and Vitaly Shmatikov. 2006. How to break anonymity of the netflix prize dataset. arXiv preprint cs\/0610105 (2006)."},{"key":"e_1_2_1_35_1","volume-title":"California Consumer Privacy Act. https:\/\/oag.ca.gov\/privacy\/ccpa Accessed on","author":"U.S. State of California. 2018.","year":"2024","unstructured":"U.S. State of California. 2018. California Consumer Privacy Act. https:\/\/oag.ca.gov\/privacy\/ccpa Accessed on September 11, 2024."},{"key":"e_1_2_1_36_1","unstructured":"Privacy Commissioner of Canada. 2000. Personal Information Protection and Electronic Documents Act. https:\/\/www.priv.gc.ca\/en\/privacy-topics\/privacy-laws-in-canada\/the-personal-information-protection-and-electronic-documents-act-pipeda\/ Accessed on September 11 2024."},{"key":"e_1_2_1_37_1","volume-title":"Health Insurance Portability and Accountability Act. https:\/\/www.hhs.gov\/hipaa\/index.html Accessed on","author":"U.S. Department of Health and Human Services. 1996.","year":"2024","unstructured":"U.S. Department of Health and Human Services. 1996. Health Insurance Portability and Accountability Act. https:\/\/www.hhs.gov\/hipaa\/index.html Accessed on September 11, 2024."},{"key":"e_1_2_1_38_1","volume-title":"ip2anonip. https:\/\/pages.cs.wisc.edu\/ Accessed on","author":"Plonka Dave","year":"2024","unstructured":"Dave Plonka. 2003. ip2anonip. https:\/\/pages.cs.wisc.edu\/ Accessed on September 11, 2024."},{"key":"e_1_2_1_39_1","volume-title":"Preprocessing is All You Need: Boosting the Performance of Log Parsers With a General Preprocessing Framework. 
arXiv preprint arXiv:2412.05254","author":"Qin Qiaolin","year":"2024","unstructured":"Qiaolin Qin, Roozbeh Aghili, Heng Li, and Ettore Merlo. 2024. Preprocessing is All You Need: Boosting the Performance of Log Parsers With a General Preprocessing Framework. arXiv preprint arXiv:2412.05254 (2024)."},{"key":"e_1_2_1_40_1","volume-title":"Privacy in the cloud: A survey of existing solutions and research challenges","author":"Silva Paulo","year":"2021","unstructured":"Paulo Silva, Edmundo Monteiro, and Paulo Simoes. 2021. Privacy in the cloud: A survey of existing solutions and research challenges. IEEE access 9 (2021), 10473\u201310497."},{"key":"e_1_2_1_41_1","volume-title":"Workshop of the 1st International Conference on Security and Privacy for Emerging Areas in Communication Networks","author":"Slagell Adam","year":"2005","unstructured":"Adam Slagell and William Yurcik. 2005. Sharing computer network logs for security and privacy: A motivation for new methodologies of anonymization. In Workshop of the 1st International Conference on Security and Privacy for Emerging Areas in Communication Networks, 2005. IEEE, 80\u201389."},{"key":"e_1_2_1_42_1","first-page":"3","article-title":"FLAIM: A Multi-level Anonymization Framework for Computer and Network Logs","volume":"6","author":"Slagell Adam J","year":"2006","unstructured":"Adam J Slagell, Kiran Lakkaraju, and Katherine Luo. 2006. FLAIM: A Multi-level Anonymization Framework for Computer and Network Logs.. In LISA, Vol. 6. 3\u20138.","journal-title":"LISA"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1142\/S0218488502001648"},{"key":"e_1_2_1_44_1","volume-title":"General Data Protection Regulation. https:\/\/gdpr-info.eu\/ Accessed on","author":"Union European","year":"2024","unstructured":"European Union. 2022. General Data Protection Regulation. https:\/\/gdpr-info.eu\/ Accessed on September 11, 2024."},{"key":"e_1_2_1_45_1","volume-title":"Performance prediction for apache spark platform. 
In 2015 IEEE 17th International Conference on High Performance Computing and Communications","author":"Wang Kewen","year":"2015","unstructured":"Kewen Wang and Mohammad Maifi Hasan Khan. 2015. Performance prediction for apache spark platform. In 2015 IEEE 17th International Conference on High Performance Computing and Communications, 2015 IEEE 7th International Symposium on Cyberspace Safety and Security, and 2015 IEEE 12th International Conference on Embedded Software and Systems. IEEE, 166\u2013173."},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4615-4625-2"},{"key":"e_1_2_1_47_1","volume-title":"10th IEEE International Conference on Network Protocols, 2002. Proceedings. IEEE, 280\u2013289","author":"Xu Jun","year":"2002","unstructured":"Jun Xu, Jinliang Fan, Mostafa H Ammar, and Sue B Moon. 2002. Prefix-preserving ip address anonymization: Measurement-based security evaluation and a new cryptography-based scheme. In 10th IEEE International Conference on Network Protocols, 2002. Proceedings. 
IEEE, 280\u2013289."},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/1736020.1736038"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISSRE59848.2023.00071"}],"container-title":["Proceedings of the ACM on Software Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3715779","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T15:17:18Z","timestamp":1750346238000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3715779"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,19]]},"references-count":49,"journal-issue":{"issue":"FSE","published-print":{"date-parts":[[2025,6,19]]}},"alternative-id":["10.1145\/3715779"],"URL":"https:\/\/doi.org\/10.1145\/3715779","relation":{},"ISSN":["2994-970X"],"issn-type":[{"value":"2994-970X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,6,19]]}}}