{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,16]],"date-time":"2026-02-16T17:56:18Z","timestamp":1771264578119,"version":"3.50.1"},"reference-count":23,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2025,7,2]],"date-time":"2025-07-02T00:00:00Z","timestamp":1751414400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100012166","name":"National Key Research and Development Program of China","doi-asserted-by":"publisher","award":["2020YFB1807500"],"award-info":[{"award-number":["2020YFB1807500"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>Traditional detection methods and security defenses are gradually insufficient to cope with evolving attack techniques and strategies, and have coarse detection granularity and high memory overhead. As a result, we propose Sylph, a lightweight unsupervised APT detection method based on a provenance graph, which not only detects APT attacks but also localizes APT attacks with a fine event granularity and feeds possible attacks back to system detectors to reduce their localization burden. Sylph proposes a whole-process architecture from provenance graph collection to anomaly detection, starting from the system audit logs, and dividing subgraphs based on time slices of the provenance graph it transforms into to reduce memory overhead. Starting from the system audit logs, the provenance graph it transforms into is divided into subgraphs based on time slices, which reduces the memory occupation and improves the detection efficiency at the same time; on the basis of generating the sequence of subgraphs, the full graph embedding of the subgraphs is carried out by using Graph2Vec to obtain their feature vectors, and the anomaly detection based on unsupervised learning is carried out by using an autoencoder, which is capable of detecting new types of attacks that have not yet appeared. After the experimental evaluation, Sylph can realize the APT attack detection with higher accuracy and achieve an accuracy rate.<\/jats:p>","DOI":"10.3390\/info16070566","type":"journal-article","created":{"date-parts":[[2025,7,2]],"date-time":"2025-07-02T04:11:04Z","timestamp":1751429464000},"page":"566","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Sylph: An Unsupervised APT Detection System Based on the Provenance Graph"],"prefix":"10.3390","volume":"16","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0065-519X","authenticated-orcid":false,"given":"Kaida","family":"Jiang","sequence":"first","affiliation":[{"name":"Network and Information Center, Shanghai Jiao Tong University, Shanghai 200240, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-5377-7945","authenticated-orcid":false,"given":"Zihan","family":"Gao","sequence":"additional","affiliation":[{"name":"School of Computer Science, Shanghai Jiao Tong University, Shanghai 200240, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0502-8227","authenticated-orcid":false,"given":"Siyu","family":"Zhang","sequence":"additional","affiliation":[{"name":"Network and Information Center, Shanghai Jiao Tong University, Shanghai 200240, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5898-7317","authenticated-orcid":false,"given":"Futai","family":"Zou","sequence":"additional","affiliation":[{"name":"School of Computer Science, Shanghai Jiao Tong University, Shanghai 200240, China"}]}],"member":"1968","published-online":{"date-parts":[[2025,7,2]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Ashraf, M.W.A., Singh, A.R., Pandian, A., Rathore, R.S., Bajaj, M., and Zaitsev, I. (2024). A hybrid approach using support vector machine rule-based system: Detecting cyber threats in internet of things. Sci. Rep., 14.","DOI":"10.1038\/s41598-024-78976-1"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Hossain, M.N., Sheikhi, S., and Sekar, R. (2020, January 18\u201320). Combating dependence explosion in forensic analysis using alternative tag propagation semantics. Proceedings of the 2020 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA.","DOI":"10.1109\/SP40000.2020.00064"},{"key":"ref_3","unstructured":"Rehman, M.U., Ahmadi, H., and Hassan, W.U. (2024, January 19\u201323). Flash: A comprehensive approach to intrusion detection via provenance graph representation learning. Proceedings of the 2024 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"102282","DOI":"10.1016\/j.cose.2021.102282","article-title":"Threat detection and investigation with system-level provenance graphs: A survey","volume":"106","author":"Li","year":"2021","journal-title":"Comput. Secur."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1007\/s10207-023-00742-7","article-title":"A review on graph-based approaches for network security monitoring and botnet detection","volume":"23","author":"Lagraa","year":"2024","journal-title":"Int. J. Inf. Secur."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Manzoor, E., Milajerdi, S.M., and Akoglu, L. (2016, January 13\u201317). Fast memory-efficient anomaly detection in streaming heterogeneous graphs. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA.","DOI":"10.1145\/2939672.2939783"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"100067","DOI":"10.1016\/j.array.2021.100067","article-title":"Advanced persistent threat attack detection method in cloud computing based on autoencoder and softmax regression algorithm","volume":"10","author":"Abdullayeva","year":"2021","journal-title":"Array"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Milajerdi, S.M., Eshete, B., Gjomemo, R., and Venkatakrishnan, V.N. (2019, January 11\u201315). Poirot: Aligning attack behavior with kernel audit records for cyber threat hunting. Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, London, UK.","DOI":"10.1145\/3319535.3363217"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Milajerdi, S.M., Gjomemo, R., Eshete, B., Sekar, R., and Venkatakrishnan, V.N. (2019, January 19\u201323). Holmes: Real-time apt detection through correlation of suspicious information flows. Proceedings of the 2019 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA.","DOI":"10.1109\/SP.2019.00026"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Han, X., Pasquier, T., Bates, A., Mickens, J., and Seltzer, M. (2020). Unicorn: Runtime provenance-based detector for advanced persistent threats. arXiv.","DOI":"10.14722\/ndss.2020.24046"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Wang, Q., Hassan, W.U., Li, D., Jee, K., Yu, X., Zou, K., Rhee, J., Chen, Z., Cheng, W., and Gunter, C.A. (2020, January 23\u201326). You Are What You Do: Hunting Stealthy Malware via Data Provenance Analysis. Proceedings of the Network and Distributed Systems Security (NDSS) Symposium 2020, San Diego, CA, USA.","DOI":"10.14722\/ndss.2020.24167"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Hassan, W.U., Bates, A., and Marino, D. (2020, January 18\u201320). Tactical provenance analysis for endpoint detection and response systems. Proceedings of the 2020 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA.","DOI":"10.1109\/SP40000.2020.00096"},{"key":"ref_13","unstructured":"Alsaheel, A., Nan, Y., Ma, S., Yu, L., Walkup, G., Celik, Z.B., Zhang, X., and Xu, D. (2021, January 11\u201313). ATLAS: A sequence-based learning approach for attack investigation. Proceedings of the 30th USENIX Security Symposium, USENIX Security 2021, Vancouver, BC, Canada."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Perozzi, B., Al-Rfou, R., and Skiena, S. (2014, January 24\u201327). Deepwalk: Online learning of social representations. Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA.","DOI":"10.1145\/2623330.2623732"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Grover, A., and Leskovec, J. (2016, January 13\u201317). node2vec: Scalable feature learning for networks. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA.","DOI":"10.1145\/2939672.2939754"},{"key":"ref_16","unstructured":"Narayanan, A., Chandramohan, M., Venkatesan, R., Chen, L., Liu, Y., and Jaiswal, S. (2017). graph2vec: Learning distributed representations of graphs. arXiv."},{"key":"ref_17","first-page":"5165","article-title":"Link prediction based on graph neural networks","volume":"31","author":"Zhang","year":"2018","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_18","unstructured":"Gehani, A., Kazmi, H., and Irshad, H. (2016, January 8\u20139). Scaling SPADE to \u201cBig provenance\u201d. Proceedings of the 8th USENIX Conference on Theory and Practice of Provenance, Washington, DC, USA."},{"key":"ref_19","unstructured":"Gehani, A., and Tariq, D. (2013, January 9\u201313). SPADE: Support for provenance auditing in distributed environments. Proceedings of the ACM\/IFIP\/USENIX International Conference on Distributed Systems Platforms and Open Distributed Processing, Beijing, China."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1145\/3475358","article-title":"Digging into big provenance (with SPADE)","volume":"64","author":"Gehani","year":"2021","journal-title":"Commun. ACM"},{"key":"ref_21","unstructured":"Varoquaux, G., Vaught, T., and Millman, J. (2008, January 19\u201324). Exploring network structure, dynamics, and function using NetworkX. Proceedings of the 7th Python in Science Conference (SciPy2008), Pasadena, CA, USA."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Chen, Z., Yeo, C.K., Lee, B.S., and Lau, C.T. (2018, January 17\u201320). Autoencoder-based network anomaly detection. Proceedings of the 2018 Wireless Telecommunications Symposium (WTS), Phoenix, AZ, USA.","DOI":"10.1109\/WTS.2018.8363930"},{"key":"ref_23","unstructured":"Han, X.M. (2025, January 01). Streamspot Data [Data Set]. GitHub. Available online: https:\/\/github.com\/sbustreamspot\/sbustreamspot-data."}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/16\/7\/566\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T18:02:47Z","timestamp":1760032967000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/16\/7\/566"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,2]]},"references-count":23,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2025,7]]}},"alternative-id":["info16070566"],"URL":"https:\/\/doi.org\/10.3390\/info16070566","relation":{},"ISSN":["2078-2489"],"issn-type":[{"value":"2078-2489","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,7,2]]}}}