Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Interpreting Unsupervised Anomaly Detection in Security via Rule Extraction
Authors: Ruoyu Li, Qing Li, Yu Zhang, Dan Zhao, Yong Jiang, Yong Yang
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct comprehensive experiments on the explanation of four distinct unsupervised anomaly detection models on various real-world datasets. The evaluation shows that our method outperforms existing methods in terms of diverse metrics including fidelity, correctness and robustness. |
| Researcher Affiliation | Collaboration | Tsinghua University, China; Peng Cheng Laboratory, China; Tsinghua Shenzhen International Graduate School, China; Tencent Security Platform Department, China |
| Pseudocode | Yes | Algorithm 1: Compositional Boundary Exploration |
| Open Source Code | Yes | Our code is available at https://github.com/Ruoyu-Li/UAD-Rule-Extraction. |
| Open Datasets | Yes | We employ three benchmark datasets for network intrusion detection in the experiment, including CIC-IDS2017, CSE-CIC-IDS2018 [49] and TON-IoT [50]. |
| Dataset Splits | Yes | The datasets are randomly split by the ratio of 6:2:2 for training, validation and testing. |
| Hardware Specification | Yes | Our experiments were conducted on a server equipped with the Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz (128GB RAM) and the GeForce RTX 2080 Super (8GB VRAM). |
| Software Dependencies | Yes | Our implementation is primarily based on PyTorch (version 1.12.1)... we employ the versatile machine learning library scikit-learn (version 1.1.3). Python (version 3.9.15) serves as the programming language... |
| Experiment Setup | Yes | "We present four major hyperparameters in Figure 2, including the maximum depth τ of an IC-Tree, the number of explorers Ne, the coefficient ρ of sampling, and the factor η that controls the stride of an iteration." "We find that τ = 15 achieves the best performance." "Figure 2b shows that a value between 6 and 8 is recommended." |
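
The Dataset Splits row reports a random 6:2:2 split into training, validation and testing sets. A minimal sketch of such a split is shown below, using only the Python standard library. The function name `split_622`, the seed value, and the sample count of 1000 are illustrative assumptions, not taken from the paper or its repository, and the authors' actual splitting code may differ.

```python
import random


def split_622(n_samples, seed=0):
    """Shuffle sample indices and split them 6:2:2 (hypothetical helper)."""
    indices = list(range(n_samples))
    # A dedicated Random instance with a fixed seed keeps the split repeatable.
    random.Random(seed).shuffle(indices)

    # Integer arithmetic avoids floating-point rounding at the boundaries.
    n_train = n_samples * 6 // 10
    n_val = n_samples * 2 // 10

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Illustrative sample count; the real dataset sizes are not stated here.
train_idx, val_idx, test_idx = split_622(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

The returned index lists can be used to select rows from whichever feature matrix is loaded. Fixing the seed is one way to make a random split reproducible across runs.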