Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
OptIForest: Optimal Isolation Forest for Anomaly Detection
Authors: Haolong Xiang, Xuyun Zhang, Hongsheng Hu, Lianyong Qi, Wanchun Dou, Mark Dras, Amin Beheshti, Xiaolong Xu
IJCAI 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on a series of benchmarking datasets for comparative and ablation studies demonstrate that our approach can efficiently and robustly achieve better detection performance in general than the state-of-the-arts including the deep learning based methods. |
| Researcher Affiliation | Academia | 1 Macquarie University, 2 CSIRO's Data61, 3 Qufu Normal University, 4 Nanjing University, 5 Nanjing University of Information Science and Technology |
| Pseudocode | Yes | Algorithm 1 Constructing an Optimal Isolation Tree |
| Open Source Code | Yes | The source code is available at https://github.com/xiagll/OptIForest. |
| Open Datasets | Yes | We evaluate all methods on 20 widely-used benchmark datasets [Pang et al., 2019; Han et al., 2022; Li et al., 2022]. |
| Dataset Splits | No | The paper mentions using 'sampling size' in ablation studies but does not provide specific details on train/validation/test splits, percentages, or cross-validation setup for reproducibility. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments (e.g., GPU/CPU models, memory, or cloud instances). |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies used in the experiments (e.g., Python version, library versions). |
| Experiment Setup | No | The paper mentions 'optimal parameter settings of the baseline methods' and discusses 'cut threshold' and 'sampling size' in ablation studies, but it does not provide comprehensive details on hyperparameters or other system-level training settings for its own method. |