Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Unsupervised Anomaly Detection in The Presence of Missing Values
Authors: Feng Xiao, Jicong Fan
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on datasets with manually constructed missing values and inherent missing values demonstrate that our proposed method effectively mitigates the imputation bias and surpasses the baseline methods significantly. |
| Researcher Affiliation | Academia | 1The Chinese University of Hong Kong, Shenzhen, China 2Shenzhen Research Institute of Big Data, Shenzhen, China |
| Pseudocode | No | The paper describes the proposed method and its implementation details in Section 3, but it does not include a dedicated pseudocode or algorithm block. |
| Open Source Code | Yes | The source code of our method is available at https://github.com/jicongfan/ImAD-Anomaly-Detection-With-Missing-Data. |
| Open Datasets | Yes | We compare ImAD with impute-then-detect methods on 11 publicly available tabular datasets from various fields... The statistics of all datasets are in Table 1 and a detailed description of all datasets is in Appendix J. |
| Dataset Splits | No | In all experiments, only incomplete normal data are used in the training stage, but there are both incomplete normal and abnormal data during the inference. |
| Hardware Specification | Yes | ALL experiments were conducted on 20 Cores Intel(R) Xeon(R) Gold 6248 CPU with one NVIDIA Tesla V100 GPU, CUDA 12.0. |
| Software Dependencies | Yes | ALL experiments were conducted on 20 Cores Intel(R) Xeon(R) Gold 6248 CPU with one NVIDIA Tesla V100 GPU, CUDA 12.0. |
| Experiment Setup | Yes | We use MLPs to construct the three modules of ImAD, Adam [Kingma and Ba, 2015] as the optimizer and set coefficient η of entropy regularization term in Sinkhorn distance to 0.1 in all experiments. Other experimental hyper-parameters are provided in Appendix J. Sensitivity analysis of hyper-parameters is provided in Appendix I. |
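
The Experiment Setup row mentions an entropy-regularized Sinkhorn distance with coefficient η = 0.1. As a rough illustration of what that coefficient controls, the following is a minimal sketch of the standard Sinkhorn iteration in plain Python. It is not taken from the authors' code: the function name `sinkhorn_distance`, the squared Euclidean cost, the uniform sample weights, the kernel form exp(-C/η), and the iteration count are all assumptions made for this example.

```python
import math


def sinkhorn_distance(x, y, eta=0.1, n_iters=200):
    """Approximate entropy-regularized optimal transport cost between two point sets.

    Assumptions (not from the paper): squared Euclidean cost, uniform weights,
    kernel K = exp(-C / eta), and a fixed number of scaling iterations.
    """
    n, m = len(x), len(y)
    # Pairwise squared Euclidean cost matrix C[i][j].
    cost = [[sum((a - b) ** 2 for a, b in zip(xi, yj)) for yj in y] for xi in x]
    # Gibbs kernel; a smaller eta gives a sharper (less smoothed) transport plan.
    kernel = [[math.exp(-c / eta) for c in row] for row in cost]
    a = [1.0 / n] * n  # uniform source weights
    b = [1.0 / m] * m  # uniform target weights
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale so the plan's row and column sums match a and b.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan P = diag(u) K diag(v); return the transport cost <P, C>.
    return sum(
        u[i] * kernel[i][j] * v[j] * cost[i][j]
        for i in range(n)
        for j in range(m)
    )


# Hypothetical toy inputs, only to show the call shape.
same = sinkhorn_distance([[0.0, 0.0], [1.0, 0.0]], [[0.0, 0.0], [1.0, 0.0]])
shifted = sinkhorn_distance([[0.0, 0.0], [1.0, 0.0]], [[0.5, 0.0], [1.5, 0.0]])
```

This sketch works directly with the kernel values, so very large costs relative to η can underflow to zero; a log-domain variant would be needed for data that is not scaled to a small range.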