Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Unsupervised Anomaly Detection in The Presence of Missing Values

Authors: Feng Xiao, Jicong Fan

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental results on datasets with manually constructed missing values and inherent missing values demonstrate that our proposed method effectively mitigates the imputation bias and surpasses the baseline methods significantly.
Researcher Affiliation	Academia	1The Chinese University of Hong Kong, Shenzhen, China 2Shenzhen Research Institute of Big Data, Shenzhen, China
Pseudocode	No	The paper describes the proposed method and its implementation details in Section 3, but it does not include a dedicated pseudocode or algorithm block.
Open Source Code	Yes	The source code of our method is available at https://github.com/jicongfan/ImAD-Anomaly-Detection-With-Missing-Data.
Open Datasets	Yes	We compare ImAD with impute-then-detect methods on 11 publicly available tabular datasets from various fields... The statistics of all datasets are in Table 1 and a detailed description of all datasets is in Appendix J.
Dataset Splits	No	In all experiments, only incomplete normal data are used in the training stage, but there are both incomplete normal and abnormal data during the inference.
Hardware Specification	Yes	ALL experiments were conducted on 20 Cores Intel(R) Xeon(R) Gold 6248 CPU with one NVIDIA Tesla V100 GPU, CUDA 12.0.
Software Dependencies	Yes	ALL experiments were conducted on 20 Cores Intel(R) Xeon(R) Gold 6248 CPU with one NVIDIA Tesla V100 GPU, CUDA 12.0.
Experiment Setup	Yes	We use MLPs to construct the three modules of ImAD, Adam [Kingma and Ba, 2015] as the optimizer and set coefficient η of entropy regularization term in Sinkhorn distance to 0.1 in all experiments. Other experimental hyper-parameters are provided in Appendix J. Sensitivity analysis of hyper-parameters is provided in Appendix I.