Deep Anomaly Detection under Labeling Budget Constraints

Authors: Aodong Li, Chen Qiu, Marius Kloft, Padhraic Smyth, Stephan Mandt, Maja Rudolph

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments on image, tabular, and video data sets show that our approach results in state-of-the-art semi-supervised AD performance under labeling budget constraints."
Researcher Affiliation | Collaboration | (1) Department of Computer Science, University of California, Irvine, USA; (2) Bosch Center for Artificial Intelligence, Pittsburgh, USA; (3) Department of Computer Science, TU Kaiserslautern, Germany.
Pseudocode | Yes | Algorithm 1: Training Procedure of SOEL.
Open Source Code | No | The paper states "Please also refer to our codebase for practical implementation details." in Appendix C, implying a codebase exists. However, it provides no direct link (URL to a repository) and does not explicitly state that the code is publicly released and accessible, either in the paper or in its supplementary materials.
Open Datasets | Yes | "We experiment with two popular image benchmarks: CIFAR-10 and Fashion-MNIST. Our empirical study includes all 2D image datasets presented in Yang et al. (2021)... Our study includes the four multi-dimensional tabular datasets from the ODDS repository... We use UCSD Peds1, a benchmark dataset for video anomaly detection."
Dataset Splits | Yes | "To set the learning rate, training epochs, and minibatch size for MedMNIST, we find the best-performing hyperparameters by evaluating the method on the validation dataset. We use the same hyperparameters on other image data. For video data and tabular data, the optimization hyperparameters are set as recommended by Qiu et al. (2022a). In order to choose τ (in Eq. (2)), we constructed a validation dataset of CIFAR-10 to select the parameter τ among {1, 1e-1, 1e-2, 1e-3} and applied the validated τ (1e-2) on all the other datasets in our experiments. Specifically, we split the original CIFAR-10 training data into a training set and a validation set."
Hardware Specification | No | The paper mentions "GPU clusters on which the experiments have been performed" but does not specify particular models (e.g., NVIDIA A100, RTX 3090), CPU types, or other detailed hardware specifications.
Software Dependencies | No | The paper mentions using Adam (Kingma and Ba, 2014) as an optimizer and refers to "PyOD's implementation of kNN" and a "FixMatch implementation" from GitHub. However, it does not provide version numbers for these components or for other libraries (e.g., Python 3.x, PyTorch 1.x) to ensure reproducibility.
Experiment Setup | Yes | Table 5 provides detailed optimization parameters for all methods, including learning rates, epochs, minibatch sizes, and a temperature parameter τ. Additionally, it states specific settings for the Adam optimizer (β1 = 0.9, β2 = 0.999, no weight decay).