DROCC: Deep Robust One-Class Classification

Authors: Sachin Goyal, Aditi Raghunathan, Moksh Jain, Harsha Vardhan Simhadri, Prateek Jain

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical evaluation demonstrates that DROCC is highly effective in two different one-class problem settings and on a range of real-world datasets across different domains: tabular data, images (CIFAR and ImageNet), audio, and time-series, offering up to 20% increase in accuracy over the state-of-the-art in anomaly detection.
Researcher Affiliation | Collaboration | Microsoft Research India; Stanford University; NITK Surathkal.
Pseudocode | Yes | Algorithm 1: Training neural networks via DROCC (a hedged PyTorch sketch of this training loop follows the table).
Open Source Code | Yes | Code is available at https://github.com/microsoft/EdgeML.
Open Datasets | Yes | CIFAR-10 (Krizhevsky, 2009): widely used benchmark for anomaly detection; 10 classes with 32×32 images. ImageNet-10: a subset of 10 randomly chosen classes from the ImageNet dataset (Deng et al., 2009), which contains 224×224 color images. Abalone (Dua & Graff, 2017): physical measurements of abalone are provided and the task is to predict the age. Arrhythmia (Rayana, 2016): features derived from ECG; the task is to identify arrhythmic samples. Thyroid (Rayana, 2016): determine whether a patient referred to the clinic is hypothyroid based on the patient's medical data. Epileptic Seizure Recognition (Andrzejak et al., 2001): EEG-based time-series dataset from multiple patients. Audio Commands (Warden, 2018): a multiclass dataset with 35 classes of audio keywords.
Dataset Splits | Yes | We use the train-test splits when already available, with an 80-20 split for the train and validation sets. In all other cases, we use a random 60-20-20 split for train, validation, and test (a minimal split sketch follows the table).
Hardware Specification | Yes | The experiments were run on an Intel Xeon CPU with 12 cores clocked at 2.60 GHz and with an NVIDIA Tesla P40 GPU, CUDA 10.2, and cuDNN 7.6.
Software Dependencies | Yes | The experiments were run on an Intel Xeon CPU with 12 cores clocked at 2.60 GHz and with an NVIDIA Tesla P40 GPU, CUDA 10.2, and cuDNN 7.6.
Experiment Setup | Yes | The main hyper-parameter of our algorithm is the radius r, which defines the set N_i(r). We observe that tweaking the radius around √d/2 (where d is the dimension of the input data) works the best... We fix γ as 2 in our experiments unless specified otherwise. The parameter µ from Eq. (1) is chosen from {0.5, 1.0}. We use a standard step size from {0.1, 0.01} for gradient ascent and from {1e-2, 1e-4} for gradient descent; we also tune the optimizer over {Adam, SGD} (the quoted grid is written out after the table).
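
Below is a minimal PyTorch sketch of the training loop described by Algorithm 1, written from the paper's description rather than from the authors' EdgeML code: gradient ascent searches for hard negatives in the shell N_i(r) = {h : r <= ||h||_2 <= γr} around each normal point, and gradient descent then fits the classifier on normal points (label 1) plus those adversarial points (label 0). The helper names (drocc_step, project_to_shell), the single-logit model assumption, and the default step counts are illustrative assumptions.

import torch
import torch.nn as nn

def project_to_shell(h, r, gamma):
    # Project each perturbation onto the shell r <= ||h||_2 <= gamma * r.
    norms = h.flatten(1).norm(dim=1, keepdim=True).clamp(min=1e-12)
    scale = norms.clamp(min=r, max=gamma * r) / norms
    return h * scale.view(-1, *([1] * (h.dim() - 1)))

def drocc_step(model, x, optimizer, r, gamma=2.0, mu=1.0,
               ascent_steps=50, ascent_lr=0.01):
    # One DROCC update on a batch x of normal points. The model is assumed
    # to output one logit per example; the warm-up epochs of plain training
    # that Algorithm 1 prescribes before the adversarial phase are omitted.
    bce = nn.BCEWithLogitsLoss()
    ones = torch.ones(x.size(0), device=x.device)
    zeros = torch.zeros(x.size(0), device=x.device)

    # Gradient-ascent search for hard negatives inside the shell N_i(r).
    h = project_to_shell(torch.randn_like(x) * r, r, gamma)
    for _ in range(ascent_steps):
        h.requires_grad_(True)
        loss_neg = bce(model(x + h).squeeze(-1), zeros)
        grad = torch.autograd.grad(loss_neg, h)[0]
        with torch.no_grad():
            # Normalized-gradient ascent step, then project back to the shell.
            g_norm = grad.flatten(1).norm(dim=1).clamp(min=1e-12)
            h = h + ascent_lr * grad / g_norm.view(-1, *([1] * (x.dim() - 1)))
            h = project_to_shell(h, r, gamma)

    # Descent: normal points labeled 1, adversarial points labeled 0,
    # weighted by mu as in the paper's objective.
    optimizer.zero_grad()
    loss = (bce(model(x).squeeze(-1), ones)
            + mu * bce(model(x + h.detach()).squeeze(-1), zeros))
    loss.backward()
    optimizer.step()
    return loss.item()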
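
The 60-20-20 split quoted above is not accompanied by code in the paper; a minimal sketch, assuming NumPy arrays X and y:

import numpy as np

def split_60_20_20(X, y, seed=0):
    # Random 60-20-20 train/validation/test split of index positions.
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(X))
    n_train, n_val = int(0.6 * len(X)), int(0.2 * len(X))
    train, val, test = np.split(perm, [n_train, n_train + n_val])
    return (X[train], y[train]), (X[val], y[val]), (X[test], y[test])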
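
For completeness, the quoted hyper-parameter search space can be written out as a grid. The values mirror the quoted setup (r tuned around √d/2, γ = 2, µ in {0.5, 1.0}, ascent step size in {0.1, 0.01}, descent step size in {1e-2, 1e-4}, optimizer in {Adam, SGD}); the loop skeleton and the example input dimension d are assumptions.

import itertools
import math

d = 784  # example input dimension; the paper tunes r around sqrt(d)/2
grid = {
    "radius": [c * math.sqrt(d) / 2 for c in (0.5, 1.0, 2.0)],  # around sqrt(d)/2
    "gamma": [2.0],                 # fixed unless specified otherwise
    "mu": [0.5, 1.0],
    "ascent_lr": [0.1, 0.01],
    "descent_lr": [1e-2, 1e-4],
    "optimizer": ["Adam", "SGD"],
}

for values in itertools.product(*grid.values()):
    config = dict(zip(grid.keys(), values))
    # ...train with `config` (e.g., via drocc_step above) and keep the
    # configuration with the best validation score...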