Semi-Supervised Learning for Maximizing the Partial AUC

Authors: Tomoharu Iwata, Akinori Fujino, Naonori Ueda (pp. 4239-4246)

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "With experiments using various datasets, we demonstrate that the proposed method achieves higher test pAUCs than existing methods."
Researcher Affiliation | Industry | Tomoharu Iwata, Akinori Fujino, Naonori Ueda; NTT Communication Science Laboratories, Kyoto, Japan; {tomoharu.iwata.gy, akinori.fujino.yh, naonori.ueda.fr}@hco.ntt.co.jp
Pseudocode | No | The paper describes methods and derivations using mathematical equations and textual explanations, but it does not include any formal pseudocode blocks or algorithm listings.
Open Source Code | No | The paper does not provide any explicit statement about releasing the source code for the described methodology, nor does it provide a link to a code repository.
Open Datasets | Yes | "We evaluated the effectiveness of the proposed method by using the following nine datasets for anomaly detection (Campos et al. 2016): Annthyroid, Cardiotocography, Internet Ads, KDDCup99, Page Blocks, Pima, Spam Base, Waveform and Wilt, where the feature vector dimensionalities were 21, 21, 1555, 79, 10, 8, 57, 21 and 5, respectively. For each dataset, we used 50 labeled and 300 unlabeled samples for training, 50 labeled samples for validation, and the remaining samples for testing, where the positive ratio was set at 0.1. For each dataset, we randomly sampled 30 training, validation and test data sets, and calculated the average pAUC over the 30 sets. The datasets were obtained from http://www.dbs.ifi.lmu.de/research/outlier-evaluation/DAMI/."
Dataset Splits | Yes | "For each dataset, we used 50 labeled and 300 unlabeled samples for training, 50 labeled samples for validation, and the remaining samples for testing, where the positive ratio was set at 0.1."
Hardware Specification | Yes | "The average computational time for training with the proposed method on the Internet Ads dataset, which had the largest feature vector dimensionality and took the longest time among the nine datasets, was 29.3 seconds on computers with 2.60GHz CPUs."
Software Dependencies | No | The paper states: "We implemented all the methods based on PyTorch (Paszke et al. 2017)." While PyTorch is named, no version number is given for PyTorch or any other software dependency, so the software environment is not fully reproducible.
Experiment Setup | Yes | "For decision functions s(x) with all the methods including our proposed method, we used the same neural network architecture, which was a three-layer feed-forward neural network with 32 hidden units and rectified linear units (ReLU) for the activation functions. We optimized the neural network parameters using ADAM (Kingma and Ba 2015) with a learning rate of 0.1 and a batch size of 1,024. The empirical AUC and pAUC were calculated using samples in each batch for training. The weight decay parameter was set at 10⁻³. The hyperparameters of the proposed, SS, SSR, pSS and pSSR methods were selected from {0, 0.2, 0.4, 0.6, 0.8, 1} using the validation pAUC. With the ST method, the number of unlabeled samples to be labeled for each step is tuned from {5, 10, 15, 20, 25} using the validation pAUC, where the number of epochs for each retraining step was 100. The validation pAUC was also used for early stopping with all methods, where the maximum number of training epochs was 3,000."
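The pAUC that these rows refer to is the partial AUC: the area under the ROC curve restricted to a low false-positive-rate range [0, β], which is the regime that matters for anomaly detection. For orientation only (this is not the paper's code), a minimal pure-Python sketch of the empirical pAUC; the function name and the n_pos·k normalisation convention are our own assumptions:

```python
import math

def partial_auc(pos_scores, neg_scores, beta):
    """Empirical partial AUC over the false-positive range [0, beta].

    Only the top-scoring ceil(beta * n_neg) negatives contribute,
    since low-scoring negatives never become false positives within
    the range. Ties count 1/2. Normalised by n_pos * k so a perfect
    ranker scores 1.0 (normalisation choice is an assumption).
    """
    k = max(1, math.ceil(beta * len(neg_scores)))
    top_neg = sorted(neg_scores, reverse=True)[:k]
    correct = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos_scores
        for n in top_neg
    )
    return correct / (len(pos_scores) * k)

# A mediocre ranker: one positive is out-ranked by the hardest negative.
print(partial_auc([0.9, 0.8], [0.95, 0.7, 0.2, 0.1], beta=0.5))  # → 0.5
```

Note that in the paper's setting the empirical pAUC is computed per training batch (batch size 1,024), which keeps the pairwise comparison above tractable during optimisation.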