Tri-net for Semi-Supervised Deep Learning
Authors: Dong-Dong Chen, Wei Wang, Wei Gao, Zhi-Hua Zhou
IJCAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our method achieves the best performance in comparison with state-of-the-art semi-supervised deep learning methods. |
| Researcher Affiliation | Academia | Dong-Dong Chen, Wei Wang, Wei Gao, Zhi-Hua Zhou National Key Laboratory for Novel Software Technology Nanjing University, Nanjing 210023, China {chendd, wangw, gaow, zhouzh}@lamda.nju.edu.cn |
| Pseudocode | Yes | Algorithm 1 Tri-net. Input: Labeled set L and unlabeled set U (a hedged sketch of this loop appears after the table) |
| Open Source Code | No | The paper does not provide an explicit statement or link indicating that the source code for the methodology is openly available. |
| Open Datasets | Yes | We run experiments on three widely used benchmark datasets, i.e., MNIST, SVHN, and CIFAR-10. |
| Dataset Splits | No | The paper mentions using a 'standard data split for testing' but does not explicitly specify a distinct validation dataset split with percentages or counts, or refer to a standard validation split. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications, or cloud instance types) used for running the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., 'Python 3.8', 'TensorFlow 2.0') needed to replicate the experiment. |
| Experiment Setup | Yes | Parameters. In order to prevent the network from overfitting, we gradually increase the pool size N = 1000 × 2^t up to the size of the unlabeled data U [Saito et al., 2017], where t denotes the learning round. The maximal learning round T is set to 30 in all experiments. We gradually decrease the confidence threshold σ after N equals the size of U so that more unlabeled data can be labeled (line 11, Algorithm 1)... We set σ_0 = 0.999 and σ_os = 0.01 in MNIST; σ_0 = 0.95 and σ_os = 0.25 in SVHN and CIFAR-10. We use dropout (p = 0.5) after each max-pooling layer, use Leaky-ReLU (α = 0.1) as the activation function for all layers except the FC layer, and use softmax for the FC layer. We also use Batch Normalization [Ioffe and Szegedy, 2015] for all layers except the FC layer. We use SGD with a mini-batch size of 16. The learning rate starts from 0.1 in initialization (from 0.02 in training) and is divided by 10 when the error plateaus. In initialization, the three modules M1, M2 and M3 are trained for up to 300 epochs in SVHN and CIFAR-10 (100 in MNIST). In training, the three modules are trained for up to 90 epochs in SVHN and CIFAR-10 (60 in MNIST). We set std = 0.05 in SVHN and CIFAR-10 (0.001 in MNIST). We use a weight decay of 0.0001 and a momentum of 0.9. Following the setting in Laine and Aila [2016], we use ZCA, random crop and horizontal flipping for CIFAR-10, and zero-mean normalization and random crop for SVHN. |
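
The Pseudocode row above quotes only the header of Algorithm 1 (Tri-net). The sketch below is a minimal, non-authoritative reconstruction of one tri-training-style labeling round based on the paper's description: three modules, an unlabeled candidate pool that grows as N = 1000 × 2^t, and pseudo-labels accepted for one module only when the other two agree with confidence above σ. The module interface (`predict_proba`, `fine_tune`), the data layout, and the exact acceptance rule are hypothetical stand-ins, not the authors' code.

```python
import random

def tri_net_round(modules, labeled, unlabeled, t, sigma):
    """One sketched learning round: each module is retrained on the labeled set
    plus pseudo-labels proposed by the other two modules.

    `labeled` is assumed to be a list of (x, y) pairs and `unlabeled` a list of
    inputs; each module is assumed to expose predict_proba() returning a
    class-probability vector (e.g., a NumPy array) and fine_tune().
    """
    # Candidate pool grows as N = 1000 * 2^t, capped at the size of U.
    pool_size = min(1000 * 2 ** t, len(unlabeled))
    pool = random.sample(unlabeled, pool_size)
    for i, target in enumerate(modules):
        peers = [m for j, m in enumerate(modules) if j != i]
        pseudo = []
        for x in pool:
            p1, p2 = (m.predict_proba(x) for m in peers)
            y1, y2 = int(p1.argmax()), int(p2.argmax())
            # Assumed acceptance rule: the two peer modules agree on the label
            # and both are confident above the threshold sigma.
            if y1 == y2 and min(p1.max(), p2.max()) >= sigma:
                pseudo.append((x, y1))
        # Retrain the target module on labeled plus pseudo-labeled examples.
        target.fine_tune(labeled + pseudo)
    return modules
```

In the paper the outer loop runs for up to T = 30 rounds, and σ is relaxed once the pool covers the whole unlabeled set; those schedules are sketched in the next block.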
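
The schedules quoted in the Experiment Setup row can be read as follows. This is a hedged sketch using the SVHN/CIFAR-10 values (σ_0 = 0.95, σ_os = 0.25; MNIST uses 0.999 and 0.01), and the single-step threshold drop is a simplification of the paper's gradual decrease. The function names and the config dictionary are illustrative, not taken from the authors' code.

```python
def pool_size(t, num_unlabeled):
    """Candidate pool grows as N = 1000 * 2^t, capped at the size of the unlabeled set U."""
    return min(1000 * 2 ** t, num_unlabeled)

def confidence_threshold(t, num_unlabeled, sigma0=0.95, sigma_os=0.25):
    """Simplified reading of the schedule: sigma stays at sigma0 until the pool
    covers U, then is lowered by sigma_os (the paper decreases it gradually)."""
    return sigma0 if pool_size(t, num_unlabeled) < num_unlabeled else sigma0 - sigma_os

# Optimizer and regularization settings as quoted from the paper (SVHN / CIFAR-10):
TRAIN_CONFIG = {
    "optimizer": "SGD",
    "batch_size": 16,
    "lr_init_phase": 0.1,     # initialization; 0.02 during the training rounds
    "lr_decay_factor": 0.1,   # divided by 10 when the error plateaus
    "momentum": 0.9,
    "weight_decay": 1e-4,
    "dropout": 0.5,           # after each max-pooling layer
    "leaky_relu_alpha": 0.1,  # all layers except the FC (softmax) layer
    "max_rounds": 30,         # T
}
```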