Universum Prescription: Regularization Using Unlabeled Data
Authors: Xiang Zhang, Yann LeCun
AAAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This paper shows that simply prescribing "none of the above" labels to unlabeled data has a beneficial regularization effect to supervised learning. We call it universum prescription by the fact that the prescribed labels cannot be one of the supervised labels. In spite of its simplicity, universum prescription obtained competitive results in training deep convolutional networks for CIFAR-10, CIFAR-100, STL-10 and ImageNet datasets. (A minimal sketch of this label prescription appears after the table.) |
| Researcher Affiliation | Academia | Xiang Zhang, Yann LeCun, Courant Institute of Mathematical Sciences, New York University, 719 Broadway, 12th Floor, New York, NY 10003, {xiang, yann}@cs.nyu.edu |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. The methods are described in narrative text. |
| Open Source Code | No | The paper does not provide any concrete access to source code (e.g., a specific repository link or an explicit statement about code release in supplementary materials) for the methodology described. |
| Open Datasets | Yes | Experiments on Image Classification: In this section we test the methods on some image classification tasks. Three series of datasets CIFAR-10/100 (Krizhevsky 2009), STL-10 (Coates, Ng, and Lee 2011) and ImageNet (Russakovsky et al. 2015) are chosen due to the availability of unlabeled data. (See the dataset-loading sketch after the table.) |
| Dataset Splits | Yes | The ImageNet dataset (Russakovsky et al. 2015) for classification task has in total 1,281,167 training images and 50,000 validation images. The reported testing errors are evaluated on this validation dataset. |
| Hardware Specification | Yes | We gratefully acknowledge NVIDIA Corporation with the donation of 2 Tesla K40 GPUs used for this research. |
| Software Dependencies | No | The paper mentions the use of deep learning frameworks and algorithms but does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions or specific libraries like scikit-learn with versions). |
| Experiment Setup | Yes | The algorithm used is stochastic gradient descent with momentum (Polyak 1964; Sutskever et al. 2013) of 0.9 and a minibatch size of 32. The initial learning rate is 0.005, which is halved every 60,000 minibatch steps for CIFAR-10/100 and every 600,000 minibatch steps for ImageNet. The training stops at 400,000 steps for CIFAR-10/100 and STL-10, and 2,500,000 steps for ImageNet. Two dropout (Srivastava et al. 2014) layers of probability 0.5 are inserted before the final two linear layers. (A configuration sketch based on these hyper-parameters follows the table.) |
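
The paper releases no code (see the "Open Source Code" row), so the snippet below is only a minimal PyTorch-style sketch of the "none of the above" prescription quoted in the "Research Type" row: unlabeled samples are assigned an extra label outside the supervised classes and the usual cross-entropy loss is applied to the mixed minibatch. The toy network, tensor shapes, and batch composition are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

# Hypothetical sizes; the paper trains deep convolutional networks on image data.
NUM_SUPERVISED_CLASSES = 10              # e.g. CIFAR-10
DUSTBIN_CLASS = NUM_SUPERVISED_CLASSES   # extra "none of the above" label

# Toy classifier with one extra output unit for the prescribed label.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 32 * 32, 256),
    nn.ReLU(),
    nn.Linear(256, NUM_SUPERVISED_CLASSES + 1),
)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)

def training_step(labeled_x, labeled_y, unlabeled_x):
    """One SGD step over a minibatch mixing labeled and unlabeled samples.

    Unlabeled samples are simply prescribed the extra label, so the same
    cross-entropy loss acts as a regularizer on the supervised problem.
    """
    x = torch.cat([labeled_x, unlabeled_x], dim=0)
    dustbin_y = torch.full((unlabeled_x.size(0),), DUSTBIN_CLASS, dtype=torch.long)
    y = torch.cat([labeled_y, dustbin_y], dim=0)

    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage with random tensors standing in for labeled and unlabeled images.
loss = training_step(
    torch.randn(16, 3, 32, 32),
    torch.randint(0, NUM_SUPERVISED_CLASSES, (16,)),
    torch.randn(16, 3, 32, 32),
)
```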
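
The datasets quoted in the "Open Datasets" row are all publicly available; the sketch below shows one hedged way to obtain them with torchvision. The root directory and transform are placeholders, and ImageNet still requires the archives to be downloaded manually.

```python
import torchvision
from torchvision import transforms

to_tensor = transforms.ToTensor()  # placeholder transform

# Labeled training sets; CIFAR-10/100 download automatically.
cifar10_train = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=to_tensor)
cifar100_train = torchvision.datasets.CIFAR100(
    root="./data", train=True, download=True, transform=to_tensor)

# STL-10 ships an explicit unlabeled split, which is what makes it
# convenient for universum prescription experiments.
stl10_unlabeled = torchvision.datasets.STL10(
    root="./data", split="unlabeled", download=True, transform=to_tensor)

# ImageNet (ILSVRC 2012) cannot be downloaded automatically; torchvision
# expects the archives to already be present under `root`.
# imagenet_train = torchvision.datasets.ImageNet(root="./imagenet", split="train")
```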
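
Finally, the hyper-parameters quoted in the "Experiment Setup" row translate directly into an optimizer and a step-based learning-rate schedule. The sketch below assumes PyTorch and a placeholder network; only the momentum, minibatch size, initial learning rate, halving interval, stopping step, and dropout probability come from the paper.

```python
import torch
import torch.nn as nn

NUM_CLASSES = 10  # CIFAR-10; the paper also runs CIFAR-100, STL-10, and ImageNet

# Placeholder network; the paper uses deep convolutional architectures.
# Two dropout layers of probability 0.5 sit before the final two linear layers.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 32 * 32, 512), nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(512, 256), nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(256, NUM_CLASSES + 1),  # +1 for the "none of the above" class
)

# SGD with momentum 0.9, minibatch size 32, initial learning rate 0.005.
BATCH_SIZE = 32
optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)

# Learning rate halved every 60,000 minibatch steps for CIFAR-10/100
# (600,000 for ImageNet); the scheduler is stepped once per minibatch.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=60_000, gamma=0.5)

MAX_STEPS = 400_000  # 2,500,000 for ImageNet
criterion = nn.CrossEntropyLoss()

# Illustrative loop over random tensors; a real run would iterate to MAX_STEPS
# over minibatches drawn from the datasets above.
for step in range(3):
    x = torch.randn(BATCH_SIZE, 3, 32, 32)
    y = torch.randint(0, NUM_CLASSES + 1, (BATCH_SIZE,))

    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()
```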