Binary Classification from Positive-Confidence Data
Authors: Takashi Ishida, Gang Niu, Masashi Sugiyama
NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we numerically illustrate the behavior of the proposed method on synthetic datasets for linear models. We further demonstrate the usefulness of the proposed method on benchmark datasets for deep neural networks that are highly nonlinear models. The results in Table 3 and Table 4 show that in most cases, Pconf classification either outperforms or is comparable to the weighted classification baseline, outperforms Auto-Encoder, and is even comparable to the fully-supervised method in some cases. |
| Researcher Affiliation | Academia | Takashi Ishida (1,2), Gang Niu (2), Masashi Sugiyama (2,1); 1 = The University of Tokyo, Tokyo, Japan; 2 = RIKEN, Tokyo, Japan |
| Pseudocode | No | The paper describes the proposed method mathematically and conceptually but does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code will be available on http://github.com/takashiishida/pconf. |
| Open Datasets | Yes | Fashion-MNIST: The Fashion-MNIST dataset consists of 70,000 examples where each sample is a 28 × 28 gray-scale image (input dimension is 784), associated with a label from 10 fashion item classes. CIFAR-10: The CIFAR-10 dataset consists of 10 classes, with 5,000 images in each class. Each image is given in a 32 × 32 × 3 format. (A hedged data-loading sketch follows the table.) |
| Dataset Splits | No | The paper states that the dataset was "divided into four sub-datasets: a training set, a validation set, a test set, and a dataset for learning a probabilistic classifier to estimate positive-confidence" for benchmark experiments, but does not provide specific percentages or sample counts for these splits. For synthetic data, it specifies "500 positive data and 500 negative data were generated independently from each distribution for training" and "1,000 positive and 1,000 negative data were generated for testing," but does not mention a validation set. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, memory, or specific computing environments used for running the experiments. |
| Software Dependencies | No | The implementation is based on PyTorch [37], Sklearn [38], and mpmath [20]. mpmath: a Python library for arbitrary-precision floating-point arithmetic (version 0.18), December 2013. The versions of PyTorch and Sklearn are not specified. |
| Experiment Setup | Yes | Vanilla gradient descent with 5,000 epochs (full-batch size) and learning rate 0.001 was used for optimization. Dropout [47] with rate 50% after each fully-connected layer, and early-stopping with 20 epochs. Weight-decay candidates were chosen from {10⁻⁷, 10⁻⁴, 10⁻¹}. Adam [22] was again used for optimization with 200 epochs and mini-batch size 100. (A hedged sketch of these settings follows the table.) |
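The benchmark datasets quoted in the Open Datasets row are both publicly available and can be fetched directly in PyTorch. The following is a minimal loading sketch assuming torchvision's `datasets` API (torchvision is not listed among the paper's stated dependencies); it is only meant to make the quoted image shapes concrete.

```python
# Minimal sketch: loading Fashion-MNIST and CIFAR-10 with torchvision.
# torchvision is an assumption here; the paper only lists PyTorch, Sklearn, and mpmath.
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()

# Fashion-MNIST: 70,000 gray-scale images, each 28 x 28 (784 inputs when flattened).
fmnist_train = datasets.FashionMNIST("./data", train=True, download=True, transform=to_tensor)
fmnist_test = datasets.FashionMNIST("./data", train=False, download=True, transform=to_tensor)

# CIFAR-10: 10 classes with 5,000 training images per class, each 3 x 32 x 32.
cifar_train = datasets.CIFAR10("./data", train=True, download=True, transform=to_tensor)
cifar_test = datasets.CIFAR10("./data", train=False, download=True, transform=to_tensor)

x, _ = fmnist_train[0]
print(x.shape)  # torch.Size([1, 28, 28])
```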
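The Experiment Setup row quotes two optimizer configurations: full-batch vanilla gradient descent for the linear model on synthetic data, and Adam with dropout and a weight-decay grid for the deep-network benchmarks. The sketch below wires those hyperparameters into PyTorch; the input dimensions, network width, Adam learning rate, and the placeholder loss are all assumptions, since the table does not state them and the Pconf objective itself is not reproduced here.

```python
# Hedged sketch of the quoted optimization settings; the loss function and
# architecture details are placeholders, not the paper's Pconf objective.
import torch
import torch.nn as nn

# Synthetic-data setting: linear model, full-batch vanilla gradient descent,
# 5,000 epochs, learning rate 0.001.
linear_model = nn.Linear(2, 1)                      # 2-D synthetic inputs (assumed)
gd = torch.optim.SGD(linear_model.parameters(), lr=1e-3)

def train_full_batch(model, optimizer, x_pos, loss_fn, epochs=5000):
    """Full-batch gradient descent loop matching the quoted settings."""
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(x_pos))                # loss_fn stands in for the Pconf loss
        loss.backward()
        optimizer.step()

# Benchmark setting: fully-connected net with 50% dropout after each
# fully-connected layer; Adam, 200 epochs, mini-batch size 100, weight decay
# chosen from {1e-7, 1e-4, 1e-1}. Hidden width 100 and Adam's default learning
# rate are assumptions.
mlp = nn.Sequential(
    nn.Linear(784, 100), nn.ReLU(), nn.Dropout(p=0.5),
    nn.Linear(100, 100), nn.ReLU(), nn.Dropout(p=0.5),
    nn.Linear(100, 1),
)
adam = torch.optim.Adam(mlp.parameters(), weight_decay=1e-4)  # one value from the quoted grid
```

Early stopping with 20 epochs, also quoted above, would sit in the validation loop and is omitted from this sketch.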