Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Learning from Positive and Unlabeled Data with a Selection Bias
Authors: Masahiro Kato, Takeshi Teshima, Junya Honda
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through experiments, we show that the method outperforms previous methods for PU learning on various real-world datasets. |
| Researcher Affiliation | Academia | Masahiro Kato1,2, Takeshi Teshima1,2, and Junya Honda1,2 1The University of Tokyo, Tokyo, Japan 2RIKEN, Tokyo, Japan |
| Pseudocode | Yes | Algorithm 1 Conceptual Algorithm in Population; Algorithm 2 PUSB |
| Open Source Code | Yes | The source code is available at https://github.com/MasaKat0/PUlearning. |
| Open Datasets | Yes | We used seven classification datasets, mushrooms, shuttle, pageblocks, usps, connect-4, spambase, and MNIST, from the UCI repository, CIFAR-10, and a document dataset obtained from SwissProt... The UCI data were downloaded from https://archive.ics.uci.edu/ml/index.php and https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/. See https://www.cs.toronto.edu/~kriz/cifar.html. The data can be downloaded from http://www.cs.ucsd.edu/users/elkan/posonly. |
| Dataset Splits | Yes | For the linear models, hyperparameters were selected via cross-validation. |
| Hardware Specification | No | The paper does not specify any particular hardware components such as GPU models, CPU models, or memory details used for the experiments. It only mentions 'deep neural networks' in general. |
| Software Dependencies | No | The paper mentions using 'logistic regression', 'deep neural networks', 'ReLU activation', and 'Batch normalization', but it does not specify any software frameworks (e.g., TensorFlow, PyTorch) or their version numbers, nor other library dependencies with versions. |
| Experiment Setup | Yes | For the linear models, hyperparameters were selected via cross-validation. For MNIST, a 3-layer multilayer perceptron (MLP) with ReLU activation (Nair & Hinton, 2010) was used. For CIFAR-10, an all convolutional net (Springenberg et al., 2015) was used. Batch normalization (Ioffe & Szegedy, 2015) was applied before hidden layers. The model for this dataset was a 5-layer multilayer perceptron (MLP) with ReLU (more specifically, 78894-300-300-300-300-1). For the regularization term R, we used the ℓ2 norm of the parameters scaled by a positive scalar λ. |
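Since the paper names no software framework, the experiment setup above leaves the concrete model unspecified. As a minimal NumPy sketch, the 3-layer ReLU MLP with an ℓ2 parameter penalty scaled by λ could look like the following; the layer widths (300) and λ are illustrative assumptions, batch normalization is omitted for brevity, and `forward` / `l2_penalty` are hypothetical helper names, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    # ReLU activation used in the MLP described above
    return np.maximum(z, 0.0)

# Weights for input -> hidden -> hidden -> single score output.
# 784 matches flattened MNIST; 300 is an assumed hidden width.
W1 = rng.normal(scale=0.01, size=(784, 300))
W2 = rng.normal(scale=0.01, size=(300, 300))
W3 = rng.normal(scale=0.01, size=(300, 1))

def forward(x):
    # 3-layer MLP: two ReLU hidden layers, one linear output score
    h1 = relu(x @ W1)
    h2 = relu(h1 @ W2)
    return h2 @ W3

def l2_penalty(lam=1e-4):
    # R = lambda * ||theta||_2^2, the regularizer described in the table
    return lam * sum(np.sum(W ** 2) for W in (W1, W2, W3))

x = rng.normal(size=(8, 784))
scores = forward(x)
print(scores.shape)      # (8, 1)
print(l2_penalty() > 0)  # True
```

A real reproduction would also need the batch-normalization layers before each hidden layer and the cross-validated hyperparameter selection quoted in the table.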