Mixture Proportion Estimation and PU Learning: A Modern Approach
Authors: Saurabh Garg, Yifan Wu, Alexander J. Smola, Sivaraman Balakrishnan, Zachary Lipton
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Both methods dominate previous approaches empirically, and for BBE, we establish formal guarantees that hold whenever we can train a model to cleanly separate out a small subset of positive examples. Our final algorithm, (TED)^n, alternates between the two procedures, significantly improving both our mixture proportion estimator and classifier. We conduct a battery of experiments both to empirically validate our claim that BBE's assumptions are mild and frequently hold in practice, and to establish the outperformance of BBE, CVIR, and (TED)^n over the previous state of the art. We then conduct extensive experiments on semi-synthetic data, adapting a variety of binary classification datasets to the PU learning setup and demonstrating the superior performance of BBE and PU-learning with the CVIR objective. |
| Researcher Affiliation | Collaboration | Saurabh Garg¹, Yifan Wu¹, Alex Smola², Sivaraman Balakrishnan¹, Zachary C. Lipton¹ (¹Carnegie Mellon University, ²Amazon Web Services) |
| Pseudocode | Yes | Algorithm 1, Best Bin Estimation (BBE). Input: validation positive ($X_p$) and unlabeled ($X_u$) samples; blackbox classifier $\hat{f} : \mathcal{X} \to [0, 1]$; hyperparameters $0 < \delta, \gamma < 1$. Step 1: $Z_p, Z_u \leftarrow \hat{f}(X_p), \hat{f}(X_u)$. Step 2: $\hat{q}_p(z) = \frac{1}{n_p} \sum_{z_i \in Z_p} \mathbb{1}[z_i \geq z]$ and $\hat{q}_u(z) = \frac{1}{n_u} \sum_{z_i \in Z_u} \mathbb{1}[z_i \geq z]$ for all $z \in [0, 1]$. Step 3: estimate $\hat{c} = \arg\min_{c \in [0,1]} \left( \frac{\hat{q}_u(c)}{\hat{q}_p(c)} + \frac{1+\gamma}{\hat{q}_p(c)} \left( \sqrt{\frac{\log(4/\delta)}{2 n_u}} + \sqrt{\frac{\log(4/\delta)}{2 n_p}} \right) \right)$. Output: $\hat{\alpha} = \hat{q}_u(\hat{c}) / \hat{q}_p(\hat{c})$ (a runnable sketch appears after this table). |
| Open Source Code | Yes | Code is available at https://github.com/acmi-lab/PU_learning |
| Open Datasets | Yes | We simulate PU tasks on CIFAR-10 [24], MNIST [25], and IMDb sentiment analysis [32] datasets. We consider binarized versions of CIFAR-10 and MNIST. |
| Dataset Splits | Yes | For MPE, we use a held-out PU validation set: randomly split the positive and unlabeled data into training sets $(X_p^1, X_u^1)$ and hold-out sets $(X_p^2, X_u^2)$. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory, or cloud instance types) used for running the experiments. It mentions model architectures like ResNet and BERT but not the underlying hardware. |
| Software Dependencies | No | The paper mentions software like TensorFlow, PyTorch, Adam (optimizer), and BERT, but it does not specify the version numbers for these software components. For example, it cites the PyTorch paper but does not explicitly state the version used in the experiments. |
| Experiment Setup | No | The paper states: 'We did not tune hyperparameters or the optimization algorithm; instead, we use the same benchmarked hyperparameters and optimization algorithm for each dataset. For our method, we use cross-entropy loss. For uPU and nnPU, we use Adam [22] with sigmoid loss.' While it mentions the loss function and optimizer, it lacks specific numerical hyperparameter values (e.g., learning rate, batch size, number of epochs) and detailed training configurations required for reproducibility. |
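
For concreteness, below is a minimal NumPy sketch of the BBE procedure quoted in the Pseudocode row. It is an illustration under stated assumptions, not the authors' implementation (which lives in the linked repository): the function name `bbe_estimate`, the 1001-point threshold grid, the default `delta`/`gamma` values, and the synthetic Beta-distributed scores in the demo are all hypothetical choices.

```python
import numpy as np

def bbe_estimate(scores_pos, scores_unl, delta=0.1, gamma=0.01):
    """Sketch of Best Bin Estimation (BBE).

    scores_pos, scores_unl: blackbox classifier outputs f(x) in [0, 1] on the
    held-out positive (X_p^2) and unlabeled (X_u^2) validation samples.
    Returns an estimate of alpha, the fraction of positives in the unlabeled set.
    """
    scores_pos, scores_unl = np.asarray(scores_pos), np.asarray(scores_unl)
    n_p, n_u = len(scores_pos), len(scores_unl)

    # Step 2: empirical upper-tail CDFs q_p(c), q_u(c) on a threshold grid
    # (the paper minimizes over all c in [0, 1]; a fine grid approximates this).
    grid = np.linspace(0.0, 1.0, 1001)
    q_p = (scores_pos[None, :] >= grid[:, None]).mean(axis=1)
    q_u = (scores_unl[None, :] >= grid[:, None]).mean(axis=1)

    # Step 3: minimize the tail-mass ratio plus a confidence-width penalty.
    valid = q_p > 0  # avoid division by zero at very high thresholds
    slack = (np.sqrt(np.log(4 / delta) / (2 * n_u))
             + np.sqrt(np.log(4 / delta) / (2 * n_p)))
    objective = q_u[valid] / q_p[valid] + (1 + gamma) * slack / q_p[valid]
    c_hat = grid[valid][np.argmin(objective)]

    # Output: alpha-hat is the ratio of tail masses at the best threshold.
    return (scores_unl >= c_hat).mean() / (scores_pos >= c_hat).mean()

# Demo on synthetic scores: the unlabeled set is a 30/70 mixture of
# positive-like and negative-like scores, so the estimate should land near
# 0.3 (up to sampling noise and the estimator's slight upward bias).
rng = np.random.default_rng(0)
scores_pos = rng.beta(5, 2, size=2000)                    # stand-in for f(X_p^2)
scores_unl = np.concatenate([rng.beta(5, 2, size=1500),   # positives among unlabeled
                             rng.beta(2, 5, size=3500)])  # negatives among unlabeled
print(f"alpha-hat = {bbe_estimate(scores_pos, scores_unl):.3f}")
```

The vectorized tail-CDF computation mirrors Step 2 of the pseudocode, and the penalty term reflects the $\sqrt{\log(4/\delta)/2n}$ confidence widths in Step 3; for a faithful reproduction, defer to the authors' code at https://github.com/acmi-lab/PU_learning.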