Learning from Label Proportions: Bootstrapping Supervised Learners via Belief Propagation

Authors: Shreyas Havaldar, Navodita Sharma, Shubhi Sareen, Karthikeyan Shanmugam, Aravindan Raghuveer

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our algorithm displays strong gains against several SOTA baselines (up to 15%) for the LLP binary classification problem on various dataset types, tabular and image. We achieve these improvements with minimal computational overhead over standard supervised learning due to Belief Propagation, even at large bag sizes and for up to a million samples. We perform extensive experimentation on four datasets.
Researcher Affiliation | Industry | Shreyas Havaldar (Google Research India), Navodita Sharma (Google Research India), Shubhi Sareen (Google India), Karthikeyan Shanmugam (Google Research India), Aravindan Raghuveer (Google Research India)
Pseudocode | Yes | Algorithm 1: Iterative Embedding Refinement with BP and Aggregate Embedding
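
The paper's Algorithm 1 is not reproduced here, so the following is a toy paraphrase of the loop its name suggests: alternate between supervised training on pseudo-labels and a relabeling step that keeps pseudo-labels consistent with the known bag proportions. The per-bag thresholding below is a deliberate stand-in for the belief propagation step (a BP decode with PGMax is sketched further below); all names, sizes, and values are illustrative.

```python
# Toy paraphrase of an iterative pseudo-label refinement loop.
# NOT the authors' Algorithm 1: per-bag thresholding stands in for the
# BP step, and all sizes/values here are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))                   # toy instances
true_y = (X[:, 0] > 0).astype(int)              # hidden instance labels
bags = np.array_split(np.arange(100), 10)       # 10 bags of 10 instances
proportions = [true_y[b].mean() for b in bags]  # only bag-level supervision

# Initialize pseudo-labels by sampling each bag at its label proportion.
pseudo = np.concatenate(
    [(rng.random(len(b)) < p).astype(int) for b, p in zip(bags, proportions)]
)

for _ in range(3):                              # refinement rounds
    clf = LogisticRegression().fit(X, pseudo)   # supervised step
    scores = clf.predict_proba(X)[:, 1]
    # Proportion-consistent relabeling (stand-in for the BP decode): within
    # each bag, the top-p fraction of instances by score become positive.
    for b, p in zip(bags, proportions):
        k = int(round(p * len(b)))
        order = b[np.argsort(-scores[b])]
        pseudo[order[:k]] = 1
        pseudo[order[k:]] = 0

print("accuracy of final model vs. hidden labels:", clf.score(X, true_y))
```
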
Open Source Code | No | We will soon publicly release the source code.
Open Datasets | Yes | We perform extensive experimentation on four datasets: 1. Adult Income (Dua & Graff, 2017; Kohavi et al., 1996); 2. Bank Marketing (Dua & Graff, 2017; Moro et al., 2011); 3. Criteo (Jean-Baptiste Tien, 2014); 4. CIFAR-10 (Krizhevsky, 2009).
Dataset Splits | Yes | The dataset is split 90-10 as train-test, and 10% of the train set is used as a hold-out validation set, following Yoon et al. (2020)... We use 10% of the data from the Train Set as the Validation Set to tune our hyperparameters.
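
A minimal sketch of the quoted split protocol, assuming standard scikit-learn utilities (the data and random seed below are illustrative, not the paper's):

```python
# 90-10 train-test split, then 10% of train held out for validation,
# matching the protocol quoted above. Data and random_state are toy values.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.default_rng(0).normal(size=(1000, 8))  # toy features
y = (X[:, 0] > 0).astype(int)                        # toy labels

X_train_full, X_test, y_train_full, y_test = train_test_split(
    X, y, test_size=0.10, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_train_full, y_train_full, test_size=0.10, random_state=0)
```
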
Hardware Specification | Yes | All experiments were performed on a single NVIDIA V100 GPU. For smaller datasets like Adult with 50k samples, even at a bag size as large as 2048, the BP stage takes only 1054s on an NVIDIA P100 GPU.
Software Dependencies | No | We use the PGMax package (Zhou et al., 2022), implemented in JAX (Bradbury et al., 2018), where we just need to specify the potentials J_ij and h_i. The paper mentions software by name but does not provide specific version numbers for PGMax or JAX.
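
Because no versions are pinned, the exact API may differ across PGMax releases. The sketch below shows, against a recent PGMax API, how pairwise potentials J_ij and unary potentials h_i can be specified and loopy BP run; the graph and potential values are toy stand-ins, not the paper's factor graph (which additionally encodes bag-proportion constraints).

```python
# Toy PGMax factor graph with pairwise potentials J_ij and unary
# potentials h_i. Graph structure and values are illustrative only.
import numpy as np
import jax
from pgmax import fgraph, fgroup, infer, vgroup

n = 8                                                  # toy number of instances
rng = np.random.default_rng(0)

labels = vgroup.NDVarArray(num_states=2, shape=(n,))   # binary label variables
fg = fgraph.FactorGraph(variable_groups=labels)

# Pairwise potentials J_ij: Ising-style coupling rewarding label agreement.
edges = [(i, j) for i in range(n) for j in range(i + 1, n)]
pairwise = fgroup.PairwiseFactorGroup(
    variables_for_factors=[[labels[i], labels[j]] for i, j in edges],
    log_potential_matrix=0.5 * np.array([[1.0, -1.0], [-1.0, 1.0]]),
)
fg.add_factors(pairwise)

# Unary potentials h_i enter as per-variable log evidence.
h = rng.normal(scale=0.1, size=(n, 2))

bp = infer.BP(fg.bp_state, temperature=0.0)
bp_arrays = bp.init(evidence_updates={labels: jax.device_put(h)})
bp_arrays = bp.run_bp(bp_arrays, num_iters=100)
beliefs = bp.get_beliefs(bp_arrays)
pseudo_labels = infer.decode_map_states(beliefs)[labels]  # MAP decode
print(pseudo_labels)
```
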
Experiment Setup | Yes | We optimize our algorithm using hyperparameters λ_s, λ_b ∈ [10^-4, 200], k ∈ [1, 30], T ∈ {50, 100, 200}, τ ∈ (0, 1), MLP_LR ∈ [10^-6, 1], MLP_WD ∈ [10^-12, 10^-1], λ_a ∈ [0, 10], δ_d ∈ [10^-4, 1], Batch Size_train ∈ {2, 4, 8, ..., 4096, 8192}, tuned using Vizier (Song et al., 2022) to achieve the best validation AUC score. Illustrative values of the best hyperparameters for various experiments are given in Appendix A.6.
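
The quoted search is run with Vizier (Song et al., 2022). A minimal sketch of such a sweep with the open-source Vizier client is shown below for two of the listed hyperparameters; the study identifiers and the placeholder objective are invented for illustration (the paper maximizes validation AUC of the trained model).

```python
# Minimal sketch of a hyperparameter sweep with the OSS Vizier client.
# Study names and the placeholder objective are illustrative only.
from vizier.service import clients
from vizier.service import pyvizier as vz

def train_and_eval(mlp_lr: float, mlp_wd: float) -> float:
    # Placeholder for training the model and returning validation AUC.
    return 1.0 - abs(mlp_lr - 1e-3) - mlp_wd

study_config = vz.StudyConfig(algorithm='GAUSSIAN_PROCESS_BANDIT')
study_config.search_space.root.add_float_param(
    'mlp_lr', 1e-6, 1.0, scale_type=vz.ScaleType.LOG)
study_config.search_space.root.add_float_param(
    'mlp_wd', 1e-12, 1e-1, scale_type=vz.ScaleType.LOG)
study_config.metric_information.append(
    vz.MetricInformation('val_auc', goal=vz.ObjectiveMetricGoal.MAXIMIZE))

study = clients.Study.from_study_config(
    study_config, owner='demo', study_id='llp_bp_tuning')
for _ in range(10):
    for suggestion in study.suggest(count=1):
        params = suggestion.parameters
        auc = train_and_eval(params['mlp_lr'], params['mlp_wd'])
        suggestion.complete(vz.Measurement({'val_auc': auc}))
```
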