Learning from Label Proportions: Bootstrapping Supervised Learners via Belief Propagation
Authors: Shreyas Havaldar, Navodita Sharma, Shubhi Sareen, Karthikeyan Shanmugam, Aravindan Raghuveer
ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our algorithm displays strong gains against several SOTA baselines (up to 15%) for the LLP binary classification problem on various dataset types, tabular and image. We achieve these improvements with minimal computational overhead above standard supervised learning due to Belief Propagation, even for large bag sizes and up to a million samples. We perform extensive experimentation on four datasets. |
| Researcher Affiliation | Industry | Shreyas Havaldar¹, Navodita Sharma¹, Shubhi Sareen², Karthikeyan Shanmugam¹, Aravindan Raghuveer¹ — ¹Google Research India, ²Google India |
| Pseudocode | Yes | Algorithm 1 Iterative Embedding Refinement with BP and Aggregate Embedding |
| Open Source Code | No | We will soon publicly release the source code. |
| Open Datasets | Yes | We perform extensive experimentation on four datasets: 1. Adult Income (Dua & Graff, 2017) (Kohavi et al., 1996); 2. Bank Marketing (Dua & Graff, 2017) (Moro et al., 2011); 3. Criteo (Jean-Baptiste Tien, 2014); 4. CIFAR-10 (Krizhevsky, 2009) |
| Dataset Splits | Yes | The dataset is split 90-10 as train-test and 10% of train is used as a hold-out validation set following Yoon et al. (2020)... We use 10% of the data from the Train Set as Validation Set to tune our hyperparameters. |
| Hardware Specification | Yes | All experiments were performed on a single NVIDIA V100 GPU. For smaller datasets like Adult with 50k samples, even with a bag size as large as 2048, the BP stage takes only 1054s on a NVIDIA P100 GPU. |
| Software Dependencies | No | We use the PGMax package (Zhou et al., 2022), implemented in JAX (Bradbury et al., 2018), where we just need to specify the potentials J_{i,j} and h_i. The paper mentions software by name but does not provide specific version numbers for PGMax or JAX. A hedged PGMax usage sketch is given after this table. |
| Experiment Setup | Yes | We optimize our algorithm using hyperparameters λ_s, λ_b ∈ [10⁻⁴, 200], k ∈ [1, 30], T ∈ {50, 100, 200}, τ ∈ (0, 1), MLP LR ∈ [10⁻⁶, 1], MLP WD ∈ [10⁻¹², 10⁻¹], λ_a ∈ [0, 10], δ_d ∈ [10⁻⁴, 1], Batch Size_train ∈ {2, 4, 8, ..., 4096, 8192}, tuned using Vizier (Song et al., 2022) to achieve the best validation AUC score. Illustrative values of the best hyperparameters for various experiments are given in appendix section A.6. A hedged Vizier tuning sketch follows the PGMax example below. |
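
The paper states that the BP stage only requires supplying pairwise potentials J_{i,j} and unary potentials h_i to PGMax. The sketch below shows one plausible way to specify such Ising-style potentials and run loopy BP with the PGMax/JAX API; the chain topology, random potential values, and problem size are illustrative assumptions, not the paper's actual graph construction.

```python
# Minimal sketch: Ising-style potentials J_ij and h_i in PGMax, then loopy BP.
# Topology and potential values are toy assumptions for illustration only.
import numpy as np
from pgmax import fgraph, fgroup, infer, vgroup

num_vars = 8  # toy problem size (assumption)
rng = np.random.default_rng(0)

# Binary label variables y_i in {0, 1}.
labels = vgroup.NDVarArray(num_states=2, shape=(num_vars,))
fg = fgraph.FactorGraph(variable_groups=labels)

# Unary potentials h_i: log-potential 0 for state 0 and h_i for state 1.
h = rng.normal(size=num_vars)
unaries = fgroup.EnumFactorGroup(
    variables_for_factors=[[labels[i]] for i in range(num_vars)],
    factor_configs=np.arange(2)[:, None],
    log_potentials=np.stack([np.zeros(num_vars), h], axis=1),
)

# Pairwise potentials J_ij on a chain: reward J_ij when y_i == y_j.
edges = [(i, i + 1) for i in range(num_vars - 1)]
J = rng.uniform(size=len(edges))
log_potential_matrix = np.zeros((len(edges), 2, 2))
log_potential_matrix[:, 0, 0] = J
log_potential_matrix[:, 1, 1] = J
pairwise = fgroup.PairwiseFactorGroup(
    variables_for_factors=[[labels[i], labels[j]] for i, j in edges],
    log_potential_matrix=log_potential_matrix,
)
fg.add_factors([unaries, pairwise])

# Run sum-product BP and read off per-variable marginals (soft labels).
bp = infer.BP(fg.bp_state, temperature=1.0)
bp_arrays = bp.run_bp(bp.init(), num_iters=100, damping=0.5)
marginals = infer.get_marginals(bp.get_beliefs(bp_arrays))[labels]
print(marginals)  # shape (num_vars, 2): belief per variable and state
```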
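For the tuning row above, here is a comparably minimal sketch of searching two of the listed hyperparameter ranges (MLP LR and MLP WD) with open-source Vizier to maximize validation AUC. The `train_and_eval` helper, study identifiers, and trial budget are hypothetical placeholders, not the paper's actual tuning harness.

```python
# Minimal sketch: tuning MLP LR and MLP WD with OSS Vizier for best val AUC.
from vizier.service import clients
from vizier.service import pyvizier as vz

def train_and_eval(lr: float, wd: float) -> float:
    """Hypothetical stand-in: train the model and return validation AUC."""
    return 1.0 - abs(lr - 1e-3) - wd  # dummy objective for illustration

study_config = vz.StudyConfig(algorithm='GAUSSIAN_PROCESS_BANDIT')
root = study_config.search_space.root
# Log-scaled ranges taken from the experiment-setup row above.
root.add_float_param('mlp_lr', 1e-6, 1.0, scale_type=vz.ScaleType.LOG)
root.add_float_param('mlp_wd', 1e-12, 1e-1, scale_type=vz.ScaleType.LOG)
study_config.metric_information.append(
    vz.MetricInformation('val_auc', goal=vz.ObjectiveMetricGoal.MAXIMIZE))

study = clients.Study.from_study_config(
    study_config, owner='llp', study_id='llp_bp_tuning')
for _ in range(20):  # trial budget is an arbitrary placeholder
    for suggestion in study.suggest(count=1):
        params = suggestion.parameters
        auc = train_and_eval(params['mlp_lr'], params['mlp_wd'])
        suggestion.complete(vz.Measurement({'val_auc': auc}))
```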