Learning from Label Proportions: A Mutual Contamination Framework
Authors: Clayton Scott, Jianxin Zhang
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 Experiments |
| Researcher Affiliation | Academia | Clayton Scott and Jianxin Zhang Electrical Engineering and Computer Science University of Michigan Ann Arbor, MI 48109 {clayscot,jianxinz}@umich.edu |
| Pseudocode | Yes | Algorithm 1 Plug-in approach to LLP via LMMCM (outline) |
| Open Source Code | Yes | https://github.com/Z-Jianxin/Learning-from-Label-Proportions-A-Mutual-Contamination-Framework |
| Open Datasets | Yes | We consider the Adult (T = 8192) and MAGIC Gamma Ray Telescope (T = 6144) datasets (both available from the UCI repository) |
| Dataset Splits | Yes | the parameter λ ∈ {1, 10⁻¹, 10⁻², …, 10⁻⁵} is chosen by 5-fold cross validation. |
| Hardware Specification | No | For each dataset, our implementation runs all 8 settings in roughly 50 minutes using 48 cores. |
| Software Dependencies | No | Our Python implementation uses SciPy's L-BFGS routine to find the optimal αᵢ. |
| Experiment Setup | Yes | We implement a method based on our general approach (see Algorithm 1) by taking ℓ to be the logistic loss, F to be the RKHS associated to a Gaussian kernel k, and selecting f ∈ F by minimizing Ê_w(f) + λ‖f‖²_F. ... The kernel parameter is computed as 1/(d · Var(X)), where d is the number of features and Var(X) is the variance of the data matrix, and the parameter λ ∈ {1, 10⁻¹, 10⁻², …, 10⁻⁵} is chosen by 5-fold cross validation. |