reproducibilityindex.ai

Learning from Label Proportions: A Mutual Contamination Framework

Authors: Clayton Scott, Jianxin Zhang

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	4 Experiments
Researcher Affiliation	Academia	Clayton Scott and Jianxin Zhang, Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, {clayscot,jianxinz}@umich.edu
Pseudocode	Yes	Algorithm 1 Plug-in approach to LLP via LMMCM (outline)
Open Source Code	Yes	https://github.com/Z-Jianxin/Learning-from-Label-Proportions-A-Mutual-Contamination-Framework
Open Datasets	Yes	We consider the Adult (T = 8192) and MAGIC Gamma Ray Telescope (T = 6144) datasets (both available from the UCI repository)
Dataset Splits	Yes	the parameter $\lambda \in \{1, 10^{-1}, 10^{-2}, \ldots, 10^{-5}\}$ is chosen by 5-fold cross validation.
Hardware Specification	No	For each dataset, our implementation runs all 8 settings in roughly 50 minutes using 48 cores.
Software Dependencies	No	Our Python implementation uses SciPy's L-BFGS routine to find the optimal $\alpha_i$.
Experiment Setup	Yes	We implement a method based on our general approach (see Algorithm 1) by taking $\ell$ to be the logistic loss, $\mathcal{F}$ to be the RKHS associated to a Gaussian kernel $k$, and selecting $f \in \mathcal{F}$ by minimizing $\widehat{E}_w(f) + \lambda \|f\|_{\mathcal{F}}^2$. ... The kernel parameter is computed by $\frac{1}{d \cdot \mathrm{Var}(X)}$ where $d$ is the number of features and $\mathrm{Var}(X)$ is the variance of the data matrix, and the parameter $\lambda \in \{1, 10^{-1}, 10^{-2}, \ldots, 10^{-5}\}$ is chosen by 5-fold cross validation.
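
The experiment setup quoted in the last row can be illustrated with a minimal Python sketch. It is not the authors' implementation: the objective below is ordinary kernel logistic regression on fully labeled points, standing in for the paper's weighted risk, and validation uses true labels, which a real label-proportion setting would not have. The names `gaussian_kernel`, `kernel_parameter`, `fit`, and `select_lambda` are hypothetical. The sketch only shows the kernel parameter 1/(d · Var(X)), the λ grid, SciPy's L-BFGS solver, and 5-fold cross validation fitting together.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

# Grid quoted in the table: 1, 10^-1, ..., 10^-5.
LAMBDA_GRID = [10.0 ** (-p) for p in range(6)]


def gaussian_kernel(A, B, gamma):
    """k(a, b) = exp(-gamma * ||a - b||^2) for every row pair of A and B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def kernel_parameter(X):
    """1 / (d * Var(X)), with Var(X) taken over all entries of X."""
    return 1.0 / (X.shape[1] * X.var())


def fit(K, y, lam):
    """Minimize mean logistic loss + lam * alpha^T K alpha with L-BFGS.

    Stand-in objective (assumption): labels y in {-1, +1} are observed
    directly, unlike the weighted risk used in the paper.
    """
    n = K.shape[0]

    def objective(alpha):
        margins = y * (K @ alpha)
        loss = np.logaddexp(0.0, -margins).mean()
        # Derivative of log(1 + exp(-m)) w.r.t. f is -y * sigmoid(-m).
        grad_f = -y * expit(-margins) / n
        grad = K @ grad_f + 2.0 * lam * (K @ alpha)
        return loss + lam * alpha @ K @ alpha, grad

    res = minimize(objective, np.zeros(n), jac=True, method="L-BFGS-B")
    return res.x


def select_lambda(X, y, n_folds=5, seed=0):
    """Pick lambda from LAMBDA_GRID by n_folds-fold cross validation."""
    gamma = kernel_parameter(X)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    errors = []
    for lam in LAMBDA_GRID:
        fold_err = []
        for i in range(n_folds):
            val = folds[i]
            train = np.concatenate([folds[j] for j in range(n_folds) if j != i])
            alpha = fit(gaussian_kernel(X[train], X[train], gamma), y[train], lam)
            pred = np.sign(gaussian_kernel(X[val], X[train], gamma) @ alpha)
            fold_err.append(np.mean(pred != y[val]))
        errors.append(np.mean(fold_err))
    return LAMBDA_GRID[int(np.argmin(errors))], gamma
```

Calling `select_lambda(X, y)` on a feature matrix `X` and a ±1 label vector `y` returns the chosen λ and the kernel parameter. Expressing the solution through coefficients on the kernel matrix keeps the RKHS norm penalty as the simple quadratic form αᵀKα, which is convenient for a gradient-based solver such as L-BFGS.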