Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
A Unified View of Label Shift Estimation
Authors: Saurabh Garg, Yifan Wu, Sivaraman Balakrishnan, Zachary Lipton
NeurIPS 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on synthetic data, MNIST, and CIFAR10 support our findings. |
| Researcher Affiliation | Academia | Saurabh Garg, Yifan Wu, Sivaraman Balakrishnan, Zachary C. Lipton; Machine Learning Department, Department of Statistics and Data Science, Carnegie Mellon University |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper provides links to the publicly available code for BBSE and RLLS (baselines used for comparison), but not for the MLLS methodology primarily described in this paper. |
| Open Datasets | Yes | We validate our results on synthetic data, MNIST, and CIFAR-10. |
| Dataset Splits | Yes | With CIFAR10 and MNIST, we split the full training set into two subsets: train and valid, and use the provided test set as is. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'ResNet-18' and the 'pytorch-cifar' implementation but does not specify version numbers for any software dependencies. |
| Experiment Setup | Yes | For GMM, we control the shift in the label marginal for class 1 with a fixed target sample size of 1000. For multiclass problems (MNIST and CIFAR-10), we control the Dirichlet shift parameter with a fixed sample size of 5000. For GMM, we fix the label marginal for class 1 at 0.01, whereas for multiclass problems (MNIST and CIFAR-10), we fix the Dirichlet parameter to 0.1. |
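
The Dirichlet shift described in the Experiment Setup row can be illustrated with a minimal sketch. It is not the authors' code, which the table notes is not released. The sketch assumes the target label marginal is drawn from a symmetric Dirichlet distribution with parameter 0.1 and that 5000 target examples are then resampled with replacement to match that marginal. The function names `sample_dirichlet` and `resample_with_label_shift`, the stand-in label list, and the with-replacement choice are all assumptions made for illustration.

```python
import random

def sample_dirichlet(alpha, num_classes, rng):
    """Draw one probability vector from a symmetric Dirichlet(alpha)."""
    # A Dirichlet draw is a vector of independent Gamma(alpha, 1) draws, normalized.
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(num_classes)]
    total = sum(draws)
    if total == 0.0:
        # Guard against numerical underflow for very small alpha.
        return [1.0 / num_classes] * num_classes
    return [d / total for d in draws]

def resample_with_label_shift(labels, target_marginal, sample_size, rng):
    """Return indices into `labels` whose class frequencies follow target_marginal.

    Assumption: sampling is done with replacement within each class.
    """
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    classes = sorted(by_class)
    # First draw a class for every target example, then pick a random member of it.
    sampled = rng.choices(classes, weights=target_marginal, k=sample_size)
    return [rng.choice(by_class[c]) for c in sampled]

rng = random.Random(0)
num_classes = 10
# Hypothetical stand-in for the labels of a 10-class test set such as MNIST or CIFAR-10.
labels = [i % num_classes for i in range(10000)]

target_marginal = sample_dirichlet(0.1, num_classes, rng)  # Dirichlet parameter from the table
indices = resample_with_label_shift(labels, target_marginal, 5000, rng)  # sample size from the table
```

A small Dirichlet parameter such as 0.1 concentrates most of the probability mass on a few classes, so the resampled target set is heavily skewed relative to the balanced source labels. Larger parameters would give a milder shift.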