Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Putting Gale & Shapley to Work: Guaranteeing Stability Through Learning
Authors: Hadi Hosseini, Sanjukta Roy, Duohan Zhang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, our empirical results demonstrate intriguing tradeoffs between stability and optimality of the proposed algorithms, further complementing our theoretical findings. Lastly, we validate our theoretical findings using empirical simulations (Section 6). |
| Researcher Affiliation | Academia | Hadi Hosseini Penn State University, USA EMAIL Sanjukta Roy University of Leeds, UK EMAIL Duohan Zhang* Penn State University, USA EMAIL |
| Pseudocode | Yes | Algorithm 1: Uniform sampling algorithm; Algorithm 2: Arm elimination algorithm; Algorithm 3: AE arm-DA algorithm |
| Open Source Code | No | The paper does not contain an explicit statement or link in the main body to open-source code for the described methodology. While the NeurIPS checklist mentions code in supplementary material, this information is not present in the provided paper text. |
| Open Datasets | No | we consider N = K = 20 and randomly generate preferences. In particular, we follow a similar experiment setting in Liu et al. [2021]: for each i, the true utilities {μ_{i,1}, μ_{i,2}, ..., μ_{i,20}} are randomized permutations of the sequence {1, 2, ..., 20} so that the minimum preference gap is fixed (Δ = 1) and algorithm performance exhibits relatively low variability. Arms' preferences are generated the same way. |
| Dataset Splits | No | The paper describes generating synthetic data for simulations but does not specify explicit training, validation, or test dataset splits. |
| Hardware Specification | No | Experiments use bandit domain and algorithms can be run on a typical personal computer. Minimal compute resources are required to reproduce experiments in the paper. |
| Software Dependencies | No | The paper does not list specific software names with version numbers used for the experiments. |
| Experiment Setup | Yes | For this, we consider N = K = 20 and randomly generate preferences. In particular, we follow a similar experiment setting in Liu et al. [2021]: for each i, the true utilities {μ_{i,1}, μ_{i,2}, ..., μ_{i,20}} are randomized permutations of the sequence {1, 2, ..., 20} so that the minimum preference gap is fixed (Δ = 1) and algorithm performance exhibits relatively low variability. Arms' preferences are generated the same way. We conduct 200 independent simulations, with each simulation featuring a randomized true preference profile. |
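
The synthetic setup quoted in the Experiment Setup row can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names `random_preference_profile` and `min_preference_gap`, the use of Python's standard `random` module, and the seed value are all assumptions made here for demonstration. It only generates the randomized preference profiles (N = K = 20, 200 simulations) and does not implement any of the paper's algorithms.

```python
import random


def random_preference_profile(n_agents=20, n_arms=20, rng=None):
    """Draw one random true preference profile (hypothetical helper).

    Each agent's utilities over the arms are a random permutation of
    1..n_arms, and each arm's utilities over the agents are a random
    permutation of 1..n_agents, as described in the quoted setup.
    """
    rng = rng or random.Random()
    agent_utils = [rng.sample(range(1, n_arms + 1), n_arms) for _ in range(n_agents)]
    arm_utils = [rng.sample(range(1, n_agents + 1), n_agents) for _ in range(n_arms)]
    return agent_utils, arm_utils


def min_preference_gap(utils):
    """Smallest difference between any two utilities within a single row."""
    gaps = []
    for row in utils:
        ordered = sorted(row)
        gaps.extend(b - a for a, b in zip(ordered, ordered[1:]))
    return min(gaps)


# 200 independent simulations, each with a fresh random profile.
# The seed is an arbitrary choice made here for repeatability.
rng = random.Random(0)
profiles = [random_preference_profile(rng=rng) for _ in range(200)]
```

Because every row is a permutation of consecutive integers, `min_preference_gap` returns 1 for any profile produced this way, which matches the fixed minimum preference gap stated in the quoted setup.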