Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Putting Gale & Shapley to Work: Guaranteeing Stability Through Learning

Authors: Hadi Hosseini, Sanjukta Roy, Duohan Zhang

NeurIPS 2024 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, our empirical results demonstrate intriguing tradeoffs between stability and optimality of the proposed algorithms, further complementing our theoretical findings." "Lastly, we validate our theoretical findings using empirical simulations (Section 6)."
Researcher Affiliation | Academia | Hadi Hosseini, Penn State University, USA (EMAIL); Sanjukta Roy, University of Leeds, UK (EMAIL); Duohan Zhang*, Penn State University, USA (EMAIL)
Pseudocode | Yes | Algorithm 1: Uniform sampling algorithm; Algorithm 2: Arm elimination algorithm; Algorithm 3: AE arm-DA algorithm
Open Source Code | No | The paper does not contain an explicit statement or link in the main body to open-source code for the described methodology. While the NeurIPS checklist mentions code in supplementary material, this information is not present in the provided paper text.
Open Datasets | No | "we consider N = K = 20 and randomly generate preferences. In particular, we follow a similar experiment setting in Liu et al. [2021]: for each i, the true utilities {µ_{i,1}, µ_{i,2}, . . . , µ_{i,20}} are randomized permutations of the sequence {1, 2, . . . , 20} so that the minimum preference gap is fixed (Δ = 1) and algorithm performance exhibits relatively low variability. Arms' preferences are generated the same way."
Dataset Splits | No | The paper describes generating synthetic data for simulations but does not specify explicit training, validation, or test dataset splits.
Hardware Specification | No | The experiments are in the bandit domain and the algorithms can be run on a typical personal computer; minimal compute resources are required to reproduce the experiments in the paper.
Software Dependencies | No | The paper does not list specific software names with version numbers used for the experiments.
Experiment Setup | Yes | "For this, we consider N = K = 20 and randomly generate preferences. In particular, we follow a similar experiment setting in Liu et al. [2021]: for each i, the true utilities {µ_{i,1}, µ_{i,2}, . . . , µ_{i,20}} are randomized permutations of the sequence {1, 2, . . . , 20} so that the minimum preference gap is fixed (Δ = 1) and algorithm performance exhibits relatively low variability. Arms' preferences are generated the same way. We conduct 200 independent simulations, with each simulation featuring a randomized true preference profile."
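The quoted experiment setup (N = K = 20, utilities drawn as random permutations of {1, . . . , 20}, minimum preference gap Δ = 1) can be sketched with a short generator. This is a minimal illustration of that setup, not the authors' code; the function name and interface are assumptions.

```python
import numpy as np

def generate_profile(n=20, k=20, rng=None):
    """Generate a synthetic preference profile as described in the quoted
    setup: each agent's true utilities over the k arms are a random
    permutation of 1..k, so the minimum preference gap is exactly 1.
    Arms' utilities over the n agents are generated the same way."""
    rng = np.random.default_rng(rng)
    # agent_utils[i, j]: agent i's true utility for arm j
    agent_utils = np.array([rng.permutation(np.arange(1, k + 1)) for _ in range(n)])
    # arm_utils[j, i]: arm j's utility for agent i
    arm_utils = np.array([rng.permutation(np.arange(1, n + 1)) for _ in range(k)])
    return agent_utils, arm_utils
```

Because each row is a permutation of consecutive integers, the gap between any two adjacent utilities is fixed at 1, matching the setup's claim of low performance variability across the 200 independent simulations.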
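The pseudocode row lists an "AE arm-DA" algorithm, i.e. a bandit variant built on deferred acceptance. For context, a minimal sketch of the classic Gale–Shapley deferred-acceptance procedure with known, complete preference lists might look like the following; the function name and dict-of-lists encoding are illustrative, not taken from the paper, and the bandit algorithms above additionally learn these preferences from samples.

```python
from collections import deque

def deferred_acceptance(proposer_prefs, receiver_prefs):
    """Classic Gale-Shapley: proposers propose in preference order;
    each receiver tentatively holds the best proposal seen so far.
    Both arguments map an agent to its full ranked list of the other side."""
    # rank[r][p]: position of proposer p in receiver r's list (lower is better)
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    free = deque(proposer_prefs)           # proposers currently unmatched
    next_choice = {p: 0 for p in proposer_prefs}
    match = {}                             # receiver -> tentatively held proposer
    while free:
        p = free.popleft()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        if r not in match:
            match[r] = p                   # receiver was free: hold p
        elif rank[r][p] < rank[r][match[r]]:
            free.append(match[r])          # displaced proposer re-enters the queue
            match[r] = p
        else:
            free.append(p)                 # rejected: p will try its next choice
    return {p: r for r, p in match.items()}
```

With complete lists on both sides, this terminates with a stable matching that is optimal for the proposing side, which is the baseline the learning algorithms in the table are compared against.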