Matching in Multi-arm Bandit with Collision

Authors: Yirui Zhang, Siwei Wang, Zhixuan Fang

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Figure 1 shows the average regret and the standard deviation of regret over 50 independent runs. From Figure 1, ML-ETC outperforms both Phased-ETC and CA-UCB from an asymptotic view."
Researcher Affiliation | Collaboration | Yirui Zhang (1), Siwei Wang (2), Zhixuan Fang (1, 3); (1) IIIS, Tsinghua University; (2) Microsoft Research; (3) Shanghai Qi Zhi Institute
Pseudocode | Yes | Algorithm 1: ML-ETC Algorithm
Open Source Code | Yes | "3. If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] See supplemental material."
Open Datasets | No | "We choose the time horizon to be T = 2.5 × 10^7, arms' mean utilities within [0.3, 0.6], and the minimal gap = 0.05. We have tested two cases with 5 agents and 5 arms but different preferences and utilities. To investigate the quality of the converging stable matching under different algorithms, we choose the arm preferences such that there exist multiple stable matchings between agents and arms (see Appendix for the implementation details)." The paper uses synthetically generated data based on specified parameters and does not provide access information for a public dataset.
Dataset Splits | No | The paper describes simulation experiments but does not provide specific training/validation/test dataset splits.
Hardware Specification | No | The paper does not specify the hardware used to run the experiments.
Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9).
Experiment Setup | Yes | "Setup. We choose the time horizon to be T = 2.5 × 10^7, arms' mean utilities within [0.3, 0.6], and the minimal gap = 0.05. We have tested two cases with 5 agents and 5 arms but different preferences and utilities... When ϵ is smaller, the duration of each exploration is shorter. Same as the simulation in [2], we choose ϵ = 0.2 in our simulation... Same as the simulation in [10], we choose the parameter λ of the delay probability to be λ = 0.1."
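The reported setup parameters can be collected into a small sketch. This is an illustrative reconstruction only: the `sample_utilities` helper and its evenly spaced utility grid are hypothetical, and the paper's exact preference lists and utility values are given in its appendix, not here.

```python
import numpy as np

# Parameters reported in the paper's setup section.
N_AGENTS, N_ARMS = 5, 5
HORIZON = int(2.5e7)      # T = 2.5 * 10^7
MIN_GAP = 0.05            # minimal gap between mean utilities
EPSILON = 0.2             # exploration parameter, chosen as in [2]
LAMBDA = 0.1              # delay-probability parameter, chosen as in [10]

def sample_utilities(n_agents, n_arms, low=0.3, high=0.6,
                     gap=MIN_GAP, rng=None):
    """Hypothetical helper: draw mean utilities in [low, high] so that
    each agent's arm utilities are separated by at least `gap`.
    An evenly spaced grid guarantees the gap; shuffling randomizes
    which arm gets which utility."""
    rng = np.random.default_rng(rng)
    utils = np.empty((n_agents, n_arms))
    grid = np.linspace(low, high, n_arms)   # spacing 0.075 >= 0.05 here
    for i in range(n_agents):
        row = grid.copy()
        rng.shuffle(row)
        utils[i] = row
    return utils
```

With 5 arms on [0.3, 0.6], the grid spacing is 0.075, so the minimal-gap constraint of 0.05 holds by construction.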
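For orientation on the algorithm family, the following is a generic single-agent explore-then-commit skeleton. It is a sketch of the ETC idea only, not the paper's ML-ETC, which operates in the multi-agent matching setting with collisions between agents.

```python
import numpy as np

def etc_single_agent(means, horizon, explore_rounds, rng=None):
    """Generic explore-then-commit: pull each arm `explore_rounds`
    times, then commit to the empirical best arm for the rest of the
    horizon. Returns the committed arm and the expected regret."""
    rng = np.random.default_rng(rng)
    k = len(means)
    counts = np.zeros(k)
    sums = np.zeros(k)
    regret = 0.0
    best = max(means)
    t = 0
    # Exploration phase: round-robin pulls of every arm.
    for _ in range(explore_rounds):
        for a in range(k):
            reward = rng.binomial(1, means[a])  # Bernoulli utility
            sums[a] += reward
            counts[a] += 1
            regret += best - means[a]
            t += 1
    # Commit phase: play the empirical best arm until the horizon.
    a_star = int(np.argmax(sums / counts))
    regret += (horizon - t) * (best - means[a_star])
    return a_star, regret
```

In ML-ETC the exploration length is governed by the parameter ϵ mentioned in the setup row (smaller ϵ means shorter exploration phases); the fixed `explore_rounds` above is a simplification.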