Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Bandit Learning in Decentralized Matching Markets
Authors: Lydia T. Liu, Feng Ruan, Horia Mania, Michael I. Jordan
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our theoretical guarantee on the performance of CA-UCB exhibits an exponential dependence on the size of the market. In Section 7 we show empirically that this dependence is an artifact of our analysis; CA-UCB performs much better in practice than these results suggest. In this section, through empirical evaluations we show that the true performance of our proposed method is likely better than our guarantee suggests for markets with randomly drawn preferences. More precisely, we perform two sets of simulations. [...] For all experiments we use Algorithm 1 with delay probability λ = 0.1. [...] As can be seen in Figure 1, both the average regret and the market stability converge more slowly for larger markets. However, the dependence on N appears to be much better than exponential. [...] As can be seen, there is no discernible difference in the convergence of Algorithm 1 in terms of regret or market stability, for markets with different levels of preference heterogeneity. |
| Researcher Affiliation | Academia | Lydia T. Liu, Feng Ruan, Horia Mania, Michael I. Jordan; Department of Electrical Engineering and Computer Sciences and Department of Statistics, University of California, Berkeley, CA 94720-1776, USA |
| Pseudocode | Yes | Algorithm 1 CA-UCB with random delays |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code, a link to a repository, or mention of code in supplementary materials. |
| Open Datasets | No | We examine balanced markets of size N ∈ {5, 10, 15, 20}, and sample each player's and arm's ordinal preferences uniformly at random. For all players the reward gaps between consecutively ranked arms are chosen to be equal to Δ = 1, regardless of the market size. The rewards are normally distributed with unit variance. We sampled ten markets as such, and ran Algorithm 1 once on each market. |
| Dataset Splits | No | The paper describes simulating data by sampling 'ten markets' and running Algorithm 1 on each. It does not mention any train/test/validation splits of a dataset in the traditional sense. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the simulations or experiments. |
| Software Dependencies | No | The paper does not provide any specific software names with version numbers used for the experiments. |
| Experiment Setup | Yes | For all experiments we use Algorithm 1 with delay probability λ = 0.1. We now present the details of our simulations. Varying the size of the market. We examine balanced markets of size N ∈ {5, 10, 15, 20}, and sample each player's and arm's ordinal preferences uniformly at random. For all players the reward gaps between consecutively ranked arms are chosen to be equal to Δ = 1, regardless of the market size. The rewards are normally distributed with unit variance. [...] for horizon T up to 5000. [...] The parameter β > 0 determines the degree of correlation between the players' preferences. As β increases the correlation between the players' preferences also increases. |
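Since the paper releases no code, the sampling procedure quoted above (uniformly random ordinal preferences, consecutive-rank reward gaps of Δ = 1, unit-variance Gaussian rewards) can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation; the function names `sample_market` and `pull` and the return layout are our own assumptions.

```python
import numpy as np

def sample_market(N, gap=1.0, rng=None):
    """Sample one balanced market of N players and N arms, following the
    simulation setup quoted from the paper (a sketch, not the authors' code):
    ordinal preferences drawn uniformly at random for every player and arm,
    and each player's mean rewards spaced by a constant gap (Delta = 1)
    according to that player's own ranking of the arms."""
    rng = np.random.default_rng(rng)
    # Row i: player i's preference order over arms, best arm first.
    player_prefs = np.array([rng.permutation(N) for _ in range(N)])
    # Row j: arm j's preference order over players, best player first.
    arm_prefs = np.array([rng.permutation(N) for _ in range(N)])
    # means[i, a] = mean reward player i receives from arm a; the arm a
    # player ranks at position r gets mean gap * (N - 1 - r), so
    # consecutively ranked arms differ by exactly `gap`.
    means = np.empty((N, N))
    for i in range(N):
        for rank, arm in enumerate(player_prefs[i]):
            means[i, arm] = gap * (N - 1 - rank)
    return player_prefs, arm_prefs, means

def pull(means, player, arm, rng=None):
    """Observed reward: Gaussian with unit variance around the mean."""
    rng = np.random.default_rng(rng)
    return rng.normal(means[player, arm], 1.0)
```

Running a full experiment would additionally require an implementation of Algorithm 1 (CA-UCB with random delays, λ = 0.1) over a horizon of T up to 5000, which the paper specifies only in pseudocode.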