Pure Exploration and Regret Minimization in Matching Bandits
Authors: Flore Sentenac, Jialin Yi, Clement Calauzenes, Vianney Perchet, Milan Vojnovic
ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we present numerical results in Section 5. They validate the tightness of our results and demonstrate competitiveness and performance gains obtained by our proposed algorithms over some state-of-the-art baseline algorithms. |
| Researcher Affiliation | Collaboration | 1CREST, ENSAE Paris, Palaiseau, France 2London School of Economics, London, UK 3Criteo AI Lab, Paris, France 4CREST, ENSAE Paris, Palaiseau, France & Criteo AI Lab, Paris, France. |
| Pseudocode | Yes | Algorithm Sketch 1 PAIR-ELIM, Algorithm Sketch 2 PAIR-ELIM-MONO, Algorithm Sketch 3 MATCHING-ID, and Algorithm 4 SIMPLE-ADAPTIVE-MATCHING are all present in the paper. |
| Open Source Code | Yes | All the code used for obtaining the results in this section is available from this public Gitlab repository: [anonymized] |
| Open Datasets | No | The experiments are conducted on 'problem instances' generated based on specified parameters (N, u1, etc.) and Bernoulli variables, rather than a named, publicly accessible dataset. There is no link, DOI, or formal citation provided for a public dataset. |
| Dataset Splits | No | The paper defines 'problem instances' for its simulations but does not describe specific train, validation, or test dataset splits (e.g., percentages or sample counts) for reproducibility of data partitioning. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies or libraries with version numbers (e.g., 'Python 3.8, PyTorch 1.9'). |
| Experiment Setup | Yes | We consider the bipartite case with N = M. Each problem instance is defined by a tuple (N, u1, ), where 0 u1 1, and assuming that u1 = v1. and The results are shown in Figure 1. We have further evaluated the effects of varying u1 which is discussed in Appendix A.1. and We consider a set of problem instances, each is defined by a tuple (N, ), where 0 < (N 1) 1 and is the gap between parameter values of adjacent matched pairs in the optimum matching, so that = min. The parameter values are defined by u2i 1 = u2i = (N i) for i [N]. and The results shown in Figure 2 suggest that the cumulative regret of SIMPLE-ADAPTIVE-MATCHING scales as (1/ min)N log(N) as established in Theorem 4.3. We next compare the performance of our algorithm and ESCB. We consider problem instances defined by a tuple (N, µ, ) where µ is the mean of parameter values and is the gap between parameter values. The values of item parameters are set as u2i 1 = u2i = µ+(N+1 2i) /2 [0, 1] for i [N]. Such problem instances allow us to vary µ while keeping other parameters fixed; we fix N = 4 and = 0.1. |