Exploiting structure of uncertainty for efficient matroid semi-bandits
Authors: Pierre Perrault, Vianney Perchet, Michal Valko
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide experiments for a graphic matroid, on a five-node complete graph, as did Combes et al. (2015). We thus have n = 10, m = 4. We consider two experiments. In the first one we use A = B, µ*_i = 1 + ∆·I{i ≤ m} for all i ∈ [n], and in the second, A = I, where we set µ*_i = ∆(2·I{i ≤ m − 1} − 1), i ∈ [n]. We take ∆ = 0.1, with rewards drawn from independent unit-variance Gaussian distributions. Figure 2 illustrates the comparison between CUCB and our implementations of ESCB (Combes et al., 2015) using Algorithm 3 (left) and Algorithm 2 (right, with ε = 0.1), showing the behavior of the regret vs. time horizon. We observe that although we are approximating the confidence region within a factor of at least 2 (and thus force the exploration), our efficient implementation outperforms CUCB. Indeed, we take advantage (gaining a factor m/log m) of the previously inefficient algorithm (that we made efficient) to use reward independence, which the more conservative CUCB cannot do. The latter algorithm still has a better per-round time complexity of O(n log m) and may be more practical on larger instances. (A hedged sketch of this experimental setting appears after the table.) |
| Researcher Affiliation | Collaboration | ¹SequeL team, INRIA Lille Nord Europe; ²CMLA, ENS Paris-Saclay; ³Criteo AI Lab; ⁴now with DeepMind. |
| Pseudocode | Yes | Algorithm 1, Generic confidence-region-based algorithm. At each round t: find a confidence region C_t ⊂ ℝⁿ; solve the bilinear program (µ_t, A_t) ∈ arg max_{µ ∈ C_t, A ∈ 𝒜} e_A^⊤ µ. (A hedged sketch of this structure appears after the table.) |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | No | The paper describes synthetic data generated for the experiments ('rewards drawn from independent unit variance Gaussian distributions') but does not provide access information (link, DOI, repository, or formal citation with author/year) for a publicly available or open dataset. |
| Dataset Splits | No | The paper does not specify exact training, validation, or test dataset splits. It mentions 'averaged over 100 independent simulations' for the synthetic data generation, but this does not describe a data partitioning methodology. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | We thus have n = 10, m = 4. We consider two experiments. In the first one we use A = B, µ*_i = 1 + ∆·I{i ≤ m} for all i ∈ [n], and in the second, A = I, where we set µ*_i = ∆(2·I{i ≤ m − 1} − 1), i ∈ [n]. We take ∆ = 0.1, with rewards drawn from independent unit-variance Gaussian distributions... using Algorithm 3 (left) and Algorithm 2 (right, with ε = 0.1) |
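
The pseudocode row quotes only the generic skeleton of Algorithm 1. The following is a minimal sketch of that structure under one simplifying assumption: a box confidence region, for which the bilinear program decouples and the matroid greedy oracle solves it exactly (a CUCB-style index). The paper's ESCB instead uses an ellipsoidal region, whose maximization is only approximated (within a factor of at most 2) by its Algorithms 2 and 3; that part is not reproduced here. The function names `solve_bilinear_box` and `greedy_uniform_matroid`, and the rank-3 uniform-matroid example, are illustrative choices, not the paper's.

```python
import numpy as np

# Sketch of Algorithm 1's per-round step, assuming a *box* confidence region
# C_t = prod_i [mu_hat_i, mu_hat_i + r_i].  The bilinear program
#   max_{mu in C_t, A in matroid} e_A^T mu
# then decouples: the inner maximum puts every coordinate of mu at its upper
# confidence bound, and the outer maximum over the matroid is solved exactly
# by the greedy oracle.  (The paper's ellipsoidal region does not decouple.)

def greedy_uniform_matroid(weights, m):
    """Illustrative oracle for the rank-m uniform matroid: take the m largest weights."""
    return np.argsort(-weights)[:m]

def solve_bilinear_box(mu_hat, radius, m):
    """Solve the bilinear program exactly for a box confidence region."""
    ucb = mu_hat + radius                 # coordinate-wise optimal mu in the box
    A = greedy_uniform_matroid(ucb, m)    # greedy is exact for linear objectives on a matroid
    return A, ucb[A].sum()

# Example: 6 arms, rank-3 uniform matroid (hypothetical numbers).
mu_hat = np.array([0.2, 0.9, 0.4, 0.8, 0.1, 0.5])
radius = np.array([0.3, 0.1, 0.4, 0.1, 0.6, 0.2])
A, value = solve_bilinear_box(mu_hat, radius, m=3)
print("chosen super-arm:", A, "optimistic value:", round(value, 2))
```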
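
For the experiment rows, the sketch below simulates the reported graphic-matroid setting (complete graph K5, n = 10 edges, rank m = 4, unit-variance Gaussian rewards, gap ∆ = 0.1, first mean configuration) with the CUCB baseline only; the paper's efficient ESCB implementation (Algorithms 2 and 3) is not reproduced. The horizon T, random seed, and the exploration constant 1.5 in the confidence radius are assumptions for illustration, not values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

edges = [(u, v) for u in range(5) for v in range(u + 1, 5)]  # edges of K5
n, m = len(edges), 4                                         # n = 10, m = 4
delta = 0.1
mu = np.array([1.0 + delta * (i < m) for i in range(n)])     # first setting: mu*_i = 1 + delta*I{i <= m}

def max_weight_spanning_tree(weights):
    """Greedy (Kruskal-style) oracle for the graphic matroid of K5."""
    parent = list(range(5))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    chosen = []
    for i in np.argsort(-weights):          # edges by decreasing weight
        u, v = edges[i]
        ru, rv = find(u), find(v)
        if ru != rv:                         # adding edge i keeps the set independent
            parent[ru] = rv
            chosen.append(i)
    return np.array(chosen)                  # a basis (spanning tree) of size m

T = 10_000                                   # assumed horizon
counts, sums = np.zeros(n), np.zeros(n)
regret = 0.0
opt = mu[max_weight_spanning_tree(mu)].sum()

for t in range(1, T + 1):
    # CUCB-style per-edge index; the 1.5 exploration constant is an assumption.
    safe = np.maximum(counts, 1)
    ucb = np.where(counts > 0,
                   sums / safe + np.sqrt(1.5 * np.log(t) / safe),
                   np.inf)
    A = max_weight_spanning_tree(ucb)        # play the optimistic basis
    rewards = rng.normal(mu[A], 1.0)         # semi-bandit feedback on the chosen edges
    counts[A] += 1
    sums[A] += rewards
    regret += opt - mu[A].sum()

print(f"CUCB cumulative (pseudo-)regret after {T} rounds: {regret:.1f}")
```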