Bandits with Replenishable Knapsacks: the Best of both Worlds
Authors: Martino Bernasconi, Matteo Castiglioni, Andrea Celli, Federico Fusco
ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We propose a best-of-both-worlds primal-dual template that can handle any online learning problem with replenishment for which a suitable primal regret minimizer exists. In particular, we provide the first positive results for the case of adversarial inputs by showing that our framework guarantees a constant competitive ratio α when B = Ω(T) or when the possible per-round replenishment is a positive constant. Moreover, under a stochastic input model, our algorithm yields an instance-independent O(T 1/2) regret bound which complements existing instance-dependent bounds for the same setting. Finally, we provide applications of our framework to some economic problems of practical relevance. |
| Researcher Affiliation | Academia | Martino Bernasconi Bocconi University martino.bernasconi@unibocconi.it Matteo Castiglioni Politecnico di Milano matteo.castiglioni@polimi.it Andrea Celli Bocconi University andrea.celli2@unibocconi.it Federico Fusco Sapienza University federico.fusco@uniroma1.it |
| Pseudocode | Yes | Algorithm 1 Primal-Dual template |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | The paper is theoretical and does not report on experiments using specific datasets for training or evaluation. It discusses theoretical models like the "stochastic setting" with i.i.d. samples, which are abstract problem formulations, not actual datasets. |
| Dataset Splits | No | The paper does not report empirical experiments or dataset usage, therefore, no training/validation/test splits are specified. |
| Hardware Specification | No | The paper is theoretical and does not report on empirical experiments, therefore, no hardware specifications are mentioned. |
| Software Dependencies | No | The paper discusses algorithms and theoretical frameworks (e.g., EXP3-SIX, Online Gradient Descent) but does not specify any software libraries, tools, or their version numbers used for implementation. |
| Experiment Setup | No | The paper is theoretical and does not report on empirical experiments, therefore, no experimental setup details like hyperparameters or training configurations are provided. |