Bandits with Knapsacks: Advice on Time-Varying Demands

Authors: Lixing Lyu, Wang Chi Cheung

ICML 2023

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our theoretical results are corroborated by our numerical findings. We perform numerical experiments when $\{q_t\}_{t=1}^{T}$ is governed by a time series model. The experiment highlights the benefit of predictions. We show that an online algorithm, such as OA-UCB, that harnesses predictions judiciously can perform empirically better than existing baselines, which only have access to the bandit feedback from the latent environment. |
| Researcher Affiliation | Academia | ¹Institute of Operations Research and Analytics, National University of Singapore, Singapore; ²Department of Industrial Systems Engineering and Management, National University of Singapore, Singapore. |
| Pseudocode | Yes | Algorithm 1 Online-advice-UCB (OA-UCB); Algorithm 2 Estimation Generation Policy |
| Open Source Code | No | The paper does not contain any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | Demand sequence $\{q_t\}_{t=1}^{T}$: We apply an AR(1) model to generate $\{q_t\}$: $q_t = \alpha + \beta q_{t-1} + \varepsilon_t$, where $\varepsilon_1, \ldots, \varepsilon_T \sim N(0, \sigma^2)$ are independent. We simulate our algorithm and the benchmarks on a family of instances, with K = 10, d = 3, b = 3, α = 2, β = 0.5, σ = 0.5, and T varies from 5000 to 15000. |
| Dataset Splits | No | The paper describes generating synthetic data and simulating experiments over a horizon T, but it does not specify any training, validation, or test dataset splits in the conventional sense (e.g., percentages or counts of a fixed dataset). |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as CPU/GPU models or memory specifications. |
| Software Dependencies | No | The paper mentions "time series prediction tools in Python, MatLab or R" but does not specify any software libraries or packages with version numbers needed to replicate the experiments. |
| Experiment Setup | Yes | In the experiment, we simulate our algorithm and the benchmarks on a family of instances, with K = 10, d = 3, b = 3, α = 2, β = 0.5, σ = 0.5, and T varies from 5000 to 15000. |
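
The AR(1) demand model quoted in the Open Datasets and Experiment Setup entries can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' code (the paper releases none): the function name `simulate_ar1_demand`, the `seed` argument, and the initial value `q0` are hypothetical. In particular, the summary does not say how the first demand is initialised, so the sketch starts from the stationary mean α / (1 − β).

```python
import random


def simulate_ar1_demand(T, alpha=2.0, beta=0.5, sigma=0.5, q0=None, seed=0):
    """Generate q_1, ..., q_T with q_t = alpha + beta * q_{t-1} + eps_t.

    eps_t are independent N(0, sigma^2) draws. Defaults follow the quoted
    experiment setup (alpha = 2, beta = 0.5, sigma = 0.5).
    """
    rng = random.Random(seed)
    if q0 is None:
        # Assumption: start at the stationary mean alpha / (1 - beta);
        # the source does not specify the initial condition.
        q0 = alpha / (1.0 - beta)
    demands = []
    prev = q0
    for _ in range(T):
        eps = rng.gauss(0.0, sigma)        # independent Gaussian noise
        current = alpha + beta * prev + eps
        demands.append(current)
        prev = current
    return demands


if __name__ == "__main__":
    # Horizons in the quoted setup range from 5000 to 15000.
    q = simulate_ar1_demand(T=5000)
    print(len(q), sum(q) / len(q))
```

With β = 0.5 the process is stationary, so the sample mean over a long horizon should sit close to α / (1 − β) = 4. Nothing in the sketch prevents a draw from being negative; whether the original experiments truncate demands is not stated in the summary.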