On the Interplay Between Misspecification and Sub-optimality Gap in Linear Contextual Bandits
Authors: Weitong Zhang, Jiafan He, Zhiyuan Fan, Quanquan Gu
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on both synthetic and real-world datasets corroborate our theoretical results. |
| Researcher Affiliation | Academia | (1) Department of Computer Science, University of California, Los Angeles, California, USA; (2) IIIS, Tsinghua University, Beijing, China. Correspondence to: Quanquan Gu <qgu@cs.ucla.edu>. |
| Pseudocode | Yes | Algorithm 1: Data Selection OFUL (DS-OFUL); Algorithm 2: SupLinUCB |
| Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | To demonstrate that the proposed algorithm can be easily applied to modern machine learning tasks, we carried out experiments on the Asirra dataset (Elson et al., 2007). |
| Dataset Splits | No | The paper discusses the total number of rounds (K) for experiments but does not explicitly provide details on train/validation/test dataset splits, percentages, or methodology for reproducibility. |
| Hardware Specification | Yes | The experiment on synthetic dataset is conducted on Google Colab with a 2-core Intel Xeon CPU @ 2.20GHz. The experiment on the real-world Asirra dataset (Elson et al., 2007) is conducted on an AWS p2.xlarge instance. |
| Software Dependencies | No | The paper mentions models like 'ResNet-18' but does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, or specific library versions). |
| Experiment Setup | Yes | We do a grid search for β = {1, 3, 10}, λ = {1, 3, 10} and report the cumulative regret of Algorithm 1 with different parameter Γ = {0, 0.02, 0.05, 0.08, 0.18} over 8 independent trials with total rounds K = 10000. For hyper-parameter tuning, we select β = {0.1, 0.3, 1} and λ = {1, 3, 10} by doing a grid search and repeat the experiments for 8 times over 1M rounds for each parameter configuration. |
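
To make the size of the first configuration in the Experiment Setup row concrete, the sketch below enumerates every run implied by the quoted values: each Γ is paired with every (β, λ) in the grid and repeated over 8 trials of K = 10000 rounds. It only lists configurations and does not implement DS-OFUL. The paper releases no code, so the names `enumerate_runs`, `BETAS`, `LAMBDAS` and `GAMMAS` are hypothetical, and treating each trial as an independent run indexed 0 to 7 is an assumption.

```python
from itertools import product

# Values quoted in the Experiment Setup row (first configuration).
BETAS = [1, 3, 10]                        # grid for beta
LAMBDAS = [1, 3, 10]                      # grid for lambda
GAMMAS = [0, 0.02, 0.05, 0.08, 0.18]      # values of the parameter Gamma
NUM_TRIALS = 8                            # independent trials per configuration
TOTAL_ROUNDS = 10_000                     # K


def enumerate_runs():
    """Yield one dict per (Gamma, beta, lambda, trial) combination.

    Hypothetical helper: it describes the runs only; the bandit
    algorithm itself is not implemented here.
    """
    for gamma, beta, lam, trial in product(GAMMAS, BETAS, LAMBDAS, range(NUM_TRIALS)):
        yield {
            "gamma": gamma,
            "beta": beta,
            "lam": lam,
            "trial": trial,
            "rounds": TOTAL_ROUNDS,
        }


runs = list(enumerate_runs())
print(len(runs))  # 5 Gamma values * 3 betas * 3 lambdas * 8 trials = 360
```

Under these assumptions the first setup amounts to 360 runs of 10000 rounds each. The second setup quoted in the same row (β in {0.1, 0.3, 1}, 1M rounds) could be enumerated the same way by swapping the constants.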