Beyond task diversity: provable representation transfer for sequential multitask linear bandits

Authors: Thang Duong, Zhi Wang, Chicheng Zhang

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We also demonstrate empirically on synthetic data that our algorithm outperforms baseline algorithms, which rely on the task diversity assumption." "In this section, we compare the performance of our BOSS algorithm with the baselines on synthetic environments." |
| Researcher Affiliation | Academia | Thang Duong (University of Arizona, thangduong@arizona.edu); Zhi Wang (University of Wisconsin-Madison, zhi.wang@wisc.edu); Chicheng Zhang (University of Arizona, chichengz@cs.arizona.edu) |
| Pseudocode | Yes | Algorithm 1 (Meta-Exploration procedure), Algorithm 2 (Meta-Exploitation procedure), and Algorithm 3 (BOSS: Bandit Online Subspace Selection for Sequential Multitask Linear Bandits) |
| Open Source Code | Yes | "The code for our paper can be found at https://github.com/duongnhatthang/BOSS" |
| Open Datasets | No | The paper mentions "synthetic data" and a "simulated adversarial environment" but provides no link, DOI, or formal citation for accessing this data; no public dataset is referenced. |
| Dataset Splits | No | The paper uses synthetic data and specifies no training, validation, or test splits. It sets a global parameter `(N, τ, d, m) = (4000, 500, 10, 3)` for the simulated environment but does not describe how the simulated data are partitioned. |
| Hardware Specification | No | The paper gives no details about the hardware used for the experiments, such as CPU/GPU models, memory, or computing platform. |
| Software Dependencies | No | The code is available on GitHub, but no software dependencies with version numbers are listed (e.g., Python version, or specific libraries such as PyTorch or TensorFlow and their versions). |
| Experiment Setup | Yes | The setting is `(N, τ, d, m) = (4000, 500, 10, 3)`. The hyperparameters p, τ₁, τ₂, and α of all algorithms are tuned where applicable. Figures 2 and 3 report specific values such as τ₁ = 400, τ₂ = 50 and τ₁ = 1000, τ₂ = 300 for different experiments. |
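Since the paper reports only the environment sizes, the following is a minimal sketch of what a synthetic sequential multitask linear bandit environment with `(N, τ, d, m) = (4000, 500, 10, 3)` could look like: N tasks arrive one at a time, each is played for τ rounds, and every task parameter lies in a shared m-dimensional subspace of ℝᵈ. Everything beyond those sizes (the arm-set size, the sampling distributions, and the `sample_task`/`play_task` helpers) is a hypothetical illustration, not the generator from the authors' BOSS repository.

```python
import numpy as np

# Environment sizes reported in the paper: N sequential tasks, tau rounds
# per task, ambient dimension d, shared-subspace dimension m.
N, tau, d, m = 4000, 500, 10, 3
rng = np.random.default_rng(0)

# Shared low-rank structure: every task parameter lies in the column span
# of an orthonormal d x m matrix B (assumption: Gaussian draw + QR).
B, _ = np.linalg.qr(rng.standard_normal((d, m)))

def sample_task():
    """Draw one unit-norm task parameter theta = B @ w from the subspace."""
    w = rng.standard_normal(m)
    return B @ (w / np.linalg.norm(w))

def play_task(theta, policy, n_arms=20, noise_std=1.0):
    """Run one tau-round linear bandit task; return its cumulative regret."""
    regret = 0.0
    for _ in range(tau):
        arms = rng.standard_normal((n_arms, d))      # fresh action set
        arms /= np.linalg.norm(arms, axis=1, keepdims=True)
        means = arms @ theta                         # expected rewards
        a = policy(arms)                             # learner's choice
        reward = means[a] + noise_std * rng.standard_normal()
        # A real learner would update its state from (arms[a], reward);
        # the random baseline below ignores the feedback.
        regret += means.max() - means[a]
    return regret

# Example: per-task regret of a uniformly random learner on 3 of the N tasks.
random_policy = lambda arms: rng.integers(len(arms))
print([round(play_task(sample_task(), random_policy), 1) for _ in range(3)])
```

Under this reading of the setup, a representation-transfer method would estimate the shared subspace B across early tasks and restrict later per-task exploration to it, which is the regime in which the paper's tuned τ₁/τ₂ exploration lengths would apply.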