Stochastic Contextual Bandits with Long Horizon Rewards
Authors: Yuzhen Qin, Yingcong Li, Fabio Pasqualetti, Maryam Fazel, Samet Oymak
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We further provide numerical experiments in Fig. 2 to show that circulant measurements are indeed problematic while dealing with low-rank matrix recovery." and "We perform some experiments to compare our algorithm AD-Lasso with the following three:" |
| Researcher Affiliation | Academia | Yuzhen Qin¹, Yingcong Li¹, Fabio Pasqualetti¹, Maryam Fazel², Samet Oymak¹,³; ¹ University of California, Riverside; ² University of Washington; ³ University of Michigan |
| Pseudocode | Yes | "Algorithm 1: Doubling Lasso" and "Algorithm 2: Adaptive Doubling Lasso (AD-Lasso)" |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described in this paper, nor does it contain explicit statements about code release or specific repository links. |
| Open Datasets | No | The paper mentions context vectors drawn i.i.d. from an unknown distribution ν and uses specific parameters for experiments (e.g., h=1000, T=999, s=10, d=5), but does not provide concrete access information for a publicly available or open dataset, nor does it reference established benchmark datasets with proper attribution or links. |
| Dataset Splits | No | The paper describes a sequential decision-making problem (bandit problem) over a total time horizon T, where the agent continuously interacts with the environment. It does not provide specific dataset split information (percentages, sample counts, or predefined splits) for training, validation, or testing in the traditional sense of a static dataset. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | "In Algorithm 1, choose $L = c\,s\,d \log^2(sd) \log^2(hd)$, where $c > 0$ is a constant." and "Parameters: h = 1000, T = 999, and s = 10." and "(Universal parameters: T = 2000, h = 100, d = 5, and $\lVert w \rVert_1 = 1$.)" |
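
The choice of L quoted in the Experiment Setup row can be evaluated numerically. The sketch below is illustrative only and is not taken from the paper: it assumes the two log terms are squared natural logarithms, it rounds the result up to an integer, and the function name `doubling_lasso_epoch_length` and the default `c = 1.0` are hypothetical, since the paper only states that c > 0 is a constant.

```python
import math


def doubling_lasso_epoch_length(s: int, d: int, h: int, c: float = 1.0) -> int:
    """Evaluate L = c * s * d * log^2(s*d) * log^2(h*d), rounded up.

    Assumptions (not confirmed by the source): natural logarithms,
    squared log terms, and ceiling rounding. The constant c is
    unspecified in the paper; c = 1.0 is a placeholder.
    """
    if s < 1 or d < 1 or h < 1:
        raise ValueError("s, d, and h must be positive integers")
    if c <= 0:
        raise ValueError("c must be a positive constant")
    value = c * s * d * math.log(s * d) ** 2 * math.log(h * d) ** 2
    return math.ceil(value)


if __name__ == "__main__":
    # Parameters quoted in the table: h = 1000, s = 10, d = 5.
    print(doubling_lasso_epoch_length(s=10, d=5, h=1000))
```

Because c is left unspecified, the printed value only indicates how L scales with s, d, and h under these assumptions; it is not the value used in the paper's experiments.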