Stochastic Contextual Bandits with Long Horizon Rewards

Authors: Yuzhen Qin, Yingcong Li, Fabio Pasqualetti, Maryam Fazel, Samet Oymak

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We further provide numerical experiments in Fig. 2 to show that circulant measurements are indeed problematic while dealing with low-rank matrix recovery." and "We perform some experiments to compare our algorithm AD-Lasso with the following three:"
Researcher Affiliation | Academia | Yuzhen Qin (1), Yingcong Li (1), Fabio Pasqualetti (1), Maryam Fazel (2), Samet Oymak (1, 3); (1) University of California, Riverside; (2) University of Washington; (3) University of Michigan
Pseudocode | Yes | "Algorithm 1: Doubling Lasso" and "Algorithm 2: Adaptive Doubling Lasso (AD-Lasso)"
Open Source Code | No | The paper does not provide concrete access to source code for the methodology it describes, nor does it contain explicit statements about code release or specific repository links.
Open Datasets | No | The paper mentions context vectors drawn i.i.d. from an unknown distribution ν and uses specific parameters for its experiments (e.g., h = 1000, T = 999, s = 10, d = 5), but it does not provide access information for a publicly available or open dataset, nor does it reference established benchmark datasets with proper attribution or links.
Dataset Splits | No | The paper describes a sequential decision-making (bandit) problem over a total time horizon T, in which the agent continuously interacts with the environment. It does not provide dataset split information (percentages, sample counts, or predefined splits) for training, validation, or testing in the traditional sense of a static dataset.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) for the machines used to run its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiments.
Experiment Setup | Yes | "In Algorithm 1, choose L = csd log2(sd) log2(hd), where c > 0 is a constant." and "Parameters: h = 1000, T = 999, and s = 10." and "(Universal parameters: T = 2000, h = 100, d = 5, and ‖w‖_1 = 1.)"
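
To give a feel for the magnitude of L quoted in the Experiment Setup entry, the following is a minimal sketch that evaluates the expression for the reported parameters. Several things here are assumptions and not taken from the paper: the function name compute_L is hypothetical, the constant c is not specified in the excerpt (the default c = 1 is only a placeholder), and the extracted text "log2" is ambiguous, so the sketch supports both readings (a squared natural logarithm, or a base-2 logarithm).

```python
import math


def compute_L(s, d, h, c=1.0, squared_logs=True):
    """Evaluate L = c * s * d * log2(sd) * log2(hd) from the quoted setup.

    The extracted text "log2" is ambiguous, so both readings are supported:
      squared_logs=True  -> log2(x) is read as (ln x)^2   (assumption)
      squared_logs=False -> log2(x) is read as log base 2 (assumption)
    The constant c is unspecified in the source; c=1.0 is a placeholder.
    """
    if s < 1 or d < 1 or h < 1 or c <= 0:
        raise ValueError("s, d, h must be >= 1 and c must be > 0")
    if squared_logs:
        log_sd = math.log(s * d) ** 2
        log_hd = math.log(h * d) ** 2
    else:
        log_sd = math.log2(s * d)
        log_hd = math.log2(h * d)
    # Round up so the result can be used as an integer count.
    return math.ceil(c * s * d * log_sd * log_hd)


if __name__ == "__main__":
    # Parameters quoted in the table: h = 1000, s = 10, d = 5.
    for squared in (True, False):
        value = compute_L(s=10, d=5, h=1000, squared_logs=squared)
        print(f"squared_logs={squared}: L = {value}")
```

Because c is left unspecified in the quoted text, the printed values indicate only how L scales with s, d, and h under each reading, not the value the authors used.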