Doubly-Robust Lasso Bandit
Authors: Gi-Soo Kim, Myunghee Cho Paik
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct simulation studies to evaluate the proposed DR Lasso Bandit and the Lasso Bandit (Bastani and Bayati, 2015). We set N = 10, 20, 50, or 100, d = 100, and s0 = 5. For fixed j = 1, ..., d, we generate [b_{1j}(t), ..., b_{Nj}(t)]^T from N(0_N, V), where V(i, i) = 1 for every i and V(i, k) = ρ² for every i ≠ k. We experiment with two cases for ρ²: ρ² = 0.3 (weak correlation) or ρ² = 0.7 (strong correlation). We generate η_i(t) i.i.d. N(0, 0.05²) and the rewards from (1). We set ‖β‖₀ = s0 and generate the s0 non-zero elements from a uniform distribution on [0, 1]. We conduct 10 replications for each case. The Lasso Bandit algorithm can be applied in our setting by considering an Nd-dimensional context vector b(t) = [b_1(t)^T, ..., b_N(t)^T]^T and an Nd-dimensional regression parameter β_i for each arm i, where β_i = [I(i = 1)β^T, ..., I(i = N)β^T]^T. For each algorithm, we consider several candidates for the tuning parameters and report the best results. For DR Lasso Bandit, we advise truncating the value r̂(t) so that it does not explode. Figure 1 shows the cumulative regret R(t) over time t. |
| Researcher Affiliation | Academia | Gi-Soo Kim, Department of Statistics, Seoul National University (gisoo1989@snu.ac.kr); Myunghee Cho Paik, Department of Statistics, Seoul National University (myungheechopaik@snu.ac.kr) |
| Pseudocode | Yes | Algorithm 1 DR Lasso Bandit |
| Open Source Code | No | The paper does not provide concrete access to source code for the described methodology. No links to repositories or statements of code release are found. |
| Open Datasets | Yes | Yahoo! Webscope. Yahoo! Front Page Today Module User Click Log Dataset, version 1.0. http://webscope.sandbox.yahoo.com. Accessed: 09/01/2019. |
| Dataset Splits | No | The paper describes simulation studies and uses a dataset, but it does not specify explicit train/validation/test splits or cross-validation setup for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library or solver names with versions) needed to replicate the experiment. |
| Experiment Setup | Yes | Algorithm 1 DR Lasso Bandit. Input parameters: λ1, λ2, zT [...] For each algorithm, we consider several candidates for the tuning parameters and report the best results. For DR Lasso Bandit, we advise truncating the value r̂(t) so that it does not explode. [...] We set N = 10, 20, 50, or 100, d = 100, and s0 = 5. For fixed j = 1, ..., d, we generate [b_{1j}(t), ..., b_{Nj}(t)]^T from N(0_N, V), where V(i, i) = 1 for every i and V(i, k) = ρ² for every i ≠ k. We experiment with two cases for ρ²: ρ² = 0.3 (weak correlation) or ρ² = 0.7 (strong correlation). We generate η_i(t) i.i.d. N(0, 0.05²) and the rewards from (1). We set ‖β‖₀ = s0 and generate the s0 non-zero elements from a uniform distribution on [0, 1]. |
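The simulation environment quoted above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: it assumes the linear reward form r_i(t) = b_i(t)^T β + η_i(t) for the paper's model (1), and all names (`draw_round`, `B`, `rho2`) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

N, d, s0, rho2 = 10, 100, 5, 0.3  # weak-correlation case from the paper

# Sparse regression parameter: s0 non-zero entries drawn from Uniform[0, 1].
beta = np.zeros(d)
beta[rng.choice(d, size=s0, replace=False)] = rng.uniform(0.0, 1.0, size=s0)

# Compound-symmetry covariance: V(i, i) = 1, V(i, k) = rho2 for i != k.
V = np.full((N, N), rho2)
np.fill_diagonal(V, 1.0)

def draw_round(rng):
    """Draw one round's arm contexts and rewards."""
    # For each coordinate j, the vector [b_1j(t), ..., b_Nj(t)]^T ~ N(0_N, V),
    # so the context matrix B (one row per arm) is correlated across arms.
    B = rng.multivariate_normal(np.zeros(N), V, size=d).T  # shape (N, d)
    eta = rng.normal(0.0, 0.05, size=N)                    # noise eta_i(t)
    rewards = B @ beta + eta                               # assumed linear model (1)
    return B, rewards

B, rewards = draw_round(rng)
```

Repeating `draw_round` for t = 1, ..., T and the four values of N (10, 20, 50, 100), with ρ² switched to 0.7 for the strong-correlation case, reproduces the quoted simulation design.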