Multi-task Representation Learning for Pure Exploration in Linear Bandits
Authors: Yihan Du, Longbo Huang, Wen Sun
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 6. Experiments In this section, we present experiments to evaluate the empirical performance of our algorithms. In our experiments, we set δ = 0.005, d = 5, k = 2 and M ∈ [50, 230], where k divides M. In RepBAI-LB, X is the canonical basis of ℝ^d. In RepBPI-CLB, we set ε = 0.1, |S| = 5 and |A| = 5. D is the uniform distribution on S. For any s ∈ S, {φ(s, a)}_{a∈A} is the canonical basis of ℝ^d. In both problems, B = [I_k; 0], where I_k denotes the k × k identity matrix. w_1, …, w_M are divided into k groups, with M/k identical members in each group. The members of the i-th group (i ∈ [k]), i.e., w_{(M/k)(i−1)+1}, …, w_{(M/k)i}, have 1 in the i-th coordinate and 0 in all other coordinates. For any m ∈ [M], θ_m = B w_m. We vary M and perform 50 independent runs to report the average sample complexity across runs. |
| Researcher Affiliation | Academia | Yihan Du¹, Longbo Huang¹, Wen Sun² — ¹IIIS, Tsinghua University; ²Cornell University. |
| Pseudocode | Yes | Algorithm 1: DouExpDes (Double Experimental Design); Algorithm 2: FeatRecover(T, {x_i}_{i∈[p]}); Algorithm 3: EliLowRep(t, X, {X̂_m}_{m∈[M]}, δ, ROUND, ζ, B̂); Algorithm 4: C-DouExpDes (Contextual Double Experimental Design); Algorithm 5: C-FeatRecover(T, {a_i}_{i∈[p]}); Algorithm 6: EstLowRep(N, γ, B̂) |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code. |
| Open Datasets | No | In our experiments, we set δ = 0.005, d = 5, k = 2 and M ∈ [50, 230], where k divides M. In RepBAI-LB, X is the canonical basis of ℝ^d. In RepBPI-CLB, we set ε = 0.1, |S| = 5 and |A| = 5. D is the uniform distribution on S. For any s ∈ S, {φ(s, a)}_{a∈A} is the canonical basis of ℝ^d. The data used for experiments is synthetically generated according to these specifications and is not a publicly available dataset with a link or citation. |
| Dataset Splits | No | The paper describes the synthetic generation of data and parameters for experiments, but it does not specify explicit training, validation, and test dataset splits in the traditional sense, as it focuses on sample complexity in bandit settings. |
| Hardware Specification | No | The paper does not specify any hardware details like GPU models, CPU types, or memory used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch versions, or specific library versions). |
| Experiment Setup | Yes | In our experiments, we set δ = 0.005, d = 5, k = 2 and M ∈ [50, 230], where k divides M. In RepBAI-LB, X is the canonical basis of ℝ^d. In RepBPI-CLB, we set ε = 0.1, |S| = 5 and |A| = 5. D is the uniform distribution on S. For any s ∈ S, {φ(s, a)}_{a∈A} is the canonical basis of ℝ^d. In both problems, B = [I_k; 0], where I_k denotes the k × k identity matrix. w_1, …, w_M are divided into k groups, with M/k identical members in each group. The members of the i-th group (i ∈ [k]), i.e., w_{(M/k)(i−1)+1}, …, w_{(M/k)i}, have 1 in the i-th coordinate and 0 in all other coordinates. For any m ∈ [M], θ_m = B w_m. We vary M and perform 50 independent runs to report the average sample complexity across runs. |
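Since the paper releases no code, the synthetic instance described in the experiment setup can be sketched as follows. This is our own minimal reconstruction (the function name `make_instance` is hypothetical): B = [I_k; 0], the w_m split into k groups of M/k identical one-hot vectors, and θ_m = B w_m.

```python
import numpy as np

def make_instance(d=5, k=2, M=50):
    """Build B = [I_k; 0] in R^{d x k} and per-task parameters theta_m = B w_m."""
    assert M % k == 0, "k must divide M"
    # Feature matrix B: top k x k block is the identity, remaining rows zero.
    B = np.zeros((d, k))
    B[:k, :k] = np.eye(k)
    # w_1, ..., w_M in k groups of M/k identical members: members of the
    # i-th group have 1 in the i-th coordinate and 0 elsewhere.
    W = np.zeros((M, k))
    group_size = M // k
    for i in range(k):
        W[i * group_size:(i + 1) * group_size, i] = 1.0
    # Per-task parameters theta_m = B w_m, stacked as rows of Theta.
    Theta = W @ B.T  # shape (M, d)
    return B, W, Theta

B, W, Theta = make_instance()
```

With d = 5, k = 2, M = 50 this yields 25 tasks with θ_m = e_1 and 25 with θ_m = e_2, so all θ_m lie in the 2-dimensional subspace spanned by the columns of B, matching the low-rank structure the algorithms exploit.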