Low-Rank Bandits via Tight Two-to-Infinity Singular Subspace Recovery

Authors: Yassir Jedra, William Réveillard, Stefan Stojanovic, Alexandre Proutiere

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | J. Numerical experiments. We perform numerical experiments on synthetic data with a uniform context distribution.
Researcher Affiliation | Academia | ¹Laboratory for Information and Decision Systems, MIT, Cambridge, MA, USA; ²Division of Decision and Control Systems, KTH, Stockholm, Sweden.
Pseudocode | Yes | Algorithm 1: Recover Subspace for Best Policy Identification (RS-BPI)
Open Source Code | Yes | The code used in the experiments can be accessed at https://github.com/wilrev/Low-Rank-Bandits-Two-To-Infinity.
Open Datasets | No | We perform numerical experiments on synthetic data with a uniform context distribution. Unless specified otherwise, the behavior policy is uniform, the target policy is chosen as the best policy: $\pi(i) = \arg\max_j M_{i,j}$ (ties are broken arbitrarily), and we generate noisy entries $M_{i_t,j_t} + \xi_t$ where $\xi_t \sim \mathcal{N}(0,1)$ is standard Gaussian, and where $M = PDQ$ for two invertible matrices $P \in \mathbb{R}^{m \times m}$, $Q \in \mathbb{R}^{n \times n}$, and $D \in \mathbb{R}^{m \times n}$ defined by $D_{i,j} = \mathbf{1}_{\{i=j\}}\mathbf{1}_{\{i \le r\}}$ (note that $M$ is consequently of rank $r$). $P$ and $Q$ are initially generated at random with uniform entries in $[0,1]$, and their diagonal elements are replaced by the sum of the corresponding row to ensure invertibility. (A code sketch of this generator appears after the table.)
Dataset Splits | No | The paper uses synthetic data generated internally and describes experimental repetitions and confidence intervals, but it does not specify explicit train/validation/test dataset splits or reference any standard split methodologies for public datasets.
Hardware Specification | No | The paper mentions running numerical experiments but does not provide specific details about the hardware used, such as GPU/CPU models or memory specifications.
Software Dependencies | No | The paper does not explicitly list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or solver versions).
Experiment Setup | Yes | For a regularization parameter of $\tau = 10^{-4}$, we compare the performance of RS-PE for $\alpha \in \{1/5, 1/2, 4/5\}$, where $\alpha$ is the proportion of samples used in the first phase of the algorithm. (A sketch of this phase split also appears after the table.)
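
The synthetic-data description quoted in the Open Datasets row is specific enough to sketch directly. Below is a minimal NumPy sketch of that generator, assuming the paper's 1-indexed condition $i \le r$ maps to the first $r$ diagonal entries of $D$; the function name, seed handling, and dimensions are illustrative choices, not taken from the authors' repository.

```python
import numpy as np

def make_rank_r_matrix(m, n, r, seed=None):
    """Build M = P D Q as in the quoted setup: P (m x m) and Q (n x n)
    have i.i.d. Uniform[0, 1] entries, each diagonal entry is then
    replaced by the sum of its row, and D is the m x n matrix with ones
    on its first r diagonal entries, so that M has rank r."""
    rng = np.random.default_rng(seed)
    P = rng.uniform(size=(m, m))
    Q = rng.uniform(size=(n, n))
    # Replace each diagonal entry by its row sum to ensure invertibility.
    P[np.diag_indices(m)] = P.sum(axis=1)
    Q[np.diag_indices(n)] = Q.sum(axis=1)
    D = np.zeros((m, n))
    D[np.arange(r), np.arange(r)] = 1.0  # identity truncated at rank r
    return P @ D @ Q

rng = np.random.default_rng(0)
M = make_rank_r_matrix(m=50, n=50, r=5, seed=0)
best_policy = M.argmax(axis=1)             # pi(i) = argmax_j M[i, j]
i_t, j_t = 3, 7                            # an arbitrary (context, arm) pair
y_t = M[i_t, j_t] + rng.standard_normal()  # noisy entry with N(0, 1) noise
```

Replacing each diagonal entry by its row sum of nonnegative values makes $P$ and $Q$ strictly diagonally dominant, which is what guarantees the invertibility the quoted text asks for.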
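
The Experiment Setup row fixes the regularization parameter at $\tau = 10^{-4}$ and varies only $\alpha$, the fraction of the sample budget spent in RS-PE's first phase. The sketch below illustrates that budget split; `split_budget` and the total budget of 10,000 samples are hypothetical illustrations, not the authors' API.

```python
def split_budget(T, alpha):
    """Allocate a fraction alpha of a total budget of T samples to the
    first phase of a two-phase algorithm such as RS-PE, and the
    remainder to the second phase."""
    n1 = int(alpha * T)
    return n1, T - n1

tau = 1e-4  # regularization parameter from the quoted setup
for alpha in (1/5, 1/2, 4/5):
    n1, n2 = split_budget(T=10_000, alpha=alpha)
    print(f"alpha={alpha}: phase 1 -> {n1} samples, phase 2 -> {n2} samples")
```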