Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Low-Rank Bandits via Tight Two-to-Infinity Singular Subspace Recovery
Authors: Yassir Jedra, William Réveillard, Stefan Stojanovic, Alexandre Proutiere
ICML 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | J. Numerical experiments. We perform numerical experiments on synthetic data with a uniform context distribution. |
| Researcher Affiliation | Academia | ¹Laboratory for Information and Decision Systems, MIT, Cambridge, MA, USA; ²Division of Decision and Control Systems, KTH, Stockholm, Sweden. |
| Pseudocode | Yes | Algorithm 1 RECOVER SUBSPACE FOR BEST POLICY IDENTIFICATION (RS-BPI) |
| Open Source Code | Yes | The code used in the experiments can be accessed at https://github.com/wilrev/Low Rank Bandits Two To Infinity. |
| Open Datasets | No | We perform numerical experiments on synthetic data with a uniform context distribution. Unless specified otherwise, the behavior policy is uniform, the target policy is chosen as the best policy: $\pi(i) = \arg\max_j M_{i,j}$ (ties are broken arbitrarily), and we generate noisy entries $M_{i_t,j_t} + \xi_t$ where $\xi_t \sim \mathcal{N}(0, 1)$ is standard Gaussian, and where $M = PDQ$ for two invertible matrices $P \in \mathbb{R}^{m \times m}$, $Q \in \mathbb{R}^{n \times n}$, and $D \in \mathbb{R}^{m \times n}$ defined by $D_{i,j} = \mathbb{1}_{i=j}\mathbb{1}_{i \le r}$ (note that $M$ is consequently of rank $r$). $P$ and $Q$ are initially generated at random with uniform entries in $[0, 1]$ and their diagonal elements are replaced by the sum of the corresponding row to ensure invertibility. |
| Dataset Splits | No | The paper uses synthetic data generated internally and describes experimental repetitions and confidence intervals, but it does not specify explicit train/validation/test dataset splits or reference any standard split methodologies for public datasets. |
| Hardware Specification | No | The paper mentions running numerical experiments but does not provide specific details about the hardware used, such as GPU/CPU models or memory specifications. |
| Software Dependencies | No | The paper does not explicitly list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or solver versions). |
| Experiment Setup | Yes | For a regularization parameter of $\tau = 10^{-4}$, we compare the performance of RS-PE for $\alpha \in \{1/5, 1/2, 4/5\}$, where $\alpha$ is the proportion of samples used in the first phase of the algorithm. |
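
The synthetic data generation quoted in the Open Datasets row can be sketched as follows. This is an illustrative reconstruction from that description only, not the authors' released code: the function names, the seed handling, and the choice to compute each row sum before overwriting the diagonal are all assumptions. It uses only the Python standard library to stay dependency-free; an array library would be the more natural choice in practice.

```python
import random


def random_invertible(size, rng):
    """Square matrix with uniform [0, 1] entries whose diagonal entries are
    replaced by the sum of their row. Assumption: the row sum is taken
    before the diagonal entry is overwritten."""
    a = [[rng.random() for _ in range(size)] for _ in range(size)]
    for i in range(size):
        a[i][i] = sum(a[i])
    return a


def make_reward_matrix(m, n, r, seed=0):
    """Build M = P D Q, where D is m x n with ones on its first r diagonal
    entries and zeros elsewhere."""
    if not 1 <= r <= min(m, n):
        raise ValueError("r must satisfy 1 <= r <= min(m, n)")
    rng = random.Random(seed)
    p = random_invertible(m, rng)
    q = random_invertible(n, rng)
    # Multiplying by D keeps only the first r columns of P and the
    # first r rows of Q, so the inner sum runs over k < r.
    return [
        [sum(p[i][k] * q[k][j] for k in range(r)) for j in range(n)]
        for i in range(m)
    ]


def best_policy(matrix):
    """Target policy pi(i) = argmax_j M[i][j]; ties go to the smallest j."""
    return [max(range(len(row)), key=row.__getitem__) for row in matrix]


def sample_entry(matrix, rng):
    """One noisy observation: uniform context i, uniform behavior policy
    over arms j, and additive standard Gaussian noise."""
    i = rng.randrange(len(matrix))
    j = rng.randrange(len(matrix[0]))
    return i, j, matrix[i][j] + rng.gauss(0.0, 1.0)
```

Calling `make_reward_matrix(m, n, r)` and then `sample_entry` repeatedly yields a stream of (context, arm, noisy reward) triples; the seed argument is there only to make repeated runs comparable.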