Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Statistically and Computationally Efficient Linear Meta-representation Learning
Authors: Kiran K. Thekumparampil, Prateek Jain, Praneeth Netrapalli, Sewoong Oh
NeurIPS 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we empirically compare the performance of AltMinGD (Algorithm 1) and its exact minimization variant AltMin (Algorithm 3 in Appendix), two different versions of Method-of-Moments (MoM, MoM2), and simultaneous gradient descent on (U, V) using the Burer Monteiro factorized loss (4) (BM-GD [48]). |
| Researcher Affiliation | Collaboration | Kiran Koshy Thekumparampil, Prateek Jain, Praneeth Netrapalli, Sewoong Oh; University of Illinois at Urbana-Champaign, Google Research India, Microsoft Research India, University of Washington, Seattle |
| Pseudocode | Yes | Algorithm 1 AltMinGD: Meta-learning linear regression parameters via alternating minimization gradient descent |
| Open Source Code | Yes | 3. If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] |
| Open Datasets | No | The paper defines a synthetic data generation process based on linear regression tasks with Gaussian noise (Section 2, Assumptions 1 and 2), rather than using a pre-existing public dataset with concrete access information. |
| Dataset Splits | No | The paper describes the synthetic data generation process and the total number of samples per task (m) but does not provide explicit training/validation/test splits, such as percentages or sample counts. |
| Hardware Specification | No | The authors explicitly state: 'Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [No] The amount of compute required to run our simulations is trivial' |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions) used in the experiments. |
| Experiment Setup | No | The paper states that 'more experiments details and plots are provided in Appendix H', but these specific details (e.g., concrete hyperparameter values for K and η used in the experiments) are not present in the main body of the paper provided. |