Statistically and Computationally Efficient Linear Meta-representation Learning

Authors: Kiran K. Thekumparampil, Prateek Jain, Praneeth Netrapalli, Sewoong Oh

NeurIPS 2021

Reproducibility assessment: each entry below lists the variable, the assessed result, and the supporting LLM response.
Research Type: Experimental. Evidence from the paper: 'In this section we empirically compare the performance of AltMinGD (Algorithm 1) and its exact-minimization variant AltMin (Algorithm 3 in the appendix), two different versions of Method-of-Moments (MoM, MoM2), and simultaneous gradient descent on (U, V) using the Burer-Monteiro factorized loss (4) (BM-GD [48]).' A sketch of the BM-GD baseline is given below.
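For reference, here is a minimal sketch (not the authors' code) of the BM-GD baseline named above: simultaneous gradient descent on the factors (U, V), modeling task t's regression vector as U v_t. The squared-error loss, step size eta, iteration count, and initialization scale are all assumptions based on the description, not the paper's exact loss (4).

```python
import numpy as np

def bm_gd(Xs, ys, k, eta=0.01, n_iters=500, seed=0):
    """Hypothetical sketch of the BM-GD baseline: simultaneous gradient
    descent on (U, V), modeling task t's regression vector as U @ V[:, t]
    (a Burer-Monteiro-style factorization of the rank-k parameter matrix)."""
    rng = np.random.default_rng(seed)
    d, T = Xs[0].shape[1], len(Xs)
    U = rng.standard_normal((d, k)) / np.sqrt(d)   # assumed init scale
    V = rng.standard_normal((k, T)) / np.sqrt(k)
    for _ in range(n_iters):
        gU, gV = np.zeros_like(U), np.zeros_like(V)
        for t, (X, y) in enumerate(zip(Xs, ys)):
            r = X @ (U @ V[:, t]) - y            # residual for task t
            gU += np.outer(X.T @ r, V[:, t])     # grad of 0.5*||r||^2 w.r.t. U
            gV[:, t] = U.T @ (X.T @ r)           # grad w.r.t. v_t
        U -= eta * gU / T
        V -= eta * gV / T
    return U, V
```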
Researcher Affiliation: Collaboration. Kiran Koshy Thekumparampil, Prateek Jain, Praneeth Netrapalli, and Sewoong Oh; affiliations: University of Illinois at Urbana-Champaign, Google Research India, Microsoft Research India, and University of Washington, Seattle.
Pseudocode: Yes. Algorithm 1 (AltMinGD): Meta-learning linear regression parameters via alternating minimization gradient descent (see the sketch below).
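A minimal sketch of the alternating loop that Algorithm 1's title describes, assuming the paper's linear model y_t ≈ X_t U* w_t*: exact least squares for each task's low-dimensional weights w_t given the shared representation U, followed by a gradient step on U. The QR re-orthonormalization, step size eta, and iteration count are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np

def alt_min_gd(Xs, ys, k, eta=0.01, n_iters=100, seed=0):
    """Hypothetical sketch of an AltMinGD-style loop: exact least squares
    for each task's low-dimensional weights w_t given the shared d x k
    representation U, then one gradient step on U."""
    rng = np.random.default_rng(seed)
    d = Xs[0].shape[1]
    U, _ = np.linalg.qr(rng.standard_normal((d, k)))   # random orthonormal init
    for _ in range(n_iters):
        # (1) exact minimization over each task's weights w_t
        ws = [np.linalg.lstsq(X @ U, y, rcond=None)[0] for X, y in zip(Xs, ys)]
        # (2) one gradient step on U for the averaged squared loss
        grad = sum(X.T @ ((X @ U @ w - y)[:, None]) @ w[None, :]
                   for X, y, w in zip(Xs, ys, ws)) / len(Xs)
        U -= eta * grad
        U, _ = np.linalg.qr(U)   # re-orthonormalize (assumed projection step)
    return U
```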
Open Source Code: Yes. From the reproducibility checklist: '3. If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes]'
Open Datasets: No. The paper defines a synthetic data-generation process based on linear regression tasks with Gaussian noise (Section 2, Assumptions 1 and 2), rather than using a pre-existing public dataset with concrete access information; a sketch of such a generator is given below.
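Because the data is synthetic, the generation process can be sketched directly from the description above (linear regression tasks with Gaussian noise). The exact covariate and task-weight distributions and the noise level sigma below are illustrative stand-ins for Assumptions 1 and 2, not the paper's precise specification.

```python
import numpy as np

def make_tasks(T, m, d, k, sigma=0.1, seed=0):
    """Hypothetical generator for T linear-regression tasks whose parameters
    lie in a shared rank-k subspace, with additive Gaussian noise."""
    rng = np.random.default_rng(seed)
    U_star, _ = np.linalg.qr(rng.standard_normal((d, k)))  # shared subspace
    Xs, ys = [], []
    for _ in range(T):
        w = rng.standard_normal(k)                 # task-specific weights
        X = rng.standard_normal((m, d))            # Gaussian covariates
        y = X @ (U_star @ w) + sigma * rng.standard_normal(m)  # Gaussian noise
        Xs.append(X)
        ys.append(y)
    return Xs, ys, U_star
```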
Dataset Splits: No. The paper describes the synthetic data-generation process and the total number of samples per task (m), but does not provide explicit training/validation/test splits, such as percentages or sample counts.
Hardware Specification: No. The authors explicitly state: 'Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [No] The amount of compute required to run our simulations is trivial.'
Software Dependencies: No. The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions) used in the experiments.
Experiment Setup: No. The paper states that 'more experiments details and plots are provided in Appendix H', but those specific details (e.g., the concrete hyperparameter values of K and η used in the experiments) are not present in the main body of the paper as provided.