Statistically and Computationally Efficient Linear Meta-representation Learning
Authors: Kiran K. Thekumparampil, Prateek Jain, Praneeth Netrapalli, Sewoong Oh
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we empirically compare the performance of AltMinGD (Algorithm 1) and its exact minimization variant AltMin (Algorithm 3 in Appendix), two different versions of Method-of-Moments (MoM, MoM2), and simultaneous gradient descent on (U, V) using the Burer-Monteiro factorized loss (4) (BM-GD [48]). |
| Researcher Affiliation | Collaboration | Kiran Koshy Thekumparampil, Prateek Jain, Praneeth Netrapalli, Sewoong Oh; University of Illinois at Urbana-Champaign, Google Research India, Microsoft Research India, University of Washington, Seattle |
| Pseudocode | Yes | Algorithm 1 AltMinGD: Meta-learning linear regression parameters via alternating minimization gradient descent |
| Open Source Code | Yes | 3. If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] |
| Open Datasets | No | The paper defines a synthetic data generation process based on linear regression tasks with Gaussian noise (Section 2, Assumptions 1 and 2), rather than using a pre-existing public dataset with concrete access information. |
| Dataset Splits | No | The paper describes the synthetic data generation process and the total number of samples per task (m) but does not provide explicit training/validation/test splits, such as percentages or sample counts. |
| Hardware Specification | No | The authors explicitly state: 'Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [No] The amount of compute required to run our simulations is trivial' |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions) used in the experiments. |
| Experiment Setup | No | The paper states that 'more experiments details and plots are provided in Appendix H', but these specific details (e.g., concrete hyperparameter values for K and η used in the experiments) are not present in the provided main body of the paper. |
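To illustrate the style of method the report refers to, the sketch below implements a generic alternating-minimization scheme for linear meta-representation learning: tasks share a d×k representation `U`, each task fits low-dimensional weights `w_t` on top of `X_t @ U` by least squares, and `U` then takes one gradient step. This is a hedged, simplified sketch of the general technique, not the authors' exact Algorithm 1; the synthetic data generator, step size, and iteration counts here are illustrative choices, not values from the paper.

```python
import numpy as np

def alt_min_gd(Xs, ys, k, n_iters=300, eta=0.1, seed=0):
    """Sketch of alternating minimization with a gradient step on the
    shared representation U (illustrative, not the paper's Algorithm 1).

    Xs, ys : lists of per-task design matrices (m x d) and targets (m,)
    k      : dimension of the shared linear representation
    """
    rng = np.random.default_rng(seed)
    d = Xs[0].shape[1]
    U, _ = np.linalg.qr(rng.standard_normal((d, k)))  # orthonormal init
    for _ in range(n_iters):
        # Step 1: with U fixed, solve each task's least squares for w_t.
        Ws = [np.linalg.lstsq(X @ U, y, rcond=None)[0]
              for X, y in zip(Xs, ys)]
        # Step 2: with all w_t fixed, one gradient step on U.
        grad = np.zeros_like(U)
        n_total = 0
        for X, y, w in zip(Xs, ys, Ws):
            r = X @ (U @ w) - y                 # per-task residual
            grad += np.outer(X.T @ r, w)        # d/dU of 0.5*||r||^2
            n_total += len(y)
        U -= eta * grad / n_total
        U, _ = np.linalg.qr(U)  # retract back to orthonormal columns
    return U

# Synthetic check: tasks drawn from a shared rank-k subspace plus noise.
rng = np.random.default_rng(0)
d, k, m, T = 20, 2, 50, 30
U_star, _ = np.linalg.qr(rng.standard_normal((d, k)))
Xs, ys = [], []
for _ in range(T):
    w = rng.standard_normal(k)
    X = rng.standard_normal((m, d))
    Xs.append(X)
    ys.append(X @ (U_star @ w) + 0.01 * rng.standard_normal(m))

U_hat = alt_min_gd(Xs, ys, k)
# Subspace error: spectral norm of the difference of projectors.
err = np.linalg.norm(U_hat @ U_hat.T - U_star @ U_star.T, 2)
```

The subspace error `err` compares the recovered and true projectors directly, since `U` is only identifiable up to rotation; a small value means the shared representation was recovered up to that rotation.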