High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation
Authors: Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu, Greg Yang
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Markers represent empirical simulations and solid curves are predicted asymptotic values; red line indicates Θ(d/n) rate. |
| Researcher Affiliation | Collaboration | University of Toronto and Vector Institute; University of Tokyo and RIKEN AIP; University of California, San Diego; Microsoft Research AI |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement about open-source code availability or a link to a code repository for the methodology described. |
| Open Datasets | No | The paper specifies the distribution of its synthetic training data (e.g., "x_i i.i.d. N(0, I)"), but it does not provide concrete access information (link, DOI, or a citation with author and year) for a publicly available dataset. |
| Dataset Splits | No | The paper does not explicitly mention training/validation/test dataset splits with percentages or sample counts, nor does it refer to predefined standard splits for specific datasets. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU or CPU models, or cloud computing instance types, used for running the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., 'Python 3.8, PyTorch 1.9, and CUDA 11.1'). |
| Experiment Setup | No | The paper describes theoretical parameters like learning rate scalings (η = Θ(1), η = Θ(N)), but it does not provide specific experimental setup details such as concrete hyperparameter values (e.g., numerical learning rate, batch size, number of epochs, optimizer settings) used for the empirical simulations shown in the figures. |