Tensor Programs IV: Feature Learning in Infinite-Width Neural Networks
Authors: Greg Yang, Edward J. Hu
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present our main experiments (Omniglot and Word2Vec) in the main text, while we also empirically verified the validity of our infinite-width theory in various toy settings in Appendix J.1. Our results are summarized in Fig. 3 and Table 2. |
| Researcher Affiliation | Industry | Greg Yang (1). Affiliations: (1) Microsoft Research AI; (2) Microsoft Dynamics 365 AI; (3) Work done partly during the Microsoft AI Residency Program. Correspondence to: Greg Yang <gregyang@microsoft.com>. |
| Pseudocode | No | The paper describes algorithms and a framework but does not include structured pseudocode or algorithm blocks with specific labels. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code for the described methodology or a link to a code repository. |
| Open Datasets | Yes | We compare finite- and infinite-width models on Omniglot (Lake et al., 2015), a standard few-shot learning benchmark... We pretrain our models on two standard datasets, text8 and fil9. For a more thorough review of Word2Vec and a description of the datasets, see Appendix J.3. |
| Dataset Splits | No | The paper describes the '1-shot, 5-way classification task' for Omniglot, implying a training setup for each task, but does not provide specific train/validation/test dataset splits (percentages or counts) for the overall datasets used. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running its experiments, such as exact GPU/CPU models or memory specifications. |
| Software Dependencies | No | The paper does not provide specific software dependency details with version numbers, such as programming language versions or library versions. |
| Experiment Setup | Yes | The hyperparameters for our experiments can be found in Appendix J.2. |