Tensor Programs IV: Feature Learning in Infinite-Width Neural Networks

Authors: Greg Yang, Edward J. Hu

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We present our main experiments (Omniglot and Word2Vec) in the main text, while we also empirically verified the validity of our infinite-width theory in various toy settings in Appendix J.1. Our results are summarized in Fig. 3 and Table 2."
Researcher Affiliation | Industry | Greg Yang (Microsoft Research AI) and Edward J. Hu (Microsoft Dynamics 365 AI; work done partly during the Microsoft AI Residency Program). Correspondence to: Greg Yang <gregyang@microsoft.com>.
Pseudocode | No | The paper describes its algorithms and framework in prose but does not include structured pseudocode or labeled algorithm blocks.
Open Source Code | No | The paper does not contain an explicit statement about releasing source code for the described methodology, nor a link to a code repository.
Open Datasets | Yes | "We compare finite- and infinite-width models on Omniglot (Lake et al., 2015), a standard few-shot learning benchmark..." and "We pretrain our models on two standard datasets, text8 and fil9. For a more thorough review of Word2Vec and a description of the datasets, see Appendix J.3." (A generic Word2Vec training sketch on text8 follows the table.)
Dataset Splits | No | The paper describes the 1-shot, 5-way classification task for Omniglot, which implies a per-task (episodic) sampling setup, but it does not provide explicit train/validation/test splits (percentages or counts) for the overall datasets used. (The episode-sampling sketch after the table illustrates this protocol.)
Hardware Specification | No | The paper does not provide specific details about the hardware used to run its experiments, such as exact GPU/CPU models or memory specifications.
Software Dependencies | No | The paper does not provide specific software dependency details with version numbers, such as programming-language or library versions.
Experiment Setup | Yes | "The hyperparameters for our experiments can be found in Appendix J.2."
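
The paper's Word2Vec configuration lives in its Appendix J.3 and is not reproduced in this report. Purely as a reference point for what pretraining on text8/fil9 typically involves, a standard gensim run is sketched below; the file path and all hyperparameters are placeholders, not the authors' values.

```python
from gensim.models import Word2Vec
from gensim.models.word2vec import Text8Corpus

# text8 is the first 10^8 bytes of cleaned English Wikipedia; fil9 is the
# first 10^9 bytes. Both are downloaded separately (e.g., from mattmahoney.net/dc/).
sentences = Text8Corpus("text8")  # placeholder local path

# vector_size/window/min_count/sg are illustrative defaults, not the paper's settings.
model = Word2Vec(sentences, vector_size=200, window=5, min_count=5, sg=1, workers=4)

# Sanity check: nearest neighbors in the learned embedding space.
print(model.wv.most_similar("king", topn=5))
```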
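
The 1-shot, 5-way protocol mentioned under Dataset Splits is episodic rather than a fixed split: each task samples 5 classes, with 1 labeled support example per class plus held-out query examples. Since the paper (per this report) does not publish its data pipeline, the following is only a minimal sketch of such an episode sampler, not the authors' implementation.

```python
import random
from collections import defaultdict

def sample_episode(labels, n_way=5, k_shot=1, n_query=1, rng=random):
    """Sample one N-way, K-shot episode from a flat list of example labels.

    Returns (support, query): lists of example indices, with k_shot support
    and n_query query indices for each of the n_way sampled classes.
    """
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    # Only classes with enough examples can contribute to this episode.
    eligible = [c for c, idxs in by_class.items() if len(idxs) >= k_shot + n_query]
    classes = rng.sample(eligible, n_way)
    support, query = [], []
    for c in classes:
        picked = rng.sample(by_class[c], k_shot + n_query)
        support.extend(picked[:k_shot])
        query.extend(picked[k_shot:])
    return support, query

# Example: 20 examples over 10 classes, sampled as a 1-shot, 5-way episode.
labels = [i % 10 for i in range(20)]
support, query = sample_episode(labels)  # 5 support + 5 query indices
```

For Omniglot specifically, `labels` would be the character classes; torchvision's `datasets.Omniglot` is one common loader, though the paper's own tooling is not specified.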