Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Tight conditions for when the NTK approximation is valid

Authors: Enric Boix-Adserà, Etai Littwin

TMLR 2023

Reproducibility variables, results, and supporting LLM responses:

Research Type: Theoretical
LLM Response: We study when the neural tangent kernel (NTK) approximation is valid for training a model with the square loss. In the lazy training setting of Chizat et al. (2019), we show that rescaling the model by a factor of α = O(T) suffices for the NTK approximation to be valid until training time T. Our bound is tight and improves on the previous bound of Chizat et al. (2019), which required a larger rescaling factor of α = O(T^2). Our contribution is to refine the bound of Chizat et al. (2019) for large time scales. We prove: Theorem 1.2 (NTK approximation error bound). ... Furthermore, the converse is true. Our bound is tight up to a constant factor. Theorem 1.3 (Converse to Theorem 1.2).

Researcher Affiliation: Collaboration
LLM Response: Enric Boix-Adserà (EMAIL), MIT Electrical Engineering and Computer Science; Apple. Etai Littwin (EMAIL), Apple.

Pseudocode: No
LLM Response: The paper does not contain any clearly labeled pseudocode or algorithm blocks. The content is primarily mathematical proofs and derivations.

Open Source Code: No
LLM Response: The paper contains no explicit statement about releasing open-source code for the described methodology, nor does it link to any code repository.

Open Datasets: No
LLM Response: The paper is theoretical and does not conduct experiments on datasets. It mentions 'CIFAR10' when discussing related work by Chizat et al. (2019), but does not use or provide access information for any dataset as part of its own methodology.

Dataset Splits: No
LLM Response: The paper is theoretical and does not involve experimental evaluation on datasets; therefore, it provides no information about dataset splits.

Hardware Specification: No
LLM Response: The paper is purely theoretical and does not describe any experimental setup or the hardware used for computations, so no hardware specifications are provided.

Software Dependencies: No
LLM Response: The paper is theoretical and does not describe an experimental implementation, so no software dependencies with version numbers are given.

Experiment Setup: No
LLM Response: The paper is purely theoretical, focusing on mathematical proofs and analysis; it therefore provides no details on experimental setup, hyperparameters, or training settings.
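For context, the lazy-training rescaling referenced in the Research Type evidence above can be sketched as follows. This is a generic formulation of the Chizat et al. (2019) lazy-training setting, not a verbatim reproduction of the paper's own definitions; the model h, initialization w_0, and rescaling factor α are assumed notation.

```latex
% Model h(x; w), initialization w_0, rescaling factor \alpha > 0.
% Rescaled (centered) predictor and its NTK linearization at initialization:
\[
  f_\alpha(x; w) \;=\; \alpha\,\bigl(h(x; w) - h(x; w_0)\bigr),
  \qquad
  f^{\mathrm{lin}}_\alpha(x; w) \;=\; \alpha\,\nabla_w h(x; w_0)^{\top}(w - w_0).
\]
```

Per the quoted abstract, when f_α is trained with the square loss in this setting, taking α = O(T) suffices for f_α to track the linearized predictor f^lin_α up to training time T, improving on the earlier requirement α = O(T^2), and the bound is tight up to a constant factor.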