Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Tight conditions for when the NTK approximation is valid
Authors: Enric Boix-Adserà, Etai Littwin
TMLR 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We study when the neural tangent kernel (NTK) approximation is valid for training a model with the square loss. In the lazy training setting of Chizat et al. (2019), we show that rescaling the model by a factor of α = O(T) suffices for the NTK approximation to be valid until training time T. Our bound is tight and improves on the previous bound of Chizat et al. (2019), which required a larger rescaling factor of α = O(T²). Our contribution is to refine the bound of Chizat et al. (2019) for large time scales. We prove: Theorem 1.2 (NTK approximation error bound). ... Furthermore, the converse is true. Our bound is tight up to a constant factor. Theorem 1.3 (Converse to Theorem 1.2). |
| Researcher Affiliation | Collaboration | Enric Boix-Adserà (EMAIL): MIT Electrical Engineering and Computer Science; Apple. Etai Littwin (EMAIL): Apple |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. The content is primarily mathematical proofs and derivations. |
| Open Source Code | No | The paper does not contain any explicit statements regarding the release of open-source code for the described methodology, nor does it provide links to any code repositories. |
| Open Datasets | No | The paper is theoretical and does not conduct experiments using datasets. It mentions 'CIFAR10' in the context of discussing related work by Chizat et al. (2019), but does not use or provide access information for any dataset as part of its own methodology. |
| Dataset Splits | No | The paper is theoretical and does not involve experimental evaluation on datasets, therefore it does not provide any information regarding dataset splits. |
| Hardware Specification | No | The paper is purely theoretical and does not describe any experimental setup or the hardware used for computations. Therefore, no hardware specifications are provided. |
| Software Dependencies | No | The paper is theoretical and does not describe any experimental implementation, thus no specific software dependencies with version numbers are provided. |
| Experiment Setup | No | The paper is purely theoretical and focuses on mathematical proofs and analysis, therefore it does not provide details on experimental setup, including hyperparameters or training settings. |
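The rescaling result quoted in the Research Type row can be illustrated numerically. The sketch below is a hypothetical toy experiment, not code from the paper: it trains a small tanh network under the lazy-training rescaling of Chizat et al. (2019) alongside the network's NTK linearization at initialization, and records the largest gap between the two predictors over training. Per the quoted bound, increasing the rescaling factor α should shrink that gap. All names, sizes, and hyperparameters here (`max_ntk_deviation`, the network width, the step-size choice) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (illustrative, not from the paper): tiny one-hidden-layer tanh net,
# square loss, lazy-training rescaling alpha in the style of Chizat et al. (2019).
n, d, m = 8, 3, 64
X = rng.normal(size=(n, d)) / np.sqrt(d)
y = rng.normal(size=n)

W0 = rng.normal(size=(m, d))
a0 = rng.normal(size=m) / np.sqrt(m)

def f(W, a):
    return np.tanh(X @ W.T) @ a                       # model output, shape (n,)

def jacobian(W, a):
    H = np.tanh(X @ W.T)                              # (n, m)
    JW = ((1 - H**2) * a[None, :])[:, :, None] * X[:, None, :]  # df/dW per sample
    return np.concatenate([JW.reshape(n, -1), H], axis=1)       # (n, m*d + m)

f0 = f(W0, a0)
J0 = jacobian(W0, a0)
K0 = J0 @ J0.T                                        # empirical NTK at init
lr = 0.5 / np.linalg.eigvalsh(K0).max()               # conservative stable step

def max_ntk_deviation(alpha, steps=300):
    """Run gradient descent on the rescaled model F = alpha*(f(theta) - f(theta_0))
    with step size lr/alpha^2, alongside its NTK linearization at initialization;
    return the largest gap between the two predictors seen during training."""
    W, a = W0.copy(), a0.copy()
    r_lin = -y.copy()                                 # linearized residual F_lin - y
    dev = 0.0
    for _ in range(steps):
        F = alpha * (f(W, a) - f0)
        dev = max(dev, np.abs(F - (r_lin + y)).max())
        r = F - y                                     # residual of the full model
        H = np.tanh(X @ W.T)
        gW = ((1 - H**2) * (r[:, None] * a[None, :])).T @ X  # J^T r, W block
        ga = H.T @ r                                          # J^T r, a block
        W -= (lr / alpha) * gW                        # = (lr/alpha^2) * alpha * J^T r
        a -= (lr / alpha) * ga
        r_lin -= lr * (K0 @ r_lin)                    # exact linearized dynamics
    return dev
```

In this toy run, comparing `max_ntk_deviation(1.0)` with `max_ntk_deviation(100.0)` shows the gap to the NTK predictor shrinking markedly as α grows, consistent with the quoted picture that a larger rescaling factor keeps the NTK approximation valid for longer.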