A Fast, Well-Founded Approximation to the Empirical Neural Tangent Kernel

Authors: Mohamad Amin Mohamadi, Wonho Bae, Danica J. Sutherland

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments demonstrate the quality of this approximation for various uses across a range of settings.
Researcher Affiliation | Academia | Computer Science Department, University of British Columbia, Vancouver, Canada; Alberta Machine Intelligence Institute, Edmonton, Canada.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | Lastly, to help the community better analyze the properties of NNs and their training dynamics, and avoid wasting computation by redoing this work, we plan to share computed pNTKs for all the mentioned architectures and widths...
Open Datasets | Yes | We focus on data from CIFAR-10 (Krizhevsky, 2009).
Dataset Splits | No | The paper mentions using CIFAR-10 for training but does not provide specific details on validation splits (e.g., percentages or sample counts).
Hardware Specification | Yes | All models are trained for 200 epochs, using stochastic gradient descent (SGD), on 32GB NVIDIA V100 GPUs.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as libraries or frameworks used in the experiments.
Experiment Setup | Yes | A constant batch size of 128 was used across all different networks and different dataset sizes used for training. The learning rate for all networks was also fixed to 0.1. However, not all networks were trainable with this fixed learning rate, as the gradients would sometimes blow up and give NaN training loss, typically for the largest width of each mentioned architecture. In those cases, we decreased the learning rate to 0.01 to train the networks. ... a weight decay of 0.0001 along with a momentum of 0.9 for SGD is used. (See the training-setup sketch after this table.)
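The Experiment Setup row above amounts to a training configuration: SGD with batch size 128, learning rate 0.1 (dropped to 0.01 when the loss becomes NaN), momentum 0.9, weight decay 0.0001, and 200 epochs on CIFAR-10. The sketch below is a minimal, hypothetical rendering of that configuration; the framework (PyTorch), the normalization constants, and the ResNet-18 placeholder architecture are assumptions, not the authors' released code.

```python
# Hypothetical training-setup sketch matching the reported hyperparameters.
# PyTorch, the CIFAR-10 normalization values, and the ResNet-18 placeholder
# model are assumptions; only the hyperparameters come from the paper.
import torch
import torchvision
import torchvision.transforms as T

# CIFAR-10 with a constant batch size of 128.
transform = T.Compose([
    T.ToTensor(),
    T.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
])
train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(
    train_set, batch_size=128, shuffle=True, num_workers=4)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torchvision.models.resnet18(num_classes=10).to(device)  # placeholder

# SGD with lr 0.1 (0.01 for widths where training diverges),
# momentum 0.9, and weight decay 1e-4, for 200 epochs.
optimizer = torch.optim.SGD(
    model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)
criterion = torch.nn.CrossEntropyLoss()

for epoch in range(200):
    for inputs, targets in train_loader:
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
```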