On Infinite-Width Hypernetworks
Authors: Etai Littwin, Tomer Galanti, Lior Wolf, Greg Yang
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We verify our theory empirically and also demonstrate the utility of this hyperkernel on several functional representation tasks. Our experiments are divided into two main parts. In the first part, we validate the ideas presented in our theoretical analysis and study the effect of the width and depth of g on the optimization of a hypernetwork. In the second part, we evaluate the performance of the NNGP and NTK kernels on image representation tasks. |
| Researcher Affiliation | Collaboration | Etai Littwin, School of Computer Science, Tel Aviv University, Tel Aviv, Israel, etai.littwin@gmail.com; Tomer Galanti, School of Computer Science, Tel Aviv University, Tel Aviv, Israel, tomerga2@tauex.tau.ac.il; Lior Wolf, School of Computer Science, Tel Aviv University, Tel Aviv, Israel, wolf@cs.tau.ac.il; Greg Yang, Microsoft Research AI, gregyang@microsoft.com |
| Pseudocode | No | The paper provides mathematical derivations and equations but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code for the described methodology, nor does it include a link to a code repository. |
| Open Datasets | Yes | We experimented with the MNIST [18] and CIFAR10 [17] datasets. For each dataset we took 10000 training samples only. |
| Dataset Splits | No | The paper mentions using '10000 training samples' and a test set, but it does not give explicit percentages or sample counts for training, validation, and test splits, does not explain how the splits were made, and does not state whether a validation set was used. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, or cloud instances) used for running the experiments. |
| Software Dependencies | No | The paper mentions implementing the models and training process but does not provide specific version numbers for any software dependencies, libraries, or frameworks used (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | The hypernetwork, f, is a fully-connected ReLU neural network of depth 4 and width 200. The primary network g is a fully-connected ReLU neural network of depth {3, 6, 8}. Since the MNIST rotations dataset is simpler, we varied the width of g in {10, 50, 100}, and for the CIFAR10 variation we selected the width of g to be {100, 200, 300}. The network outputs 12 values and is trained using the cross-entropy loss. We trained the hypernetworks for 100 epochs, using the SGD method with batch size 100 and learning rate µ = 0.01. (A minimal code sketch of this setup follows the table.) |
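Below is a minimal sketch of the quoted setup, written in PyTorch purely for illustration: the paper does not name a framework, and the names `HyperNetwork`, `mlp_param_count`, `hyper_in`, and the pairing of inputs `z` and `x` are hypothetical. It only mirrors the stated configuration: f is a fully-connected ReLU network of depth 4 and width 200 that emits the flattened parameters of the primary ReLU network g, whose output has 12 values trained with cross-entropy via SGD (batch size 100, learning rate 0.01, 100 epochs).

```python
import torch
import torch.nn as nn

def mlp_param_count(sizes):
    """Total number of weights and biases in a fully-connected net with layer sizes `sizes`."""
    return sum(sizes[i] * sizes[i + 1] + sizes[i + 1] for i in range(len(sizes) - 1))

class HyperNetwork(nn.Module):
    """f: maps a hyper-input z to the parameters of the primary ReLU network g."""
    def __init__(self, hyper_in, g_sizes, hyper_width=200, hyper_depth=4):
        super().__init__()
        self.g_sizes = g_sizes
        layers, d = [], hyper_in
        for _ in range(hyper_depth - 1):
            layers += [nn.Linear(d, hyper_width), nn.ReLU()]
            d = hyper_width
        # Last layer of f emits the flattened weights and biases of g.
        layers.append(nn.Linear(d, mlp_param_count(g_sizes)))
        self.f = nn.Sequential(*layers)

    def forward(self, z, x):
        # z: (hyper_in,) hyper-input; x: (batch, g_sizes[0]) primary-network input.
        theta = self.f(z)                       # flattened parameters of g
        h, offset = x, 0
        for i in range(len(self.g_sizes) - 1):
            n_in, n_out = self.g_sizes[i], self.g_sizes[i + 1]
            W = theta[offset:offset + n_in * n_out].view(n_out, n_in)
            offset += n_in * n_out
            b = theta[offset:offset + n_out]
            offset += n_out
            h = h @ W.t() + b
            if i < len(self.g_sizes) - 2:       # ReLU on the hidden layers of g
                h = torch.relu(h)
        return h                                # 12 output logits, used with cross-entropy

# Example grid point: g of depth 3 and width 100 on flattened CIFAR10 images.
model = HyperNetwork(hyper_in=64, g_sizes=[3 * 32 * 32, 100, 100, 12])
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)   # SGD, batch size 100, 100 epochs
criterion = nn.CrossEntropyLoss()
```

The width and depth grids from the table would be swept by changing `g_sizes` (depth in {3, 6, 8}; width in {10, 50, 100} for MNIST rotations and {100, 200, 300} for the CIFAR10 variation) while keeping f fixed at depth 4 and width 200.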