Infinite attention: NNGP and NTK for deep attention networks

Authors: Jiri Hron, Yasaman Bahri, Jascha Sohl-Dickstein, Roman Novak

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate attention kernels empirically, leading to a moderate improvement upon the previous state-of-the-art on CIFAR-10 for GPs without trainable kernels and advanced data preprocessing."
Researcher Affiliation | Collaboration | "¹University of Cambridge. Work done while interning at Google Brain. ²Google Brain. Correspondence to: Jiri Hron <jh2084@cam.ac.uk>."
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "Finally, since attention is often applied to text datasets, we release code allowing applications of NNGP/NTK models to variable-length sequences, including an example on the IMDb reviews dataset. Our implementation seamlessly extends the Neural Tangents library..." The Neural Tangents library (Novak et al., 2020) is cited with URL: https://github.com/google/neural-tangents (a minimal usage sketch follows this table).
Open Datasets | Yes | "We evaluate the attention NNGP/NTK kernels on the CIFAR-10 (Krizhevsky, 2009) and IMDb reviews (Maas et al., 2011) datasets."
Dataset Splits | Yes | "The smaller scale experiments were run on a randomly selected subset of six thousand observations from the training set, with the 2K/4K train/validation split. Selected hyperparameters were then employed in the larger scale experiment with the usual 50K/10K train/test split." For IMDb sentiment classification, the paper reports test accuracies of simple NNGP/NTK models on the 25K/25K train/test split.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for its experiments.
Software Dependencies | No | The paper states that "Our experimental code utilises the JAX (Bradbury et al., 2018) and Neural Tangents (Novak et al., 2020) libraries," but does not specify version numbers for these libraries or for other dependencies such as Python or CUDA.
Experiment Setup | No | The paper states that "Exact details regarding data normalisation, hyperparameter tuning, and other experimental settings can be found in Appendix A.", deferring the specific experimental settings to the appendix rather than the main text.
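For context on the released code, below is a minimal sketch of how an attention NNGP/NTK model might be specified and queried with the Neural Tangents library that the paper extends. It is an illustration under assumed current `neural_tangents` APIs (`stax.GlobalSelfAttention`, `nt.predict.gradient_descent_mse_ensemble`), not the authors' exact experimental configuration; the architecture, widths, and data shapes are hypothetical.

import jax.random as random
import neural_tangents as nt
from neural_tangents import stax

# Hypothetical architecture: 1D convolution + global self-attention + readout,
# in the spirit of the paper's attention NNGP/NTK models (not the exact
# architecture used in its experiments).
init_fn, apply_fn, kernel_fn = stax.serial(
    stax.Conv(out_chan=256, filter_shape=(3,), padding='SAME'),
    stax.Relu(),
    stax.GlobalSelfAttention(
        n_chan_out=256, n_chan_key=256, n_chan_val=256, n_heads=4),
    stax.GlobalAvgPool(),
    stax.Dense(1),
)

# Toy data: 20 train / 5 test sequences of 32 tokens with 8 input channels.
key1, key2, key3 = random.split(random.PRNGKey(0), 3)
x_train = random.normal(key1, (20, 32, 8))
y_train = random.normal(key2, (20, 1))
x_test = random.normal(key3, (5, 32, 8))

# Closed-form infinite-width kernel between test and train inputs.
k_test_train = kernel_fn(x_test, x_train, 'nngp')  # or 'ntk'

# Exact GP posterior mean under the NNGP/NTK kernels
# (the default t=None corresponds to infinite training time).
predict_fn = nt.predict.gradient_descent_mse_ensemble(
    kernel_fn, x_train, y_train, diag_reg=1e-4)
y_test_nngp = predict_fn(x_test=x_test, get='nngp')
y_test_ntk = predict_fn(x_test=x_test, get='ntk')

The key point of this design is that `stax.serial` returns a closed-form `kernel_fn` alongside the usual finite-width `init_fn`/`apply_fn`, so the same architecture description yields both a trainable network and its infinite-width NNGP/NTK kernels.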