Hopfield Networks is All You Need

Authors: Hubert Ramsauer, Bernhard Schäfl, Johannes Lehner, Philipp Seidl, Michael Widrich, Lukas Gruber, Markus Holzleitner, Thomas Adler, David Kreil, Michael K Kopp, Günter Klambauer, Johannes Brandstetter, Sepp Hochreiter

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the broad applicability of the Hopfield layers across various domains. Hopfield layers improved state-of-the-art on three out of four considered multiple instance learning problems as well as on immune repertoire classification with several hundreds of thousands of instances. On the UCI benchmark collections of small classification tasks, where deep learning methods typically struggle, Hopfield layers yielded a new state-of-the-art when compared to different machine learning methods. Finally, Hopfield layers achieved state-of-the-art on two drug design datasets.
Researcher Affiliation | Academia | Hubert Ramsauer, Bernhard Schäfl, Johannes Lehner, Philipp Seidl, Michael Widrich, Thomas Adler, Lukas Gruber, Markus Holzleitner, David Kreil, Michael Kopp, Günter Klambauer, Johannes Brandstetter, Sepp Hochreiter; ELLIS Unit Linz, LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria; Institute of Advanced Research in Artificial Intelligence (IARAI); Email: {ramsauer,schaefl,brandstetter,hochreit}@ml.jku.at
Pseudocode | No | The paper provides mathematical formulations and derivations but does not include structured pseudocode or algorithm blocks (e.g., labeled "Pseudocode" or "Algorithm").
Open Source Code | Yes | The implementation is available at: https://github.com/ml-jku/hopfield-layers
Open Datasets | Yes | On the UCI benchmark collections of small classification tasks, where deep learning methods typically struggle, Hopfield layers yielded a new state-of-the-art when compared to different machine learning methods.
Dataset Splits | Yes | All models were trained for 100 epochs with a mini-batch size of 4 samples using the cross-entropy loss and the PyTorch SGD module for stochastic gradient descent without momentum and without weight decay or dropout. After each epoch, the model accuracy was computed on a separate validation set. Using early stopping, the model with the best validation set accuracy averaged over 16 consecutive epochs was selected as the final model. [See the training-loop sketch below the table.]
Hardware Specification | Yes | The training of such a BERT-small model for 1.45 million update steps takes roughly four days on a single NVIDIA V100 GPU.
Software Dependencies | No | The paper mentions software such as "Hugging Face Inc." and "PyTorch", but does not specify exact version numbers for these or other libraries/packages.
Experiment Setup | Yes | For the MIL datasets... Among other hyperparameters, different hidden layer widths (for the fully connected pre- and post-Hopfield-pooling layers), learning rates, and batch sizes were tried. Additionally, the focus resided on the hyperparameters of the Hopfield pooling layer, among which were the number of heads, the head dimension, and the scaling factor β. All models were trained for 160 epochs using the AdamW optimizer (Loshchilov & Hutter, 2017) with exponential learning rate decay (see Table A.2). [See the pooling-layer sketch below the table.]
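
The dataset-splits row quotes a training protocol: SGD without momentum or weight decay, cross-entropy loss, per-epoch validation, and selection of the checkpoint whose validation accuracy averaged over 16 consecutive epochs is best. Below is a minimal PyTorch sketch of one plausible reading of that selection rule; the model, data loaders, learning rate, and the trailing-window interpretation of "averaged over 16 consecutive epochs" are illustrative assumptions, not the authors' code.

```python
import copy
import torch
from torch import nn

def train_with_window_selection(model, train_loader, val_loader,
                                epochs=100, window=16, lr=1e-3):
    """Sketch of the quoted protocol: SGD without momentum or weight decay,
    cross-entropy loss, per-epoch validation, and selection of the checkpoint
    with the best validation accuracy averaged over a window of epochs.
    The learning rate and windowing details are illustrative assumptions."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)  # momentum=0, weight_decay=0 by default
    criterion = nn.CrossEntropyLoss()
    val_history, best_avg, best_state = [], float("-inf"), None

    for _ in range(epochs):
        model.train()
        for x, y in train_loader:          # mini-batches of 4 samples in the quoted setup
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()

        model.eval()                        # validation accuracy after each epoch
        correct, total = 0, 0
        with torch.no_grad():
            for x, y in val_loader:
                correct += (model(x).argmax(dim=-1) == y).sum().item()
                total += y.numel()
        val_history.append(correct / max(total, 1))

        # Keep the checkpoint where the trailing-window average of validation accuracy peaks.
        if len(val_history) >= window:
            avg = sum(val_history[-window:]) / window
            if avg > best_avg:
                best_avg, best_state = avg, copy.deepcopy(model.state_dict())

    if best_state is not None:
        model.load_state_dict(best_state)
    return model, best_avg
```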
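
The experiment-setup row names the main hyperparameters of the Hopfield pooling layer: the number of heads, the head dimension, and the scaling factor β, which acts as the inverse temperature in the paper's update rule ξ_new = X softmax(β Xᵀ ξ). The sketch below shows a plain-PyTorch attention-style pooling module that exposes these three knobs, together with AdamW and exponential learning-rate decay as quoted above. It is a simplified illustration of the mechanism, not the official layer from https://github.com/ml-jku/hopfield-layers, and the class name, learning rate, and decay factor are assumptions.

```python
import torch
from torch import nn

class AttentionPooling(nn.Module):
    """Simplified multi-head attention pooling in the spirit of Hopfield pooling:
    learnable query ("state") patterns attend over a bag of instances, with the
    softmax scaled by beta. Illustrative sketch, not the official HopfieldPooling."""

    def __init__(self, input_dim, num_heads=8, head_dim=16, beta=None):
        super().__init__()
        self.num_heads, self.head_dim = num_heads, head_dim
        # Default scaling 1/sqrt(head_dim), as in standard scaled dot-product attention.
        self.beta = beta if beta is not None else head_dim ** -0.5
        self.query = nn.Parameter(torch.randn(num_heads, head_dim))   # one query per head
        self.key = nn.Linear(input_dim, num_heads * head_dim, bias=False)
        self.value = nn.Linear(input_dim, num_heads * head_dim, bias=False)

    def forward(self, x):
        # x: (batch, bag_size, input_dim)
        b, n, _ = x.shape
        k = self.key(x).view(b, n, self.num_heads, self.head_dim)
        v = self.value(x).view(b, n, self.num_heads, self.head_dim)
        scores = torch.einsum("hd,bnhd->bnh", self.query, k) * self.beta
        attn = scores.softmax(dim=1)                  # softmax over the bag dimension
        pooled = torch.einsum("bnh,bnhd->bhd", attn, v)
        return pooled.reshape(b, self.num_heads * self.head_dim)

# AdamW with exponential learning-rate decay, as in the quoted setup
# (learning rate and decay factor are illustrative assumptions).
pool = AttentionPooling(input_dim=32, num_heads=8, head_dim=16, beta=0.1)
optimizer = torch.optim.AdamW(pool.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.98)
```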