Learning Useful Representations of Recurrent Neural Network Weight Matrices

Authors: Vincent Herrmann, Francesco Faccio, Jürgen Schmidhuber

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct empirical analyses and comparisons across the different encoder architectures using these datasets, showing which encoders are more effective.
Researcher Affiliation | Academia | The Swiss AI Lab IDSIA, USI & SUPSI; AI Initiative, KAUST.
Pseudocode | No | The paper describes methods and architectures but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | We release the first two model zoo datasets for RNN weight representation learning. One consists of generative models of a class of formal languages, and the other of classifiers of sequentially processed MNIST digits. https://github.com/vincentherrmann/rnn-weights-representation-learning
Open Datasets | Yes | To evaluate the methods described and foster further research, we develop and release two model zoo datasets for RNNs. ... https://github.com/vincentherrmann/rnn-weights-representation-learning
Dataset Splits | Yes | The datasets are divided into training, validation, and out-of-distribution (OOD) test splits, with tasks in each split being non-overlapping. (A sketch of such a task-disjoint split follows the table.)
Hardware Specification | Yes | We also thank NVIDIA Corporation for donating a DGX-1 as part of the Pioneers of AI Research Award.
Software Dependencies | No | The paper mentions the AdamW optimizer and a learning rate schedule, but it does not specify software versions for programming languages, libraries, or other dependencies needed to reproduce the experiments.
Experiment Setup | Yes | The hyperparameters of these encoders are selected to ensure a comparable number of parameters across all models. Each encoder generates a 16-dimensional representation z. An LSTM with two layers functions as the emulator Aξ. The conditioning of Aξ on an RNN fθ is implemented by incorporating a linear projection of the corresponding representation z to the BOS token of the input sequence of Aξ. More details and hyperparameters can be found in Appendix D. (A sketch of this conditioning mechanism follows the table.)
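To make the Dataset Splits row concrete, here is a minimal sketch of a task-disjoint partition, where tasks (not individual models) are divided so that the training, validation, and OOD test splits share no tasks. The function and variable names (make_task_splits, task_ids) are hypothetical and not taken from the released code.

```python
import random

def make_task_splits(task_ids, val_frac=0.1, test_frac=0.1, seed=0):
    """Partition task IDs into non-overlapping train/val/OOD-test sets."""
    rng = random.Random(seed)
    tasks = list(task_ids)
    rng.shuffle(tasks)
    n_test = int(len(tasks) * test_frac)
    n_val = int(len(tasks) * val_frac)
    test = tasks[:n_test]
    val = tasks[n_test:n_test + n_val]
    train = tasks[n_test + n_val:]
    # Tasks in each split are non-overlapping by construction.
    assert not (set(train) & set(val)) and not (set(train) & set(test))
    return train, val, test

train_tasks, val_tasks, ood_test_tasks = make_task_splits(range(100))
```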
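The conditioning mechanism quoted in the Experiment Setup row can be illustrated with a short PyTorch sketch: the 16-dimensional representation z is linearly projected and added to the embedding of the BOS token before the two-layer LSTM emulator processes the sequence. The module names, vocabulary and hidden sizes, and overall interface here are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ConditionedEmulator(nn.Module):
    """Hypothetical emulator A_xi conditioned on a representation z of an RNN f_theta."""
    def __init__(self, vocab_size, d_model=128, z_dim=16, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.z_proj = nn.Linear(z_dim, d_model)  # projects z into the token embedding space
        self.lstm = nn.LSTM(d_model, d_model, num_layers=num_layers, batch_first=True)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens, z):
        # tokens: (batch, seq_len), with tokens[:, 0] being the BOS token
        # z:      (batch, z_dim), the 16-dimensional representation of the RNN
        x = self.embed(tokens)
        # Condition the emulator by adding the projected z to the BOS embedding.
        bos = x[:, :1] + self.z_proj(z).unsqueeze(1)
        x = torch.cat([bos, x[:, 1:]], dim=1)
        h, _ = self.lstm(x)
        return self.head(h)  # next-token logits

emulator = ConditionedEmulator(vocab_size=32)
logits = emulator(torch.zeros(4, 10, dtype=torch.long), torch.randn(4, 16))
```

Injecting z only at the BOS position leaves the rest of the input sequence untouched, so a single emulator can imitate many different RNNs purely through the representation it is conditioned on.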