Tracking the World State with Recurrent Entity Networks

Authors: Mikael Henaff, Jason Weston, Arthur Szlam, Antoine Bordes, Yann LeCun

ICLR 2017

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "In this section we evaluate our model on three different datasets. Training details common to all experiments can be found in Appendix A." |
| Researcher Affiliation | Collaboration | Facebook AI Research; Courant Institute, New York University |
| Pseudocode | No | The paper describes the model architecture and components in detail with equations, but it does not include a formal pseudocode block or algorithm listing (a hedged sketch of the update equations appears below this table). |
| Open Source Code | Yes | "Code to reproduce these experiments can be found at https://github.com/facebook/MemNN/tree/master/EntNet-babi." |
| Open Datasets | Yes | "We next evaluate our model on the bAbI tasks... We used version 1.2 of the dataset with 10k samples." |
| Dataset Splits | Yes | "for each task we conducted 10 runs with different initializations and picked the best model based on performance on the validation set" |
| Hardware Specification | No | The paper does not provide specific details about the hardware used, such as GPU or CPU models; it only mentions software and training configurations. |
| Software Dependencies | No | "All models were implemented using Torch (Collobert et al., 2011)." While Torch is mentioned, a specific version number is not provided. |
| Experiment Setup | Yes | "All models were trained with ADAM using a learning rate of η = 0.01, which was divided by 2 every 25 epochs until 200 epochs were reached... our model had embedding dimension size d = 100 and 20 memory slots... All models were trained using standard stochastic gradient descent (SGD) with a fixed learning rate of 0.001... Optimization was done with SGD or ADAM using minibatches of size 32, and gradients with norm greater than 40 were clipped to 40." (A training-loop sketch follows this table.) |
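
As the Pseudocode row notes, the paper specifies the model only through equations: each memory slot j has a key w_j and hidden state h_j, and an encoded input s_t updates the slot via a gate g_j = σ(s_t·h_j + s_t·w_j), a candidate h̃_j = φ(U h_j + V w_j + W s_t), the write h_j ← h_j + g_j ⊙ h̃_j, and a normalization step. Below is a minimal NumPy sketch of one such update, not the authors' code; the function name is ours, and tanh stands in for the parametric ReLU the paper uses for φ.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def entnet_update(h, w, s, U, V, W, phi=np.tanh):
    """One dynamic-memory update of a Recurrent Entity Network (sketch).

    h: (m, d) hidden states, one per memory slot
    w: (m, d) key vectors, one per memory slot
    s: (d,)   encoded input sentence s_t
    U, V, W: (d, d) transformation matrices shared across slots
    """
    # Gate: relevance of the input to each slot's content and key
    g = sigmoid(h @ s + w @ s)                   # shape (m,)
    # Candidate new content for every slot
    h_tilde = phi(h @ U.T + w @ V.T + s @ W.T)   # shape (m, d)
    # Gated write, then normalization (acts as a forget mechanism)
    h = h + g[:, None] * h_tilde
    return h / (np.linalg.norm(h, axis=1, keepdims=True) + 1e-8)

# Dimensions from the paper's bAbI setup: 20 memory slots, d = 100
m, d = 20, 100
rng = np.random.default_rng(0)
h, w = rng.standard_normal((m, d)), rng.standard_normal((m, d))
s = rng.standard_normal(d)
U, V, W = (rng.standard_normal((d, d)) for _ in range(3))
h = entnet_update(h, w, s, U, V, W)
```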
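The Experiment Setup row quotes settings from different experiments (ADAM with η = 0.01 for bAbI, SGD with a fixed rate of 0.001 elsewhere). As a non-authoritative illustration of the bAbI configuration, here is a PyTorch sketch; the paper itself used Torch (Lua), and the model, data, and loss below are placeholders, while the optimizer, learning-rate schedule, clipping threshold, and batch size follow the quoted values.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder model and synthetic data; only the training configuration
# mirrors the paper's quoted bAbI setup.
model = torch.nn.Linear(100, 100)
data = TensorDataset(torch.randn(320, 100), torch.randn(320, 100))
loader = DataLoader(data, batch_size=32, shuffle=True)   # minibatches of size 32

opt = torch.optim.Adam(model.parameters(), lr=0.01)      # ADAM, eta = 0.01
# Divide the learning rate by 2 every 25 epochs
sched = torch.optim.lr_scheduler.StepLR(opt, step_size=25, gamma=0.5)

for epoch in range(200):                                 # until 200 epochs
    for x, y in loader:
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)  # placeholder loss
        loss.backward()
        # Clip gradients with norm greater than 40 to 40
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=40.0)
        opt.step()
    sched.step()
```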