Tracking the World State with Recurrent Entity Networks
Authors: Mikael Henaff, Jason Weston, Arthur Szlam, Antoine Bordes, Yann LeCun
ICLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we evaluate our model on three different datasets. Training details common to all experiments can be found in Appendix A. |
| Researcher Affiliation | Collaboration | ¹Facebook AI Research, ²Courant Institute, New York University |
| Pseudocode | No | The paper describes the model architecture and components in detail with equations, but it does not include a formal pseudocode block or algorithm listing. (A hedged sketch of the update equations appears after the table.) |
| Open Source Code | Yes | Code to reproduce these experiments can be found at https://github.com/facebook/MemNN/tree/master/EntNet-babi. |
| Open Datasets | Yes | We next evaluate our model on the bAbI tasks... We used version 1.2 of the dataset with 10k samples. |
| Dataset Splits | Yes | for each task we conducted 10 runs with different initializations and picked the best model based on performance on the validation set |
| Hardware Specification | No | The paper does not provide specific details about the hardware used, such as GPU or CPU models. It only mentions software and training configurations. |
| Software Dependencies | No | All models were implemented using Torch (Collobert et al., 2011). While Torch is mentioned, a specific version number for the software is not provided. |
| Experiment Setup | Yes | The quoted settings come from different experiments in the paper: "All models were trained with ADAM using a learning rate of η = 0.01, which was divided by 2 every 25 epochs until 200 epochs were reached... our model had embedding dimension size d = 100 and 20 memory slots... All models were trained using standard stochastic gradient descent (SGD) with a fixed learning rate of 0.001... Optimization was done with SGD or ADAM using minibatches of size 32, and gradients with norm greater than 40 were clipped to 40." (See the training-loop sketch after the table.) |
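Since the paper specifies the dynamic memory only through equations rather than pseudocode, the following is a minimal sketch of one memory update in PyTorch (the authors' released code is in Lua Torch). The gating, candidate, and normalization steps follow the paper's update equations, and the dimensions (d = 100, 20 memory slots) follow the reported setup; the class name, key initialization scale, and the epsilon in the normalization are assumptions.

```python
import torch
import torch.nn as nn

class DynamicMemory(nn.Module):
    """Sketch of the Recurrent Entity Network memory update,
    assuming d = 100 and 20 memory slots as reported."""

    def __init__(self, num_slots: int = 20, dim: int = 100):
        super().__init__()
        self.U = nn.Linear(dim, dim, bias=False)
        self.V = nn.Linear(dim, dim, bias=False)
        self.W = nn.Linear(dim, dim, bias=False)
        # Keys w_j (one per memory slot); init scale is an assumption.
        self.keys = nn.Parameter(torch.randn(num_slots, dim) * 0.1)
        self.phi = nn.PReLU()  # the paper's parametric ReLU non-linearity

    def forward(self, s: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        """Update memories h (batch, slots, dim) with encoded input s (batch, dim)."""
        w = self.keys.unsqueeze(0)   # (1, slots, dim)
        s = s.unsqueeze(1)           # (batch, 1, dim)
        # Gate: g_j = sigmoid(s . h_j + s . w_j)
        g = torch.sigmoid((s * h).sum(-1) + (s * w).sum(-1))
        # Candidate memory: h~_j = phi(U h_j + V w_j + W s)
        cand = self.phi(self.U(h) + self.V(w) + self.W(s))
        # Gated update, then normalization (the forgetting mechanism)
        h = h + g.unsqueeze(-1) * cand
        return h / (h.norm(dim=-1, keepdim=True) + 1e-8)
```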
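Likewise, a minimal training-loop sketch assembling the reported bAbI optimization settings: ADAM with η = 0.01 halved every 25 epochs until 200 epochs, minibatches of size 32, and gradient norms clipped to 40. The names `train_loader` and `compute_loss` are hypothetical placeholders, and PyTorch stands in for the paper's Torch implementation.

```python
import torch

model = DynamicMemory()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
# Halve the learning rate every 25 epochs, as reported for the bAbI runs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=25, gamma=0.5)

for epoch in range(200):
    for stories, answers in train_loader:  # minibatches of size 32
        optimizer.zero_grad()
        loss = compute_loss(model, stories, answers)  # hypothetical helper
        loss.backward()
        # Clip gradients with norm greater than 40, as reported.
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=40.0)
        optimizer.step()
    scheduler.step()
```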