Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Self-Attentive Associative Memory
Authors: Hung Le, Truyen Tran, Svetha Venkatesh
ICML 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We achieve competitive results with our proposed two-memory model in a diversity of machine learning tasks, from challenging synthetic problems to practical testbeds such as geometry, graph, reinforcement learning, and question answering. |
| Researcher Affiliation | Academia | Hung Le¹, Truyen Tran¹, Svetha Venkatesh¹. ¹Applied AI Institute, Deakin University, Geelong, Australia. Correspondence to: Hung Le <EMAIL>. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our source code is available at https://github.com/thaihungle/SAM. |
| Open Datasets | Yes | We test different model configurations on two classical tasks for sequential and relational learning: associative retrieval (Ba et al., 2016a) and Nth-farthest (Santoro et al., 2018); Algorithmic synthetic tasks (Graves et al., 2014); Convex hull, Traveling salesman problem (TSP) from Vinyals et al. (2015); bAbI is a question answering dataset (Weston et al., 2015); We apply our memory to LSTM agents in Atari game environment using A3C training (Mnih et al., 2016). |
| Dataset Splits | No | The paper mentions training and testing but does not provide specific dataset split information (exact percentages, sample counts, or explicit predefined splits) for validation purposes. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments, only mentioning general environments like "Atari game environment". |
| Software Dependencies | No | The paper mentions the use of an "Adam" optimizer but does not specify software names with version numbers (e.g., Python 3.x, PyTorch x.x.x). |
| Experiment Setup | Yes | We run our STM with different nq = 1, 4, 8 using the same problem setting (8 16-dimensional input vectors), optimizer (Adam), batch size (1600) as in Santoro et al. (2018). We evaluate our model STM (nq = 8, d = 96) with the 4 following baselines: LSTM (Hochreiter & Schmidhuber, 1997), attentional LSTM (Bahdanau et al., 2015), NTM (Graves et al., 2014) and RMC (Santoro et al., 2018). We ablate our STM (d = 96, full features) by creating three other versions: small STM with transfer (d = 48), small STM without transfer (d = 48, w/o transfer) and STM without gates (d = 96, w/o gates). nq is fixed to 1 as the task does not require much relational learning. |