Memory Augmented Control Networks

Authors: Arbaaz Khan, Clark Zhang, Nikolay Atanasov, Konstantinos Karydis, Vijay Kumar, Daniel D. Lee

ICLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "To investigate the performance of MACN, we design our experiments to answer three key questions:" and "We first demonstrate that MACN can learn how to plan in a 2D grid world environment." and "We present our results in Table 1."
Researcher Affiliation | Academia | "Arbaaz Khan, Clark Zhang, Nikolay Atanasov, Konstantinos Karydis, Vijay Kumar, Daniel D. Lee, GRASP Laboratory, University of Pennsylvania"
Pseudocode | No | The paper describes the computational steps and architecture in narrative text and figures, but does not include a formally structured pseudocode or algorithm block.
Open Source Code | No | The paper does not contain any explicit statement about making the source code publicly available, nor does it provide a link to a code repository.
Open Datasets | No | "For the grid world with simple obstacles, we observe that the MACN performs better when trained with curriculum (Bengio et al., 2009). This is expected since both the original VIN paper and the DNC paper show that better results are achieved when trained with curriculum. For establishing baselines, the VIN and the CNN+Memory models are also trained with curriculum learning. In the grid world environment it is easy to define tasks that are harder than other tasks to aid with curriculum training. For a grid world with size (m, n) we increase the difficulty of the task by increasing the number of obstacles and the maximum size of the obstacles. Thus, for a 32 × 32 grid, we start with a maximum of 2 obstacles and the maximum size being 2 × 2. Both parameters are then increased gradually. The optimal action in the grid world experiments is generated by A* (Russell & Norvig, 2003)." The paper describes how environments are *generated* or *defined* for the experiments, but does not provide concrete access (a link, DOI, or specific citation for a public dataset) to a dataset used for training.
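The quoted curriculum procedure (start a 32 × 32 grid with at most 2 obstacles of maximum size 2 × 2, then gradually increase both parameters) can be sketched as a difficulty schedule. This is an illustrative reconstruction, not the authors' code; the function name, step sizes, and number of stages are assumptions:

```python
def curriculum_stages(grid_size=32, start_obstacles=2, start_size=2,
                      obstacle_step=2, size_step=2, num_stages=4):
    """Yield (max_obstacles, max_obstacle_size) pairs of increasing difficulty.

    Only the starting values (2 obstacles, 2x2 max size, 32x32 grid) come
    from the quoted text; the increments and stage count are illustrative.
    """
    stages = []
    max_obstacles, max_size = start_obstacles, start_size
    for _ in range(num_stages):
        # Cap obstacle size so obstacles cannot dominate the grid.
        stages.append((max_obstacles, min(max_size, grid_size // 2)))
        max_obstacles += obstacle_step
        max_size += size_step
    return stages

print(curriculum_stages())  # [(2, 2), (4, 4), (6, 6), (8, 8)]
```

Each stage would then be used to generate training grids of that difficulty before moving to the next, mirroring the "both parameters are then increased gradually" description.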
Dataset Splits | No | The paper mentions training and test sets but does not explicitly describe a validation set or its split information.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU or CPU models, memory, or cloud computing instance types used for running the experiments.
Software Dependencies | No | The paper mentions software components and algorithms such as RMSProp, LSTM, DQN, and A3C, but does not specify version numbers for any libraries, frameworks, or programming languages used.
Experiment Setup | Yes | "The optimizer used is the RMSProp and we use a learning rate of 0.0001 for our experiments." and "The external memory has 32 slots with word size 8 and we use 4 read heads and 1 write head." and "The input image of dimension [m × n × 2] is first convolved with a kernel of size (3 × 3), 150 channels and stride of 1 everywhere." and "The K (parameter corresponding to number of iterations of value iteration) here is 40. The network controller is a LSTM with 512 hidden units and the external memory has 1024 rows and a word size of 512. We use 16 write heads and 4 read heads in the access module."
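The quoted hyperparameters can be collected into a single configuration sketch for anyone attempting a reimplementation. Only the numeric values come from the quotes; the dictionary layout and key names are assumptions:

```python
# Hyperparameters quoted from the paper, gathered into one place.
# Key names are illustrative; values are taken verbatim from the quoted text.
MACN_CONFIG = {
    "optimizer": "RMSProp",
    "learning_rate": 1e-4,
    # Small (grid-world) external memory: 32 slots, word size 8,
    # 4 read heads, 1 write head.
    "small_memory": {"slots": 32, "word_size": 8,
                     "read_heads": 4, "write_heads": 1},
    # First convolution on the [m, n, 2] input image.
    "conv1": {"kernel": (3, 3), "channels": 150, "stride": 1},
    # K iterations of value iteration in the VIN module.
    "value_iteration_K": 40,
    # Larger setup: LSTM controller with DNC-style external memory.
    "controller_lstm_hidden": 512,
    "large_memory": {"rows": 1024, "word_size": 512,
                     "read_heads": 4, "write_heads": 16},
}
```

A reimplementation would still need details the paper leaves unspecified (e.g. batch size, training iterations, and weight initialization), which is consistent with the partial "Yes" character of this row.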