Memory-Augmented Theory of Mind Network

Authors: Dung Nguyen, Phuoc Nguyen, Hung Le, Kien Do, Svetha Venkatesh, Truyen Tran

AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate ToMMY on multiple tasks, including predicting preference, intention, action, and successor representations, as well as assessing false-belief understanding. For simplicity, we set the memory keys to the forward LSTM states ($k_j^t = \overrightarrow{h}_j^t$ of Eq. (1)). Similarly, the memory values are set to either the forward LSTM states ($v_j^t = \overrightarrow{h}_j^t$ of Eq. (1)) or the concatenation of both forward and backward LSTM states ($v_j^t = \langle \overrightarrow{h}_j^t, \overleftarrow{h}_j^t \rangle$ of Eq. (1) and Eq. (2)). The latter is called Bi-ToMMY. The number of queries is set as M = 10. In practice, some actions (such as pick-up) happen far less frequently than others in the whole sequence. Thus we use a replay buffer to balance the action classes in training. As the replay buffer plays the role of episodic memory in the learning process, we call this balancing strategy action-based episodic memory (AEM). For comparison, we implemented a recent representative neural ToM network called ToMnet (Rabinowitz et al. 2018).
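To make the quoted setup concrete, here is a minimal PyTorch sketch of the memory construction and the AEM-style balancing. Only the key/value assignments, the Bi-ToMMY concatenation, the M = 10 queries, and the idea of balancing action classes come from the text above; the module names, layer sizes, dot-product attention readout, and bucket-sampling scheme are assumptions, not the authors' code.

```python
import random
from collections import defaultdict

import torch
import torch.nn as nn


class MemoryReadout(nn.Module):
    """Sketch of the quoted memory construction: keys from the forward
    LSTM (Eq. (1)), values from the forward state (ToMMY) or the
    concatenation of forward and backward states (Bi-ToMMY, Eq. (2)),
    read out by M learned queries. Sizes and readout are assumed."""

    def __init__(self, obs_dim, hidden_dim=64, num_queries=10, bidirectional=True):
        super().__init__()
        self.bidirectional = bidirectional
        self.fwd_lstm = nn.LSTM(obs_dim, hidden_dim, batch_first=True)  # Eq. (1)
        self.bwd_lstm = nn.LSTM(obs_dim, hidden_dim, batch_first=True)  # Eq. (2)
        # M = 10 learned query vectors that attend over the episodic memory.
        self.queries = nn.Parameter(torch.randn(num_queries, hidden_dim))

    def forward(self, obs_seq):  # obs_seq: (batch, time, obs_dim)
        h_fwd, _ = self.fwd_lstm(obs_seq)
        keys = h_fwd  # k_j^t = forward LSTM state
        if self.bidirectional:  # Bi-ToMMY: v_j^t = <h_fwd, h_bwd>
            h_bwd, _ = self.bwd_lstm(torch.flip(obs_seq, dims=[1]))
            values = torch.cat([h_fwd, torch.flip(h_bwd, dims=[1])], dim=-1)
        else:  # ToMMY: v_j^t = forward LSTM state only
            values = h_fwd
        # Dot-product attention: (batch, M, time) weights over memory slots.
        attn = torch.softmax(self.queries @ keys.transpose(1, 2), dim=-1)
        return attn @ values  # (batch, M, value_dim)


class ActionBalancedBuffer:
    """Sketch of the AEM idea: one bucket per action class, sampled
    uniformly so rare actions (e.g. pick-up) are not drowned out.
    The bucket scheme is an assumed reading of the balancing strategy."""

    def __init__(self):
        self.buckets = defaultdict(list)

    def add(self, action, transition):
        self.buckets[action].append(transition)

    def sample(self, batch_size):
        actions = list(self.buckets)
        return [random.choice(self.buckets[random.choice(actions)])
                for _ in range(batch_size)]
```

Uniform sampling over action buckets gives a rare action the same expected share of a training batch as a frequent one, which is one plausible reading of the class-balancing the authors describe.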
Researcher Affiliation | Academia | Applied Artificial Intelligence Institute (A2I2), Deakin University, Geelong, Australia. {dung.nguyen,phuoc.nguyen,thai.le,k.do,svetha.venkatesh,truyen.tran}@deakin.edu.au
Pseudocode | No | The paper describes its methods in text and mathematical equations, but it contains no block or section explicitly labeled "Pseudocode" or "Algorithm".
Open Source Code | No | The paper provides neither a link to a source-code repository nor an explicit statement about releasing the code for the described methodology.
Open Datasets | Yes | To study ToM models, we created a multi-light-room environment using the gym-minigrid framework (Chevalier-Boisvert, Willems, and Pal 2018) (see Fig. 2).
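The environment itself is not linked from this assessment, but for orientation, a hypothetical gym-minigrid subclass with three doorway-connected rooms might look like the sketch below. The class name, room layout, and mission string are illustrative only, and the paper's light mechanics are not reproduced.

```python
from gym_minigrid.minigrid import MiniGridEnv, Grid, Wall


class MultiLightRoomEnv(MiniGridEnv):
    """Hypothetical stand-in for the paper's multi-light-room
    environment: a corridor of `num_rooms` rooms joined by doorways."""

    def __init__(self, num_rooms=3, room_size=5):
        self.num_rooms = num_rooms
        self.room_size = room_size
        super().__init__(width=num_rooms * room_size + 1,
                         height=room_size + 2,
                         max_steps=200)

    def _gen_grid(self, width, height):
        self.grid = Grid(width, height)
        self.grid.wall_rect(0, 0, width, height)  # outer walls
        # Internal walls split the corridor into rooms; one gap per
        # wall acts as a doorway between neighbouring rooms.
        for i in range(1, self.num_rooms):
            x = i * self.room_size
            for y in range(1, height - 1):
                self.grid.set(x, y, Wall())
            self.grid.set(x, height // 2, None)  # doorway
        self.place_agent()
        self.mission = "navigate the light-rooms"


env = MultiLightRoomEnv()
obs = env.reset()  # old-style gym API used by gym-minigrid 1.x
```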
Dataset Splits | No | The paper mentions "We trained ToM models in episodes with three light-rooms and test models under different conditions", but it does not provide specific training/validation/test split sizes (percentages or counts), nor does it explicitly mention a validation set.
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU or CPU models.
Software Dependencies | No | The paper mentions using the gym-minigrid framework, but it does not specify version numbers for it or for any other software dependency, which are necessary for reproducibility.
Experiment Setup | Yes | For simplicity, we set the memory keys to the forward LSTM states ($k_j^t = \overrightarrow{h}_j^t$ of Eq. (1)). Similarly, the memory values are set to either the forward LSTM states ($v_j^t = \overrightarrow{h}_j^t$ of Eq. (1)) or the concatenation of both forward and backward LSTM states ($v_j^t = \langle \overrightarrow{h}_j^t, \overleftarrow{h}_j^t \rangle$ of Eq. (1) and Eq. (2)). The latter is called Bi-ToMMY. The number of queries is set as M = 10.
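Reconstructed from the extraction-garbled notation, the key/value assignments read as follows; the arrow notation (forward state from Eq. (1), backward state from Eq. (2)) and the angle-bracket concatenation are inferred from context:

```latex
k_j^t = \overrightarrow{h}_j^t,
\qquad
v_j^t =
\begin{cases}
  \overrightarrow{h}_j^t & \text{(ToMMY)}\\[2pt]
  \bigl\langle \overrightarrow{h}_j^t,\ \overleftarrow{h}_j^t \bigr\rangle & \text{(Bi-ToMMY)}
\end{cases}
```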