Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Text-based RL Agents with Commonsense Knowledge: New Challenges, Environments and Baselines

Authors: Keerthiram Murugesan, Mattia Atzeni, Pavan Kapanipathi, Pushkar Shukla, Sadhana Kumaravel, Gerald Tesauro, Kartik Talamadupula, Mrinmaya Sachan, Murray Campbell (pp. 9018-9027)

AAAI 2021 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "In this section, we report the results of our experiments on the TWC games." |
| Researcher Affiliation | Collaboration | IBM Research, EPFL, TTI Chicago, ETH Zurich |
| Pseudocode | No | The paper describes the components of the framework in prose and with a block diagram (Figure 3) but does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | "Code and data can be found at https://github.com/IBM/commonsense-rl." |
| Open Datasets | Yes | "Code and data can be found at https://github.com/IBM/commonsense-rl." |
| Dataset Splits | No | The paper specifies a 'training set and two test sets' (in-distribution and out-of-distribution) but does not explicitly mention a separate 'validation' set for model training. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models or types of computing resources used for the experiments. |
| Software Dependencies | No | The paper mentions various software components and models like GloVe, Numberbatch, BERT, GPT2, and spaCy, but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | "Each agent is trained for 100 episodes and the results are averaged over 10 runs. Following one of the winning strategies in the First Text World Competition (Adolphs and Hofmann 2019), we use the Advantage Actor-Critic framework (Mnih et al. 2016) to train the agents using reward signals from the training games." |
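For context on the Experiment Setup row: the Advantage Actor-Critic framework (Mnih et al. 2016) cited there updates the policy using advantages, i.e., discounted returns minus the critic's value estimates. The sketch below is an illustrative, minimal implementation of that advantage computation only; it is not the paper's code, and all inputs (rewards, values, gamma) are made-up example data.

```python
# Illustrative sketch of the core A2C quantities (not the paper's code).
# The actor is updated with grad log pi(a_t|s_t) * A_t; the critic
# regresses V(s_t) toward the return G_t.

def discounted_returns(rewards, gamma=0.99, bootstrap=0.0):
    """Compute discounted returns G_t = r_t + gamma * G_{t+1},
    seeding the recursion with an optional bootstrap value."""
    returns = []
    g = bootstrap
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    return list(reversed(returns))

def advantages(rewards, values, gamma=0.99, bootstrap=0.0):
    """Advantage A_t = G_t - V(s_t) for each step of an episode."""
    returns = discounted_returns(rewards, gamma, bootstrap)
    return [g - v for g, v in zip(returns, values)]

# Example: a 3-step episode with a single terminal reward and
# a constant (hypothetical) value estimate of 0.5 at every step.
rews = [0.0, 0.0, 1.0]
vals = [0.5, 0.5, 0.5]
print(advantages(rews, vals, gamma=1.0))  # [0.5, 0.5, 0.5]
```

With gamma = 1.0 every step's return is the terminal reward, so each advantage is simply 1.0 minus the value estimate; with gamma < 1.0 earlier steps receive geometrically discounted credit.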