Data Augmentation for Learning to Play in Text-Based Games

Authors: Jinhyeon Kim, Kee-Eung Kim

IJCAI 2022

Reproducibility Variable Result LLM Response
Research Type Experimental We experiment on top of the state-of-the-art RL algorithms for text-based games, and show that our data augmentation techniques further improve their generalization performance. Moreover, we observe that even the simple DQN with a straightforward pre-processing of observations can achieve state-of-the-art performance in the TextWorld's Cooking Game benchmark [Côté et al., 2018; Adhikari et al., 2020].
Researcher Affiliation Collaboration Jinhyeon Kim¹,² and Kee-Eung Kim¹ — ¹Kim Jaechul Graduate School of AI, KAIST; ²Skelter Labs
Pseudocode Yes Algorithm 1 describes the pseudo-code for applying relabeling as data augmentation for learning to play text-based games.
Open Source Code Yes The code is publicly available at https://github.com/KAIST-AILab/transition-matching-permutation.
Open Datasets Yes In particular, we use the Cooking Game suite, which is one of the standard sets of games provided by TextWorld [Côté et al., 2018].
Dataset Splits Yes They split the processing methods (i.e. cooking and cutting) for each ingredient into training, validation, and test set. ... For each trial, we select the best model in terms of the validation set performance and report its test set performance.
Hardware Specification No The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments.
Software Dependencies No The paper mentions 'SpaCy' but does not provide a specific version number for it or any other software dependencies crucial for replication.
Experiment Setup Yes We train the agents for 300,000 episodes with a linear decay on the learning rate from 10⁻³ to 10⁻⁶. We use ϵ-greedy with the values of ϵ annealed from 1.0 to 0.1 over 200,000 episodes. We use the batch of size 256 for the parameter update. The update takes place after each game step of the batch environment interaction of size 64.
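The relabeling augmentation mentioned in the Pseudocode row (Algorithm 1 of the paper) can be sketched in plain Python. This is a minimal illustration under the assumption that relabeling permutes entity names consistently within a stored transition; the entity list, function names, and random-permutation choice here are hypothetical stand-ins, not the authors' transition-matching procedure.

```python
import random
import re

# Hypothetical entity vocabulary; the paper derives valid permutations via
# transition matching, which this sketch does not reproduce.
ENTITIES = ["red apple", "carrot", "purple potato"]

def substitute(text, perm):
    """Apply all renamings simultaneously. Regex alternation avoids the
    chained-replace bug where a->b followed by b->a undoes itself."""
    # Longer names first so overlapping entity names match greedily.
    keys = sorted(perm, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: perm[m.group(0)], text)

def relabel_transition(obs, action, next_obs, perm):
    """Rename entities consistently across every field of one transition."""
    return (substitute(obs, perm),
            substitute(action, perm),
            substitute(next_obs, perm))

def augment(transition, n_copies=2):
    """Yield relabeled copies of an (obs, action, next_obs) transition,
    ready to be pushed into a replay buffer alongside the original."""
    for _ in range(n_copies):
        perm = dict(zip(ENTITIES, random.sample(ENTITIES, len(ENTITIES))))
        yield relabel_transition(*transition, perm)
```

Applying the same permutation to the observation, action, and next observation is what keeps the relabeled transition dynamically consistent with the original one.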
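Both schedules in the Experiment Setup row (learning rate 10⁻³ → 10⁻⁶ over 300,000 episodes; ϵ from 1.0 to 0.1 over 200,000 episodes) are linear anneals, which a single helper can express. `linear_schedule` is a hypothetical helper for illustration, not code from the paper.

```python
def linear_schedule(start, end, duration, step):
    """Linearly interpolate from start to end over `duration` steps,
    then hold at `end` once the duration is exceeded."""
    frac = min(step / duration, 1.0)
    return start + frac * (end - start)

# Values at a hypothetical checkpoint, episode 100,000:
episode = 100_000
lr = linear_schedule(1e-3, 1e-6, 300_000, episode)  # learning-rate decay
eps = linear_schedule(1.0, 0.1, 200_000, episode)   # epsilon-greedy annealing
```

The `min(..., 1.0)` clamp matters because the ϵ schedule finishes at episode 200,000 while training continues to 300,000.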