Data Augmentation for Learning to Play in Text-Based Games
Authors: Jinhyeon Kim, Kee-Eung Kim
IJCAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experiment on top of the state-of-the-art RL algorithms for text-based games, and show that our data augmentation techniques further improve their generalization performance. Moreover, we observe that even the simple DQN with a straightforward pre-processing of observations can achieve state-of-the-art performance in the TextWorld Cooking Game benchmark [Côté et al., 2018; Adhikari et al., 2020]. |
| Researcher Affiliation | Collaboration | Jinhyeon Kim¹﹐² and Kee-Eung Kim¹ — ¹Kim Jaechul Graduate School of AI, KAIST; ²Skelter Labs |
| Pseudocode | Yes | Algorithm 1 describes the pseudo-code for applying relabeling as data augmentation for learning to play text-based games. |
| Open Source Code | Yes | The code is publicly available at https://github.com/KAIST-AILab/transition-matching-permutation. |
| Open Datasets | Yes | In particular, we use the Cooking Game suite, which is one of the standard sets of games provided by TextWorld [Côté et al., 2018]. |
| Dataset Splits | Yes | They split the processing methods (i.e. cooking and cutting) for each ingredient into training, validation, and test set. ... For each trial, we select the best model in terms of the validation set performance and report its test set performance. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'SpaCy' but does not provide a specific version number for it or any other software dependencies crucial for replication. |
| Experiment Setup | Yes | We train the agents for 300,000 episodes with a linear decay on the learning rate from 10⁻³ to 10⁻⁶. We use ϵ-greedy with the values of ϵ annealed from 1.0 to 0.1 over 200,000 episodes. We use a batch of size 256 for the parameter update. The update takes place after each game step of the batch environment interaction of size 64. |
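The linear schedules quoted in the Experiment Setup row can be sketched as follows. This is a minimal illustration of the described hyperparameter schedules, not the authors' implementation; the function names are hypothetical.

```python
def linear_schedule(start: float, end: float, duration: int, step: int) -> float:
    """Linearly interpolate from `start` to `end`, holding `end` after `duration` steps."""
    frac = min(step, duration) / duration
    return start + frac * (end - start)

def learning_rate(episode: int) -> float:
    # Linear decay from 1e-3 to 1e-6 over the full 300,000 training episodes.
    return linear_schedule(1e-3, 1e-6, 300_000, episode)

def epsilon(episode: int) -> float:
    # Epsilon annealed from 1.0 to 0.1 over the first 200,000 episodes,
    # then held at 0.1 for the remainder of training.
    return linear_schedule(1.0, 0.1, 200_000, episode)
```

For example, `epsilon(100_000)` is 0.55 (halfway through the anneal), and `epsilon(250_000)` stays clamped at 0.1.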