Exploration Based Language Learning for Text-Based Games

Authors: Andrea Madotto, Mahdi Namazifar, Joost Huizinga, Piero Molino, Adrien Ecoffet, Huaixiu Zheng, Alexandros Papangelis, Dian Yu, Chandra Khatri, Gokhan Tur

IJCAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that this approach outperforms existing solutions in solving text-based games, and it is more sample efficient in terms of the number of interactions with the environment.
Researcher Affiliation | Collaboration | Andrea Madotto1, Mahdi Namazifar2, Joost Huizinga2, Piero Molino2, Adrien Ecoffet2, Huaixiu Zheng3, Dian Yu4, Alexandros Papangelis2, Chandra Khatri2, Gokhan Tur5; 1The Hong Kong University of Science and Technology, 2Uber AI, 3Google Brain, 4UC Davis, 5Amazon Alexa AI
Pseudocode | No | The paper describes its methods through prose and mathematical equations but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper links to open-source code for a baseline ('The authors released their code at https://github.com/xingdi-eric-yuan/TextWorld-Coin-Collector'), but it provides no link to, or statement about the availability of, the authors' own code for the proposed method.
Open Datasets | Yes | Coin Collector [Yuan et al., 2018] is a class of text-based games... Cooking World [Côté, 2018]: in this challenge, there are 4,440 games...
Dataset Splits | Yes | Zero-Shot: split the games into training, validation, and test sets, and then train our policy on the training games and test it on the unseen test games.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU models, or memory specifications used for running its experiments.
Software Dependencies | No | The paper mentions 'pre-trained GloVe of dimension 100 for the single setting and 300 for the joint one' and 'LSTM', but it does not specify software dependencies with version numbers (e.g., Python version, library versions such as PyTorch or TensorFlow, or specific solver versions).
Experiment Setup | Yes | In all the games the maximum number of steps has been set to 50. As mentioned earlier, the cell representation used in the Go-Explore archive is computed as the binning of the sum of embeddings of the room description tokens concatenated with the current cumulative reward. The sum of embeddings is computed using 50-dimensional pre-trained GloVe [Pennington et al., 2014] vectors. In the Coin Collector baselines we use the same hyper-parameters as in the original paper. In Cooking World all the baselines use pre-trained GloVe of dimension 100 for the single setting and 300 for the joint one. The LSTM hidden state has been set to 300 for all the models.
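The experiment-setup row above quotes the paper's description of the Go-Explore cell representation: the sum of GloVe embeddings of the room-description tokens is binned and concatenated with the current cumulative reward. The sketch below is a rough illustration of that idea only, not the authors' released code; the bin width, the tokenization, and the `glove` lookup table are assumptions.

```python
import numpy as np

def cell_representation(room_description, glove, cumulative_reward,
                        emb_dim=50, bin_width=1.0):
    """Return a hashable cell key for a Go-Explore-style archive.

    `glove` is assumed to be a dict mapping token -> 50-d numpy vector.
    `bin_width` is an assumed discretization step (not reported in the paper).
    """
    tokens = room_description.lower().split()
    summed = np.zeros(emb_dim)
    for tok in tokens:
        # Unknown tokens contribute a zero vector.
        summed += glove.get(tok, np.zeros(emb_dim))
    # Bin each dimension of the summed embedding.
    binned = tuple(np.floor(summed / bin_width).astype(int))
    # Concatenate the current cumulative reward to form the final cell key.
    return binned + (int(cumulative_reward),)

# Usage: states whose descriptions and rewards map to the same key are treated
# as the same archive cell, e.g. archive.setdefault(cell_key, state_info).
```

Binning the summed embedding collapses similar room descriptions into the same archive cell, which is what allows the exploration phase to track and return to distinct states.

The dataset-splits row describes the zero-shot setting: the games themselves are partitioned so the policy is trained on one set of games and evaluated on games it has never seen. A minimal sketch of such a game-level split follows; the split ratios and the random seed are assumptions, not values reported in the paper.

```python
import random

def zero_shot_split(game_files, valid_frac=0.1, test_frac=0.2, seed=0):
    """Partition a list of game files into train/validation/test sets."""
    games = list(game_files)
    random.Random(seed).shuffle(games)
    n_test = int(len(games) * test_frac)
    n_valid = int(len(games) * valid_frac)
    test = games[:n_test]
    valid = games[n_test:n_test + n_valid]
    train = games[n_test + n_valid:]
    return train, valid, test

# train_games, valid_games, test_games = zero_shot_split(all_game_files)
# The policy is trained only on train_games and evaluated on the unseen test_games.
```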
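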