CrowdPlay: Crowdsourcing Human Demonstrations for Offline Learning

Authors: Matthias Gerstgrasser, Rakshit Trivedi, David C. Parkes

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this paper, we aim to fill a gap at the intersection of these two: the use of crowdsourcing to generate large-scale human demonstration data in support of advancing research into imitation learning and offline learning. To this end, we present CrowdPlay, a complete crowdsourcing pipeline for any standard RL environment including OpenAI Gym (made available under an open-source license); a large-scale publicly available crowdsourced dataset of human gameplay demonstrations in Atari 2600 games, including multimodal behavior and human-human and human-AI multiagent data; offline learning benchmarks with extensive human data evaluation; and a detailed study of incentives, including real-time feedback to drive high-quality data.
Researcher Affiliation | Academia | Matthias Gerstgrasser, Rakshit Trivedi & David C. Parkes, School of Engineering and Applied Sciences, Harvard University, {matthias,rstrivedi,parkes}@seas.harvard.edu
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our code and dataset are available at https://mgerstgrasser.github.io/crowdplay/.
Open Datasets | Yes | Our code and dataset are available at https://mgerstgrasser.github.io/crowdplay/. ... a large-scale publicly available crowdsourced dataset of human gameplay demonstrations in Atari 2600 games
Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits (e.g., percentages or counts) for its experiments. It mentions data processing steps such as downsampling, but not how the data were partitioned for model training and evaluation.
Hardware Specification | No | The paper mentions deployment on "Amazon Elastic Beanstalk (EB)" and implies the use of "multiple processor cores" and "AWS Spot Instances" for scalability. However, it does not specify concrete hardware details such as exact GPU/CPU models or memory amounts used for running the experiments or training the models.
Software Dependencies | Yes | For the analysis performed in this paper, the t-SNE embeddings were generated using scikit-learn version 1.0.1... We performed all the experiments using the open-source implementations of the algorithms that are provided as part of the d3rlpy library. ...We used Ray/RLlib version 1.4.0, and used the default hyperparameters for A2C therein.
Experiment Setup | Yes | Table 7: Hyper-Parameter Configuration Table. This table explicitly lists hyperparameters for various algorithms and components, including learning rates, batch sizes, and gamma, providing specific values for the experimental setup.
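
The Research Type row above describes CrowdPlay as a crowdsourcing pipeline for any standard RL environment, including OpenAI Gym. As a rough illustration of that idea only (not CrowdPlay's actual implementation, which collects actions from participants over the web), the sketch below wraps a Gym environment so that every transition driven by a player's actions is logged as demonstration data. It assumes the classic Gym API (4-tuple step), and CartPole stands in for an Atari game purely to keep the example self-contained.

```python
# Illustrative sketch only; not CrowdPlay's implementation. Assumes gym < 0.26
# (reset() returns obs, step() returns a 4-tuple).
import gym


class DemonstrationRecorder(gym.Wrapper):
    """Logs (observation, action, reward, done) for each step of an episode."""

    def __init__(self, env):
        super().__init__(env)
        self.trajectory = []
        self._last_obs = None

    def reset(self, **kwargs):
        self.trajectory = []
        self._last_obs = self.env.reset(**kwargs)
        return self._last_obs

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.trajectory.append((self._last_obs, action, reward, done))
        self._last_obs = obs
        return obs, reward, done, info


# CartPole is a stand-in; an Atari environment would require the ROM dependencies.
env = DemonstrationRecorder(gym.make("CartPole-v1"))
obs = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # placeholder for a human-chosen action
    obs, reward, done, info = env.step(action)
print(f"Recorded {len(env.trajectory)} transitions")
```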
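
The Software Dependencies row names scikit-learn 1.0.1 for the t-SNE embeddings. Below is a minimal sketch of that step, assuming scikit-learn's standard manifold API; the feature matrix is random placeholder data, not the paper's behavioral features.

```python
# Minimal t-SNE sketch with scikit-learn (the paper pins version 1.0.1).
# The input features are random placeholders standing in for per-trajectory
# behavioral features; only the API usage is the point here.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
features = rng.random((500, 32))  # placeholder feature matrix
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(features)
print(embedding.shape)  # (500, 2)
```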
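
The same row also states that Ray/RLlib 1.4.0 was used with the default A2C hyperparameters, and the Experiment Setup row points to Table 7 for the remaining algorithms' values. A minimal sketch of the A2C side follows, assuming the RLlib 1.x trainer API; the environment name and iteration count are illustrative choices, and none of the paper's Table 7 values are reproduced here.

```python
# Sketch of training A2C with RLlib's default hyperparameters, as the paper
# states it did (Ray/RLlib 1.4.0, agent-style API). Environment name and the
# number of training iterations are illustrative, not the paper's settings.
import ray
from ray.rllib.agents.a3c import A2CTrainer

ray.init()
trainer = A2CTrainer(env="PongNoFrameskip-v4")  # no config override: library defaults
for i in range(3):
    result = trainer.train()
    print(i, result["episode_reward_mean"])
ray.shutdown()
```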