Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills

Authors: Victor Campos, Alexander Trott, Caiming Xiong, Richard Socher, Xavier Giro-I-Nieto, Jordi Torres

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform an extensive evaluation of skill discovery methods on controlled environments and show that EDL offers significant advantages, such as overcoming the coverage problem, reducing the dependence of learned skills on the initial state, and allowing the user to define a prior over which behaviors should be learned.
Researcher Affiliation | Collaboration | (1) Barcelona Supercomputing Center; (2) Salesforce Research; (3) Universitat Politècnica de Catalunya. Correspondence to: Víctor Campos <victor.campos@bsc.es>.
Pseudocode | No | The paper describes algorithms and methods conceptually and mathematically but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code is publicly available at https://github.com/victorcampos7/edl.
Open Datasets | No | The paper uses 'controlled synthetic environments' which are custom-built 2D mazes. It states: 'In the controlled environments considered in this work, this can be achieved by sampling states from an oracle.' No specific access information (link, DOI, citation to a public dataset) for these environments or data generated from them is provided.
Dataset Splits | No | The paper mentions 'controlled synthetic environments' and discusses experimental results, but it does not specify exact train/validation/test splits by percentage or sample count. It refers to 'a detailed description of the experimental setup and the hyperparameters' in the Supplementary Material, but this information is not in the main text.
Hardware Specification | No | The paper mentions 'hardware accelerators (NVIDIA, 2017; Jouppi et al., 2017)' in the introduction in a general context of RL advancements, but does not provide any specific hardware details (e.g., GPU models, CPU types, memory) used for their own experiments.
Software Dependencies | No | The paper mentions using specific models like VQ-VAE and Sibling Rivalry, but it does not provide specific version numbers for any software, libraries, or dependencies used in their experiments (e.g., 'PyTorch 1.9' or 'Python 3.8').
Experiment Setup | No | The paper states: 'We refer the reader to the SM for a detailed description of the experimental setup and the hyperparameters.' This indicates that specific experimental setup details, including concrete hyperparameter values or training configurations, are deferred to the supplementary material and are not present in the main text.