Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Grounding Language to Autonomously-Acquired Skills via Goal Generation
Authors: Ahmed Akakzia, Cédric Colas, Pierre-Yves Oudeyer, Mohamed Chetouani, Olivier Sigaud
ICLR 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental section investigates three questions: how does DECSTR perform in the three phases? How does it compare to end-to-end LC-RL approaches? Do we need intermediate representations to be semantic? |
| Researcher Affiliation | Academia | Ahmed Akakzia Sorbonne Université EMAIL Cédric Colas Inria EMAIL Pierre-Yves Oudeyer Inria Mohamed Chetouani Sorbonne Université Olivier Sigaud Sorbonne Université |
| Pseudocode | Yes | Algorithm 1 and 2 present the high-level pseudo-code of any algorithm following the LGB architecture for each of the three phases. |
| Open Source Code | Yes | Code and videos can be found at https://sites.google.com/view/decstr/. |
| Open Datasets | No | A training dataset is collected via interactions between a DECSTR agent trained in phase G→B and a social partner. DECSTR generates semantic goals and pursues them. For each trajectory, the social partner provides a description d of one change in objects relations from the initial configuration ci to the final one cf. The set of possible descriptions contains 102 sentences, each describing, in a simplified language, a positive or negative shift for one of the 9 predicates (e.g. get red above green). This leads to a dataset D of 5000 triplets: (ci, d, cf). |
| Dataset Splits | No | The paper describes a 'training dataset D' and an 'oracle dataset O' used for evaluation of the LGG, but does not specify explicit train/validation/test splits or percentages for the overall experiments or for the main model training. |
| Hardware Specification | No | This work was performed using HPC resources from GENCI-IDRIS (Grant 20XX-AP010611667), the MeSU platform at Sorbonne-Université and the PlaFRIM experimental testbed. ... Each run leverages 24 cpus (24 actors) for about 72h for a total of 9.8 cpu years. Experiments presented in this paper requires machines with at least 24 cpu cores. |
| Software Dependencies | No | The paper mentions software like SAC, HER, and Adam optimizers but does not provide specific version numbers for any software components. |
| Experiment Setup | Yes | Implementation details and hyperparameters can be found in Appendix C. ... Table 4: Sensorimotor learning hyperparameters used in DECSTR. |