Grounding Language to Autonomously-Acquired Skills via Goal Generation
Authors: Ahmed Akakzia, Cédric Colas, Pierre-Yves Oudeyer, Mohamed Chetouani, Olivier Sigaud
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental section investigates three questions: (1) How does DECSTR perform in the three phases? (2) How does it compare to end-to-end LC-RL approaches? (3) Do the intermediate representations need to be semantic? |
| Researcher Affiliation | Academia | Ahmed Akakzia, Sorbonne Université (ahmed.akakzia@isir.upmc.fr); Cédric Colas, Inria (cedric.colas@inria.fr); Pierre-Yves Oudeyer, Inria; Mohamed Chetouani, Sorbonne Université; Olivier Sigaud, Sorbonne Université |
| Pseudocode | Yes | Algorithms 1 and 2 present the high-level pseudo-code of any algorithm following the LGB architecture for each of the three phases. |
| Open Source Code | Yes | Code and videos can be found at https://sites.google.com/view/decstr/. |
| Open Datasets | No | A training dataset is collected via interactions between a DECSTR agent trained in phase G→B and a social partner. DECSTR generates semantic goals and pursues them. For each trajectory, the social partner provides a description d of one change in object relations from the initial configuration ci to the final one cf. The set of possible descriptions contains 102 sentences, each describing, in a simplified language, a positive or negative shift for one of the 9 predicates (e.g. "get red above green"). This leads to a dataset D of 5000 triplets: (ci, d, cf). |
| Dataset Splits | No | The paper describes a 'training dataset D' and an 'oracle dataset O' used for evaluation of the LGG, but does not specify explicit train/validation/test splits or percentages for the overall experiments or for the main model training. |
| Hardware Specification | No | This work was performed using HPC resources from GENCI-IDRIS (Grant 20XX-AP010611667), the MeSU platform at Sorbonne Université and the PlaFRIM experimental testbed. ... Each run leverages 24 cpus (24 actors) for about 72h, for a total of 9.8 cpu years. Experiments presented in this paper require machines with at least 24 cpu cores. |
| Software Dependencies | No | The paper mentions algorithms such as SAC and HER and the Adam optimizer, but does not provide specific version numbers for any software components. |
| Experiment Setup | Yes | Implementation details and hyperparameters can be found in Appendix C. ... Table 4: Sensorimotor learning hyperparameters used in DECSTR. |
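The Open Datasets row describes a dataset D of 5000 triplets (ci, d, cf), where each triplet pairs an initial semantic configuration, a simplified-language description of one predicate change, and the final configuration. A minimal sketch of that structure, assuming a 9-bit binary encoding of the predicates (the class name, field names, and helper method are illustrative, not from the paper):

```python
# Hypothetical sketch of one (c_i, d, c_f) triplet from the dataset D
# described above. The 9 binary predicates are encoded as a tuple of 0/1;
# the description d covers a single predicate shift. All names here are
# assumptions for illustration, not the authors' implementation.
from dataclasses import dataclass
from typing import List, Tuple

N_PREDICATES = 9  # paper: 9 semantic predicates over object relations


@dataclass(frozen=True)
class Triplet:
    c_i: Tuple[int, ...]  # initial semantic configuration
    d: str                # simplified-language description of one change
    c_f: Tuple[int, ...]  # final semantic configuration

    def changed_predicates(self) -> List[int]:
        """Indices of predicates that flipped between c_i and c_f."""
        return [k for k, (a, b) in enumerate(zip(self.c_i, self.c_f)) if a != b]


# Example: 'get red above green' describes a positive shift of one predicate.
t = Triplet(c_i=(0,) * 9, d="get red above green", c_f=(1,) + (0,) * 8)
```

A dataset of 5000 such triplets would then simply be a list of `Triplet` instances, one per collected trajectory.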