Learning to Guide and to be Guided in the Architect-Builder Problem

Authors: Paul Barde, Tristan Karch, Derek Nowrouzezahrai, Clément Moulin-Frier, Christopher Pal, Pierre-Yves Oudeyer

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We analyze the key learning mechanisms of ABIG and test it in a 2-dimensional instantiation of the ABP where tasks involve grasping cubes, placing them at a given location, or building various shapes.
Researcher Affiliation | Collaboration | Paul Barde: Québec AI institute (Mila), McGill University... Tristan Karch: Inria Flowers team, Université de Bordeaux... Christopher Pal: Québec AI institute (Mila), Polytechnique Montréal, ServiceNow Element AI... Pierre-Yves Oudeyer: Inria Flowers team, Univ. Bordeaux, Microsoft Research Montreal
Pseudocode | Yes | The algorithm is illustrated in Figure 3 and the pseudo-code is reported in Algorithm 1 in Suppl. Section A.3.
Open Source Code | Yes | We ensure the reproducibility of the experiments presented in this work by providing our code (footnote 3: https://github.com/flowersteam/architect-builder-abig.git).
Open Datasets | No | The paper describes 'Build World' as a custom 2D construction gridworld environment where experiments are conducted and data is generated through agent interaction. It does not provide a link, DOI, or specific citation for this 'dataset' as a pre-collected, publicly available resource.
Dataset Splits | Yes | The data-set is split into training (70%) and validation (30%) sets.
Hardware Specification | Yes | A complete ABIG training can take up to 48 hours on a single modern CPU (Intel E5-2683 v4 Broadwell @ 2.1GHz).
Software Dependencies | No | The paper mentions 'ReLu networks' and 'Adam optimizer' but does not provide specific version numbers for software dependencies or programming languages used in the implementation.
Experiment Setup | Yes | All models are parametrized by two-hidden layer 126-units feedforward ReLu networks. BC minimizes the cross-entropy loss with Adam optimizer (Kingma & Ba, 2015). Tables 1-5 provide detailed hyper-parameters for toy experiments, MCTS, Build World, and BC for both architect and builder.
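
The reported 70%/30% training/validation split and the cross-entropy objective used for BC can be illustrated with a minimal sketch. This is not taken from the authors' repository: the function names, the toy dataset, and the fixed seed are hypothetical, and the 126-unit networks and the Adam optimizer are deliberately left out so that only the Python standard library is needed.

```python
import math
import random

TRAIN_FRACTION = 0.7  # 70% training / 30% validation, as reported in the table


def split_dataset(samples, train_fraction=TRAIN_FRACTION, seed=0):
    """Shuffle a copy of `samples` and return (train, validation) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for repeatability (assumption)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def cross_entropy(logits, target):
    """Cross-entropy of one categorical prediction, computed from raw logits."""
    peak = max(logits)  # subtract the max before exponentiating, for stability
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_norm - logits[target]


# Hypothetical toy data: (observation, action index) pairs with 4 actions.
dataset = [([float(i)], i % 4) for i in range(100)]
train_set, val_set = split_dataset(dataset)

# A predictor that assigns equal logits to all 4 actions has loss log(4).
uniform_loss = cross_entropy([0.0, 0.0, 0.0, 0.0], 2)
```

In a full BC setup, the mean of `cross_entropy` over the training pairs would be the quantity minimized, with the validation pairs held out to monitor overfitting. How the authors use their validation set is not stated in the summary above.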