Task-Agnostic Morphology Evolution

Authors: Donald Joseph Hejna III, Pieter Abbeel, Lerrel Pinto

ICLR 2021

Reproducibility assessment: each variable below is listed with its result, followed by the supporting LLM response.

Research Type: Experimental
"Finally, we empirically demonstrate that across 2D, 3D, and manipulation environments TAME can evolve morphologies that match the multi-task performance of those learned with task supervised algorithms."

Researcher Affiliation: Academia
Donald J. Hejna III (UC Berkeley, jhejna@berkeley.edu); Pieter Abbeel (UC Berkeley, pabbeel@berkeley.edu); Lerrel Pinto (New York University, lerrel@cs.nyu.edu)

Pseudocode: Yes
Algorithm 1: TAME
    Initialize q_φ(a_j | s_T, m) and population P
    for i = 1, 2, ..., N do
        for j = 1, 2, ..., L do
            m ← mutation from P
            for k = 1, 2, ..., E do
                sample joint actions a
                s_T ← EndState(a, m)
                P ← P ∪ {(m, s_T)}
        φ ← train(q_φ, P)
        for (m, f) in P do
            f ← update via Equation 1
    return argmax_{(m, f) ∈ P} f

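In code, the nested loop above has the following shape. This is a minimal structural sketch, not the authors' implementation: every helper (mutate, sample_actions, end_state, train_model, score) is a hypothetical stand-in to be supplied by the caller, and score abstracts the information-theoretic fitness update of the paper's Equation 1.

    # Python sketch of Algorithm 1 (TAME); all helper callables are hypothetical.
    def tame_sketch(init_morphologies, n_generations, n_mutations, n_episodes,
                    mutate, sample_actions, end_state, train_model, score):
        population = [[m, 0.0] for m in init_morphologies]  # (morphology, fitness)
        data, model = [], None                              # rollouts for q_phi
        for _ in range(n_generations):                      # i = 1, 2, ..., N
            for _ in range(n_mutations):                    # j = 1, 2, ..., L
                m = mutate(population)                      # m <- mutation from P
                for _ in range(n_episodes):                 # k = 1, 2, ..., E
                    a = sample_actions(m)                   # sample joint actions a
                    s_T = end_state(a, m)                   # s_T <- EndState(a, m)
                    data.append((m, a, s_T))                # grow the dataset
                population.append([m, 0.0])
            model = train_model(model, data)                # phi <- train(q_phi, P)
            for entry in population:                        # f <- update via Eq. 1
                entry[1] = score(model, entry[0], data)
        return max(population, key=lambda e: e[1])[0]       # argmax over fitness

What the sketch preserves is that morphologies are scored through the learned model q_φ rather than by task reward, which is what makes the evolution task-agnostic.
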
Open Source Code: Yes
"Our code and videos can be found at https://sites.google.com/view/task-agnostic-evolution."

Open Datasets: No
"Note that since there are no standard multi-task morphology optimization benchmarks, we created our own morphology representation and set of multi-task environments that will be publicly released." The paper does not use a pre-existing public dataset for training; it instead generates data within custom environments that are stated to be publicly released with the code.

Dataset Splits: No
The paper does not provide specific training/validation/test dataset splits; data is generated dynamically through simulation for policy training and evaluation rather than drawn from a static, pre-split dataset.

Hardware Specification: No
The paper states "We thank AWS for computing resources." and mentions "TAME running on two CPU cores" and an "NGE-Like algorithm running on eight CPU cores" in Table 2, but it does not name specific hardware models such as CPU or GPU types.

Software Dependencies: No
The paper mentions the "Pytorch Geometric software package" and "Stable-Baselines 3" but does not specify version numbers for either.

Experiment Setup: Yes
"Further experiment details can be found in Appendix G." Table 6 lists PPO hyperparameters (e.g., discount 0.99, batch size 128, learning rate 0.0003) and Table 7 lists evolution hyperparameters (e.g., generations 60, population size 24, learning rate 0.001).

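To make the quoted settings concrete, here is a minimal sketch of how they could be passed to Stable-Baselines 3's PPO. This is an illustrative assumption, not the authors' actual configuration: the environment is a placeholder (the paper uses its own custom multi-task environments), the timestep budget is invented, and every setting not quoted from Table 6 is left at the library default.

    # Illustrative only: plugs the Table 6 PPO hyperparameters into Stable-Baselines 3.
    # The environment and timestep budget are placeholders, not the paper's setup.
    import gymnasium as gym
    from stable_baselines3 import PPO

    env = gym.make("Pendulum-v1")      # stand-in for the paper's custom environments
    model = PPO(
        "MlpPolicy",
        env,
        gamma=0.99,                    # "Discount 0.99" (Table 6)
        batch_size=128,                # "batch size 128" (Table 6)
        learning_rate=0.0003,          # "learning rate 0.0003" (Table 6)
        verbose=1,
    )
    model.learn(total_timesteps=100_000)  # placeholder budget

Any hyperparameter omitted here (n_steps, clip range, GAE lambda, and so on) falls back to the Stable-Baselines 3 default, which may not match the full list in the paper's Appendix G.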