Modular Robot Design Synthesis with Deep Reinforcement Learning

Authors: Julian Whitman, Raunaq Bhirangi, Matthew Travers, Howie Choset

AAAI 2020, pp. 10418-10425 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show that our algorithm is more computationally efficient in determining robot designs for given tasks in comparison to the current state-of-the-art. ... Section 4 presents our results and benchmarks them against existing approaches.
Researcher Affiliation | Academia | 1 Department of Mechanical Engineering, Carnegie Mellon University; 2 The Robotics Institute, Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh, Pennsylvania 15213; jwhitman@cmu.edu
Pseudocode | Yes | Algorithm 1: Manipulator arrangement search, a best-first search guided by the output of a DQN (an illustrative sketch of this search pattern follows the table).
Open Source Code | Yes | We have included the code to train the network, pretrained network weights, and the code and results for our experiments in the supplementary material.
Open Datasets | No | The paper describes generating random targets and obstacles for training and testing, and uses modular components from Hebi Robotics. However, it does not provide access information (link, DOI, or formal citation) for a specific, publicly available dataset used or created for this research.
Dataset Splits | No | No explicit train/test/validation split percentages or sample counts are provided. The paper mentions "test points" but does not define the split methodology for the overall data generation process.
Hardware Specification | Yes | We trained the DQN and conducted all tests on a desktop computer with Ubuntu 16.04, Intel i5 four-core processor at 3.5 GHz, and an NVIDIA GTX 1050 graphics card.
Software Dependencies | No | The paper mentions Ubuntu 16.04 as the operating system but does not specify versions for any other software components, such as programming languages (e.g., Python), libraries (e.g., PyTorch, TensorFlow), or solvers.
Experiment Setup | Yes | During training and all tests, we set the objective weights w_J = 0.025, w_M = 0.1. ... We trained the DQN for 450,000 episodes (about 33 hours) before using it within our algorithm. ... We trained this DQN for 700,000 episodes (about 57 hours).
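
The paper's Algorithm 1 is a best-first search over manipulator module arrangements, ordered by the Q-values a trained DQN predicts for each candidate next module. The Python sketch below illustrates only that general search pattern under stated assumptions: the module vocabulary, the q_values and evaluate stubs, the completion test, and the expansion budget are placeholders, not the authors' implementation.

import heapq
import itertools
import random

# Hypothetical module vocabulary; the paper composes arms from Hebi-style
# actuator, link, and bracket modules, but these names are placeholders.
MODULES = ["actuator", "link_short", "link_long", "bracket"]
MAX_MODULES = 6  # assumed cap on arrangement length


def q_values(arrangement, task):
    """Stand-in for the trained DQN: returns a score per candidate next module.

    In the paper this is a deep Q-network conditioned on the task (targets and
    obstacles); here we return random scores so the sketch runs end to end.
    """
    return {m: random.random() for m in MODULES}


def is_complete(arrangement):
    """Placeholder terminal test for a finished arrangement."""
    return len(arrangement) >= MAX_MODULES


def evaluate(arrangement, task):
    """Placeholder for the expensive task-performance check on a completed design."""
    return random.random() > 0.5


def best_first_design_search(task, max_expansions=50):
    """Best-first search over module arrangements, ordered by DQN scores."""
    counter = itertools.count()            # tie-breaker so heapq never compares lists
    frontier = [(0.0, next(counter), [])]  # (negated score, tie-breaker, partial arrangement)
    while frontier and max_expansions > 0:
        _, _, partial = heapq.heappop(frontier)
        max_expansions -= 1
        if is_complete(partial):
            if evaluate(partial, task):    # only completed designs get the full evaluation
                return partial
            continue
        # Expand the most promising partial design by one module per candidate.
        for module, score in q_values(partial, task).items():
            heapq.heappush(frontier, (-score, next(counter), partial + [module]))
    return None


if __name__ == "__main__":
    task = {"targets": [(0.4, 0.2, 0.3)], "obstacles": []}
    print(best_first_design_search(task))

In the paper, the DQN score takes the place of the random stub and completed designs are checked against the sampled targets and obstacles rather than by a coin flip; the sketch only shows how the priority queue and DQN guidance fit together.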