Efficient Differentiable Simulation of Articulated Bodies
Authors: Yi-Ling Qiao, Junbang Liang, Vladlen Koltun, Ming C. Lin
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the utility of efficient differentiable dynamics for articulated bodies in a variety of applications. We show that reinforcement learning with articulated systems can be accelerated using gradients provided by our method. In applications to control and inverse problems, gradient-based optimization enabled by our work accelerates convergence by more than an order of magnitude. For experiments, we will first scale the complexity of simulated scenes and compare the performance of our method with autodiff tools. Then we will integrate differentiable physics into reinforcement learning and use our method to learn control policies. Lastly, we will apply differentiable articulated dynamics to solve motion control and parameter estimation problems. We compare our method with state-of-the-art autodiff tools, including CppAD (Bell et al., 2018), Ceres (Agarwal et al., 2010), PyTorch (Paszke et al., 2019), autodiff (Leal et al., 2018) (referred to as ADF to avoid ambiguity), and JAX (Bradbury et al., 2018). All our experiments are performed on a desktop with an Intel(R) Xeon(R) W-2123 CPU @ 3.60GHz with 32GB of memory. We trained each model for 100n epochs, where n is the number of links. The results are shown in Figure 4. In Figure 4(b), we report the relative reward of each task... Figure 5(b) shows the reward over time. (A minimal sketch of the kind of gradient-based optimization loop used in the control and inverse problems appears after the table.) |
| Researcher Affiliation | Collaboration | ¹University of Maryland, College Park; ²Intel Labs. |
| Pseudocode | No | The paper does not contain explicitly labeled pseudocode or algorithm blocks. It describes the workflow and provides mathematical derivations, but not structured pseudocode. |
| Open Source Code | Yes | Code is available on our project page: https://github.com/YilingQiao/diffarticulated |
| Open Datasets | Yes | We test our policy enhancement method in a simple scenario, where an n-link pendulum needs to reach a target point... Next, we test our sample enhancement method on the MuJoCo Ant. In this scenario, a four-legged robot on the ground needs to learn to walk towards a fixed heading (Figure 5(a)). The scenario is the same as the standard task defined in MuJoCo, except that the simulator is replaced by ours. We also compare with other methods (SAC (Haarnoja et al., 2018), SQL (Haarnoja et al., 2017), and PPO (Schulman et al., 2017) implemented in Ray RLlib (Liang et al., 2018)) for reference. |
| Dataset Splits | No | No specific information about training, validation, or test dataset splits (e.g., percentages, sample counts, or explicit mention of a validation set) was found in the paper for the experiments conducted. The paper mentions training for a number of epochs but not how the data was split. |
| Hardware Specification | Yes | All our experiments are performed on a desktop with an Intel(R) Xeon(R) W-2123 CPU @ 3.60GHz with 32GB of memory. |
| Software Dependencies | No | The paper mentions software like 'CppAD', 'Ceres', 'PyTorch', 'autodiff' (ADF), 'JAX', and 'Ray RLlib' for comparison or implementation, but it does not specify exact version numbers for these software dependencies, which are necessary for reproducibility. |
| Experiment Setup | Yes | All networks use the PyTorch default initialization scheme. We trained each model for 100n epochs, where n is the number of links. For every true sample we get from the simulator, we generate 9 extra samples around it using our sample enhancement method (see the sample-enhancement sketch after the table). |
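
The sample enhancement quoted in the Experiment Setup row relies on first-order information from the differentiable simulator: each true transition is surrounded by synthetic transitions predicted from the simulator's Jacobians. The paper's exact formula is not quoted on this page, so the following is only an illustrative sketch; `sim_step` is a hypothetical differentiable step function standing in for the authors' simulator, and the perturbation scale `eps` and count `n_extra=9` mirror the quoted setup but are otherwise assumptions.

```python
import torch

def enhance_samples(sim_step, state, action, n_extra=9, eps=1e-2):
    """Generate extra transitions around one true simulator sample via a
    first-order Taylor expansion, using Jacobians of a differentiable
    simulator step. `sim_step(state, action) -> next_state` is hypothetical."""
    next_state = sim_step(state, action).detach()

    # Jacobians of the next state w.r.t. the current state and the action.
    J_s = torch.autograd.functional.jacobian(lambda s: sim_step(s, action), state)
    J_a = torch.autograd.functional.jacobian(lambda a: sim_step(state, a), action)

    extra = []
    for _ in range(n_extra):
        ds = eps * torch.randn_like(state)      # small state perturbation
        da = eps * torch.randn_like(action)     # small action perturbation
        ns = next_state + J_s @ ds + J_a @ da   # first-order estimate of the next state
        extra.append((state + ds, action + da, ns))
    return extra
```

Each true rollout sample would then contribute ten transitions (one real, nine synthetic) to the replay buffer, which is how the quoted "9 extra samples" figure is used in the RL experiments.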
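
The control and inverse (parameter estimation) experiments referenced in the Research Type row optimize objectives directly through simulator rollouts. Below is a minimal, hypothetical sketch of such a gradient-based motion-control loop in PyTorch; `sim_step`, the quadratic target loss, and all hyperparameters are illustrative assumptions rather than the paper's exact setup.

```python
import torch

def optimize_controls(sim_step, init_state, target, action_dim,
                      horizon=50, iters=200, lr=1e-2):
    """Optimize a torque sequence so the final state reaches a target,
    backpropagating the loss through a differentiable rollout.
    `sim_step(state, action) -> next_state` is a hypothetical stand-in."""
    controls = torch.zeros(horizon, action_dim, requires_grad=True)
    opt = torch.optim.Adam([controls], lr=lr)

    for _ in range(iters):
        state = init_state
        for t in range(horizon):
            state = sim_step(state, controls[t])  # differentiable rollout step
        loss = torch.sum((state - target) ** 2)   # distance to the target state
        opt.zero_grad()
        loss.backward()   # gradients flow through the entire trajectory
        opt.step()
    return controls.detach()
```

The same pattern applies to parameter estimation by making physical parameters (e.g., link masses) the optimized variables instead of the control sequence; this is the style of gradient-based optimization the paper reports as converging more than an order of magnitude faster than derivative-free baselines.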