Efficient Differentiable Simulation of Articulated Bodies
Authors: Yi-Ling Qiao, Junbang Liang, Vladlen Koltun, Ming C. Lin
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the utility of efficient differentiable dynamics for articulated bodies in a variety of applications. We show that reinforcement learning with articulated systems can be accelerated using gradients provided by our method. In applications to control and inverse problems, gradient-based optimization enabled by our work accelerates convergence by more than an order of magnitude. For experiments, we will first scale the complexity of simulated scenes and compare the performance of our method with autodiff tools. Then we will integrate differentiable physics into reinforcement learning and use our method to learn control policies. Lastly, we will apply differentiable articulated dynamics to solve motion control and parameter estimation problems. We compare our method with state-of-the-art autodiff tools, including CppAD (Bell et al., 2018), Ceres (Agarwal et al., 2010), PyTorch (Paszke et al., 2019), autodiff (Leal et al., 2018) (referred to as ADF to avoid ambiguity), and JAX (Bradbury et al., 2018). All our experiments are performed on a desktop with an Intel(R) Xeon(R) W-2123 CPU @ 3.60GHz with 32GB of memory. We trained each model for 100n epochs, where n is the number of links. The results are shown in Figure 4. In Figure 4(b), we report the relative reward of each task... Figure 5(b) shows the reward over time. (A minimal sketch of the kind of gradient-based optimization loop used in the control and inverse problems appears after the table.) |
| Researcher Affiliation | Collaboration | ¹University of Maryland, College Park; ²Intel Labs. |
| Pseudocode | No | The paper does not contain explicitly labeled pseudocode or algorithm blocks. It describes the workflow and provides mathematical derivations, but not structured pseudocode. |
| Open Source Code | Yes | Code is available on our project page: https://github.com/YilingQiao/diffarticulated |
| Open Datasets | Yes | We test our policy enhancement method in a simple scenario, where an n-link pendulum needs to reach a target point... Next, we test our sample enhancement method on the MuJoCo Ant. In this scenario, a four-legged robot on the ground needs to learn to walk towards a fixed heading (Figure 5(a)). The scenario is the same as the standard task defined in MuJoCo, except that the simulator is replaced by ours. We also compare with other methods (SAC (Haarnoja et al., 2018), SQL (Haarnoja et al., 2017), and PPO (Schulman et al., 2017) implemented in Ray RLlib (Liang et al., 2018)) for reference. |
| Dataset Splits | No | No specific information about training, validation, or test dataset splits (e.g., percentages, sample counts, or explicit mention of a validation set) was found in the paper for the experiments conducted. The paper mentions training for a number of epochs but not how the data was split. |
| Hardware Specification | Yes | All our experiments are performed on a desktop with an Intel(R) Xeon(R) W-2123 CPU @ 3.60GHz with 32GB of memory. |
| Software Dependencies | No | The paper mentions software like 'CppAD', 'Ceres', 'PyTorch', 'autodiff' (ADF), 'JAX', and 'Ray RLlib' for comparison or implementation, but it does not specify exact version numbers for these software dependencies, which are necessary for reproducibility. |
| Experiment Setup | Yes | All networks use the PyTorch default initialization scheme. We trained each model for 100n epochs, where n is the number of links. For every true sample we get from the simulator, we generate 9 extra samples around it using our sample enhancement method (see the sample-enhancement sketch after the table). |
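
The sample enhancement quoted in the Experiment Setup row relies on first-order information from the differentiable simulator: each true transition is surrounded by synthetic transitions predicted from the simulator's Jacobians. The paper's exact formula is not quoted on this page, so the following is only an illustrative sketch; `sim_step` is a hypothetical differentiable step function standing in for the authors' simulator, and the perturbation scale `eps` and count `n_extra=9` mirror the quoted setup but are otherwise assumptions.

```python
import torch

def enhance_samples(sim_step, state, action, n_extra=9, eps=1e-2):
    """Generate extra transitions around one true simulator sample via a
    first-order Taylor expansion, using Jacobians of a differentiable
    simulator step. `sim_step(state, action) -> next_state` is hypothetical."""
    next_state = sim_step(state, action).detach()

    # Jacobians of the next state w.r.t. the current state and the action.
    J_s = torch.autograd.functional.jacobian(lambda s: sim_step(s, action), state)
    J_a = torch.autograd.functional.jacobian(lambda a: sim_step(state, a), action)

    extra = []
    for _ in range(n_extra):
        ds = eps * torch.randn_like(state)      # small state perturbation
        da = eps * torch.randn_like(action)     # small action perturbation
        ns = next_state + J_s @ ds + J_a @ da   # first-order estimate of the next state
        extra.append((state + ds, action + da, ns))
    return extra
```

Each true rollout sample would then contribute ten transitions (one real, nine synthetic) to the replay buffer, which is how the quoted "9 extra samples" figure is used in the RL experiments.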
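
The control and inverse (parameter estimation) experiments referenced in the Research Type row optimize objectives directly through simulator rollouts. Below is a minimal, hypothetical sketch of such a gradient-based motion-control loop in PyTorch; `sim_step`, the quadratic target loss, and all hyperparameters are illustrative assumptions rather than the paper's exact setup.

```python
import torch

def optimize_controls(sim_step, init_state, target, action_dim,
                      horizon=50, iters=200, lr=1e-2):
    """Optimize a torque sequence so the final state reaches a target,
    backpropagating the loss through a differentiable rollout.
    `sim_step(state, action) -> next_state` is a hypothetical stand-in."""
    controls = torch.zeros(horizon, action_dim, requires_grad=True)
    opt = torch.optim.Adam([controls], lr=lr)

    for _ in range(iters):
        state = init_state
        for t in range(horizon):
            state = sim_step(state, controls[t])  # differentiable rollout step
        loss = torch.sum((state - target) ** 2)   # distance to the target state
        opt.zero_grad()
        loss.backward()   # gradients flow through the entire trajectory
        opt.step()
    return controls.detach()
```

The same pattern applies to parameter estimation by making physical parameters (e.g., link masses) the optimized variables instead of the control sequence; this is the style of gradient-based optimization the paper reports as converging more than an order of magnitude faster than derivative-free baselines.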