Understanding Domain Randomization for Sim-to-real Transfer
Authors: Xiaoyu Chen, Jiachen Hu, Chi Jin, Lihong Li, Liwei Wang
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Despite its empirical successes, theoretical understanding on why this simple algorithm works is limited. In this paper, we propose a theoretical framework for sim-to-real transfers, in which the simulator is modeled as a set of MDPs with tunable parameters (corresponding to unknown physical parameters such as friction). |
| Researcher Affiliation | Collaboration | Xiaoyu Chen & Jiachen Hu: Key Laboratory of Machine Perception, MOE, School of Artificial Intelligence, Peking University ({cxy30, Nick H}@pku.edu.cn). Chi Jin: Department of Electrical and Computer Engineering, Princeton University (chij@princeton.edu). Lihong Li: Amazon (llh@amazon.com). Liwei Wang: Key Laboratory of Machine Perception, MOE, School of Artificial Intelligence, Peking University, and International Center for Machine Learning Research, Peking University (wanglw@cis.pku.edu.cn). |
| Pseudocode | Yes | As a byproduct of our proof, we provide the first provably efficient model-based algorithm for learning infinite-horizon average-reward MDPs with general function approximation (Algorithm 4 in Appendix C.3). |
| Open Source Code | No | The paper does not include any explicit statements about providing open-source code or links to a code repository. |
| Open Datasets | No | The paper is theoretical and does not report empirical experiments with specific datasets. It discusses 'training phase' conceptually within its theoretical framework but does not refer to public datasets used for training. |
| Dataset Splits | No | The paper is theoretical and does not report empirical experiments; therefore, it does not provide details on training/test/validation dataset splits. |
| Hardware Specification | No | The paper is theoretical and does not discuss the hardware used for any experiments. |
| Software Dependencies | No | The paper is theoretical and focuses on mathematical proofs and algorithm design. It does not mention any specific software dependencies or versions required to replicate an empirical study. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with concrete hyperparameter values or training configurations. |
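
The Research Type row describes the simulator as a set of MDPs with a tunable parameter such as friction. The sketch below is one way that setup might look in code, not the paper's algorithm: `make_chain_mdp`, `rollout`, and `domain_randomized_value` are hypothetical names, the chain environment is invented for illustration, and the policy is memoryless with a finite horizon, which is simpler than the infinite-horizon average-reward setting the paper analyzes.

```python
import random


def make_chain_mdp(friction, n_states=5):
    """Hypothetical chain MDP: action 1 moves right with probability 1 - friction."""
    def step(state, action, rng):
        if action == 1 and rng.random() < 1.0 - friction:
            state = min(state + 1, n_states - 1)
        elif action == 0:
            state = max(state - 1, 0)
        # Reward 1 only while the agent sits in the right-most state.
        reward = 1.0 if state == n_states - 1 else 0.0
        return state, reward
    return step


def rollout(step, policy, horizon, rng):
    """Average per-step reward of `policy` over one episode starting at state 0."""
    state, total = 0, 0.0
    for _ in range(horizon):
        state, reward = step(state, policy(state), rng)
        total += reward
    return total / horizon


def domain_randomized_value(policy, frictions, episodes, horizon, seed=0):
    """Estimate the value of `policy` under a uniform mixture over `frictions`."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(episodes):
        # Domain randomization: draw a fresh simulator parameter every episode.
        friction = rng.choice(frictions)
        total += rollout(make_chain_mdp(friction), policy, horizon, rng)
    return total / episodes


def always_right(state):
    return 1


def always_left(state):
    return 0
```

Calling `domain_randomized_value(always_right, [0.1, 0.5, 0.9], episodes=200, horizon=50)` scores a single policy against the whole parameter family at once, which is the quantity a domain-randomization learner would try to maximize in this toy setting.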