reproducibilityindex.ai

Adaptive Barrier Smoothing for First-Order Policy Gradient with Contact Dynamics

Authors: Shenao Zhang, Wanxin Jin, Zhaoran Wang

ICML 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental results on various robotic tasks are provided to support our theory and method.
Researcher Affiliation	Academia	¹Northwestern University, Evanston, IL, USA; ²Arizona State University, Tempe, AZ, USA.
Pseudocode	Yes	We provide the pseudocode of Adaptive Barrier Smoothing in Algorithm 1. By adopting ABS to compute the policy gradient in (4.1) and following Algorithm 2 as the main training loop, we obtain the FOPG-ABS method.
Open Source Code	No	The paper does not contain any explicit statement about making its source code open, nor does it provide a link to a code repository.
Open Datasets	No	The paper mentions running experiments in the 'Dojo (Howell et al., 2022) physics engine' and discusses 'Dojo locomotion tasks'. It also refers to collecting an 'evaluation transition dataset' in Section 8.4. However, it does not provide concrete access information (e.g., a direct link, DOI, or specific citation to a public dataset with access details) for any dataset used for training or evaluation.
Dataset Splits	No	The paper does not specify exact training, validation, or test dataset splits (e.g., percentages or counts). It mentions using an 'evaluation transition dataset' but no specific split details are provided.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., CPU/GPU models, memory specifications, or cloud computing instances) used for running the experiments.
Software Dependencies	No	The paper mentions using the 'Dojo (Howell et al., 2022) physics engine' and implies the use of 'Python' in Appendix C.1. However, it does not provide specific version numbers for Dojo, Python, or any other critical software libraries or dependencies, which would be necessary for reproduction.
Experiment Setup	Yes	In our Dojo experiments in Section 8.3, we use a contact-aware central-path parameter for the proposed Adaptive Barrier Smoothing method. From the results in Figure 3, to balance the gradient variance and bias, we would like $\mu \to 0$ when all impact contacts are active or the distance-to-obstacle is large, and $\mu \to 10^{-2}$ when this distance approaches zero. To accomplish this, the adaptive $\mu(x_t, u_t)$ is designed as $\mu(x_t, u_t) = \frac{10^{-2}}{(100 d^2 + 1)^4} = \frac{10^{-2}}{\left(100 \min_{i \in \mathcal{I}} |\phi(x_t, u_t)^{(i)}|^2 + 1\right)^4}$. (C.2)
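
The adaptive central-path parameter in the Experiment Setup row can be sketched in a few lines. This is a minimal illustration, not the authors' code (the paper releases none). It assumes (C.2) reads µ = 10^-2 / (100 d^2 + 1)^4, where d is the smallest absolute value of ϕ(x_t, u_t) over the impact contact set I. The function name `adaptive_mu` and its keyword arguments are hypothetical, and the distances ϕ are taken as a plain array rather than queried from Dojo.

```python
import numpy as np


def adaptive_mu(phi, mu_max=1e-2, scale=100.0, power=4):
    """Sketch of the contact-aware parameter mu(x_t, u_t), assuming (C.2).

    phi: distances phi(x_t, u_t)^(i) for each contact i in the impact set I
         (hypothetical input; in practice these would come from the simulator).
    Returns mu_max / (scale * d^2 + 1)^power with d = min_i |phi_i|.
    """
    phi = np.asarray(phi, dtype=float)
    d_sq = np.min(np.abs(phi)) ** 2  # equals min_i |phi_i|^2
    return mu_max / (scale * d_sq + 1.0) ** power


# mu is largest (1e-2) when some contact distance is zero and decays
# rapidly toward zero as the nearest contact moves away.
for d in (0.0, 0.05, 0.1, 0.5, 1.0):
    print(d, adaptive_mu([d, d + 0.3]))
```

Under this reading, µ equals 10^-2 when the nearest contact distance is zero and falls off quickly as that distance grows, which matches the two limits described in the row above.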