reproducibilityindex.ai

Online Control with Adversarial Disturbance for Continuous-time Linear Systems

Authors: Jingwei Li, Jing Dong, Can Chang, Baoxiang Wang, Jingzhao Zhang

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section, we apply our theoretical analysis to the practical training of agents. First we highlight the key difference between our algorithm and traditional online policy optimization. ... We conduct experiments on the hopper, halfcheetah, and walker2d benchmarks using the MuJoCo simulator [40].
Researcher Affiliation	Academia	Jingwei Li, IIIS, Tsinghua University, Shanghai Qizhi Institute, ljw22@mails.tsinghua.edu.cn; Jing Dong, The Chinese University of Hong Kong, Shenzhen, jingdong@link.cuhk.edu.cn; Can Chang, IIIS, Tsinghua University, cc22@mails.tsinghua.edu.cn; Baoxiang Wang, The Chinese University of Hong Kong, Shenzhen, bxiangwang@cuhk.edu.cn; Jingzhao Zhang, IIIS, Tsinghua University, Shanghai Qizhi Institute, jingzhaoz@mail.tsinghua.edu.cn
Pseudocode	Yes	Algorithm 1 Continuous two-level online control algorithm
Open Source Code	No	Open access to data and code Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [No] Justification: Our code is very simple, just use the traditional SAC algorithm with one line implement. Our main contribution is the theoretical analysis.
Open Datasets	Yes	We conduct experiments on the hopper, halfcheetah, and walker2d benchmarks using the MuJoCo simulator [40]. ... Table 1: The DR distributions of environment.
Dataset Splits	No	The paper discusses training and testing but does not provide specific details on training/test/validation dataset splits (e.g., percentages, sample counts, or explicit cross-validation setup).
Hardware Specification	Yes	We conducted experiments using NVIDIA A40 graphics card.
Software Dependencies	No	The paper mentions the use of 'MuJoCo simulator [40]' and 'SAC (Soft Actor-Critic) algorithm [31]' but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup	Yes	Therefore, in the following experiments we fix the parameter h = 3, m = 3. We train our algorithm with this parameter and standard SAC on hopper and test the performance on more environments.
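
The Software Dependencies row above notes that no version numbers are reported. As a rough illustration of what such a record could look like, the sketch below collects installed package versions using only the Python standard library. The helper name `installed_versions` and the package list are hypothetical: the paper names only the MuJoCo simulator and the SAC algorithm, so the distributions queried here are assumptions, not the authors' actual environment.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    # Hypothetical package list; the paper does not state which packages it used.
    report = installed_versions(["mujoco", "gymnasium", "torch", "numpy"])
    report["python"] = platform.python_version()
    for name, ver in sorted(report.items()):
        print(f"{name}=={ver}" if ver else f"{name}: not installed")
```

Saving this output alongside the experiment results would be one way to supply the version information the row reports as missing.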