Online Control with Adversarial Disturbance for Continuous-time Linear Systems

Authors: Jingwei Li, Jing Dong, Can Chang, Baoxiang Wang, Jingzhao Zhang

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we apply our theoretical analysis to the practical training of agents. First we highlight the key difference between our algorithm and traditional online policy optimization. ... We conduct experiments on the hopper, halfcheetah, and walker2d benchmarks using the MuJoCo simulator [40].
Researcher Affiliation | Academia | Jingwei Li (IIIS, Tsinghua University; Shanghai Qizhi Institute; ljw22@mails.tsinghua.edu.cn); Jing Dong (The Chinese University of Hong Kong, Shenzhen; jingdong@link.cuhk.edu.cn); Can Chang (IIIS, Tsinghua University; cc22@mails.tsinghua.edu.cn); Baoxiang Wang (The Chinese University of Hong Kong, Shenzhen; bxiangwang@cuhk.edu.cn); Jingzhao Zhang (IIIS, Tsinghua University; Shanghai Qizhi Institute; jingzhaoz@mail.tsinghua.edu.cn)
Pseudocode | Yes | Algorithm 1 Continuous two-level online control algorithm
Open Source Code | No | Open access to data and code. Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [No] Justification: Our code is very simple, just use the traditional SAC algorithm with one line implement. Our main contribution is the theoretical analysis.
Open Datasets | Yes | We conduct experiments on the hopper, halfcheetah, and walker2d benchmarks using the MuJoCo simulator [40]. ... Table 1: The DR distributions of environment.
Dataset Splits | No | The paper discusses training and testing but does not provide specific details on training/test/validation dataset splits (e.g., percentages, sample counts, or explicit cross-validation setup).
Hardware Specification | Yes | We conducted experiments using NVIDIA A40 graphics card.
Software Dependencies | No | The paper mentions the use of 'MuJoCo simulator [40]' and 'SAC (Soft Actor-Critic) algorithm [31]' but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup | Yes | Therefore, in the following experiments we fix the parameter h = 3, m = 3. We train our algorithm with this parameter and standard SAC on hopper and test the performance on more environments.