Impact of Computation in Integral Reinforcement Learning for Continuous-Time Control

Authors: Wenhan Cao, Wei Pan

ICLR 2024

Reproducibility Variable | Result | LLM Response
--- | --- | ---
Research Type | Experimental | These theoretical findings are finally validated by two canonical control tasks.
Researcher Affiliation | Collaboration | (1) Department of Computer Science, University of Manchester; (2) School of Vehicle and Mobility, Tsinghua University
Pseudocode | No | The paper describes theoretical models and iterative processes, but it does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The code is available at https://github.com/anonymity678/Computation-Impacts-Control.git.
Open Datasets | No | The paper defines and uses a "canonical linear-quadratic regulator problem" and a "canonical nonlinear system" as examples. These are problem definitions for simulation, not references to publicly available datasets with formal citations or links.
Dataset Splits | No | The paper evaluates on specific control tasks but does not describe standard dataset splits such as train/validation/test percentages or sample counts.
Hardware Specification | No | The paper mentions running "simulations" but provides no details about the hardware used (e.g., CPU or GPU models, memory).
Software Dependencies | No | The paper describes mathematical methods (the trapezoidal rule, BQ with a Matérn kernel, the Euler method, the Runge-Kutta family) but does not name any software with version numbers (e.g., Python 3.x, PyTorch 1.x, MATLAB R20xx) used in the experiments.
Experiment Setup | Yes | The initial policy of the PI is chosen as an admissible policy u = K0 x with K0 = [0, 0, 0]. We use the trapezoidal rule and the BQ with a Matérn kernel (smoothness parameter b = 4) for evenly spaced samples with size 5 ≤ N ≤ 15 to compute the PEV step of IntRL.
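For context, below is a minimal Python sketch of the two quadrature rules the setup row names for the PEV (policy evaluation) step: the trapezoidal rule and Bayesian quadrature (BQ) under a Matérn prior. The Matérn-5/2 smoothness, length scale, integration interval, and placeholder reward samples are illustrative assumptions, not the paper's settings (the paper uses smoothness parameter b = 4 and the tasks' actual integral reinforcements).

```python
import numpy as np

def matern52(t1, t2, length_scale=0.05):
    """Matern-5/2 kernel between time points (broadcasts over arrays).

    The 5/2 smoothness and length scale are illustrative choices."""
    d = np.abs(t1 - t2) / length_scale
    return (1.0 + np.sqrt(5.0) * d + 5.0 * d**2 / 3.0) * np.exp(-np.sqrt(5.0) * d)

def trapezoid(y, x):
    """Composite trapezoidal rule along the first axis of y."""
    dx = np.diff(x)
    if y.ndim == 2:
        dx = dx[:, None]
    return 0.5 * np.sum((y[1:] + y[:-1]) * dx, axis=0)

def bq_weights(ts, kernel, n_fine=2001):
    """Bayesian-quadrature weights w = K^{-1} z with z_i = \int k(t, t_i) dt.

    The kernel mean z is approximated on a fine grid for simplicity;
    for Matern kernels it also admits a closed form."""
    K = kernel(ts[:, None], ts[None, :])
    fine = np.linspace(ts[0], ts[-1], n_fine)
    z = trapezoid(kernel(fine[:, None], ts[None, :]), fine)
    # Small jitter keeps the Gram matrix well conditioned.
    return np.linalg.solve(K + 1e-10 * np.eye(len(ts)), z)

# Evenly spaced samples of the integral reinforcement r(x(tau), u(tau));
# the paper sweeps sample sizes 5 <= N <= 15.
N = 10
ts = np.linspace(0.0, 0.1, N)
r = np.cos(20.0 * ts)  # placeholder reward samples, not the paper's data

trap_estimate = trapezoid(r, ts)             # trapezoidal-rule estimate
bq_estimate = bq_weights(ts, matern52) @ r   # BQ posterior-mean estimate
print(trap_estimate, bq_estimate)
```

Note that the BQ weights depend only on the sample locations, not the reward values, so for evenly spaced samples they can be computed once and reused across PEV steps.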