Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

End-to-End Learning Framework for Solving Non-Markovian Optimal Control

Authors: Xiaole Zhang, Peiyu Zhang, Xiongye Xiao, Shixuan Li, Vasileios Tzoumas, Vijay Gupta, Paul Bogdan

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 4. Experiments: In this section, we empirically evaluate the proposed system identification algorithm and FOLOC framework using both synthetic and real-world system dynamics. ... Experimental results indicate that our method accurately approximates fractional-order system behaviors without relying on Gaussian noise assumptions, pointing to promising avenues for advanced optimal control.
Researcher Affiliation | Academia | 1Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA 90089, USA 2Min H. Kao Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, TN 37996, USA 3Department of Aerospace Engineering, University of Michigan, Ann Arbor, MI 48109, USA 4Elmore Family School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907, USA.
Pseudocode | No | The paper describes the methodology using mathematical derivations and prose, but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | A. Code Availability: The source code is available at https://github.com/zpykillcc/Fractional-Order-Learning-for-Optimal-Control-Framework.
Open Datasets | No | E.2. Synthetic Data: The synthetic data is generated as follows: ... E.3. Real-World System Dynamics: In our experiments, we simulate the system dynamics to generate trajectories and optimal control sequences... The paper does not provide concrete access information (specific link, DOI, repository name, formal citation with authors/year) for any publicly available or open dataset.
Dataset Splits | No | We are given N trajectories, each trajectory consisting of T time steps... We show the model performance with different number of training samples N = 1000, 2000, 4000, 6000, 8000. The paper mentions training sizes but does not specify explicit train/test/validation splits (e.g., percentages or counts) for the generated data.
Hardware Specification | Yes | The experiments utilize GPU acceleration on an NVIDIA A100 80GB PCIe GPU and are conducted on a machine running Ubuntu 22.04.5 LTS with an Intel(R) Xeon(R) Platinum 8358 CPU (2.60 GHz), featuring 128 cores and support for 256 concurrent threads.
Software Dependencies | No | The training process is configured to optimize model performance using the Adam optimizer with a learning rate of 10^-3. The model is trained for 300 epochs with a batch size of 128. Learning rate scheduling is managed by a ReduceLROnPlateau scheduler... The paper mentions specific optimizers and schedulers but does not provide version numbers for any programming languages or libraries (e.g., Python, PyTorch, TensorFlow).
Experiment Setup | Yes | F. Training Parameters: The training process is configured to optimize model performance using the Adam optimizer with a learning rate of 10^-3. The model is trained for 300 epochs with a batch size of 128. Learning rate scheduling is managed by a ReduceLROnPlateau scheduler, which monitors validation loss and reduces the learning rate by a factor of 0.1 after 5 epochs without improvement. The scheduler uses a relative threshold of 0.0001 to detect improvement. The minimum learning rate is set to zero, and early learning stabilization is facilitated by an epsilon value of 1e-8. The loss function is based on the Lp-norm.
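The training setup quoted above maps directly onto standard PyTorch APIs. The sketch below wires together the reported hyperparameters (Adam at 10^-3, 300 epochs, batch size 128, ReduceLROnPlateau with factor 0.1, patience 5, relative threshold 1e-4, minimum learning rate 0, epsilon 1e-8, and an Lp-norm loss); the model, the single synthetic batch, and the exponent p=2 are placeholder assumptions, since the paper's excerpt does not specify them:

```python
import torch
import torch.nn as nn

# Placeholder model; the paper's actual architecture is not given in the excerpt.
model = nn.Linear(16, 1)

# Adam optimizer with the reported learning rate of 10^-3.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# ReduceLROnPlateau as described: monitor a loss, shrink the learning rate by
# a factor of 0.1 after 5 epochs without improvement, relative improvement
# threshold 1e-4, minimum learning rate 0, epsilon 1e-8.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.1, patience=5,
    threshold=1e-4, threshold_mode="rel", min_lr=0.0, eps=1e-8,
)

def lp_loss(pred, target, p=2):
    """Lp-norm loss on the residual; the exponent p is an assumption here."""
    return torch.linalg.vector_norm(pred - target, ord=p)

# One dummy batch of the reported batch size 128 (stand-in for the real data).
x = torch.randn(128, 16)
y = torch.randn(128, 1)

# Skeleton of the reported 300-epoch training loop.
for epoch in range(300):
    optimizer.zero_grad()
    loss = lp_loss(model(x), y)
    loss.backward()
    optimizer.step()
    # The paper steps the scheduler on validation loss; the training loss
    # stands in for it in this single-batch sketch.
    scheduler.step(loss.item())
```

Note that the scheduler is stepped with a loss value each epoch, matching ReduceLROnPlateau's metric-driven interface rather than the epoch-count interface of other schedulers.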