Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Non-rectangular Robust MDPs with Normed Uncertainty Sets
Authors: Navdeep Kumar, Adarsh Gupta, Maxence Mohamed ELFATIHI, Giorgia Ramponi, Kfir Y. Levy, Shie Mannor
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | The paper is theoretical in nature; its claims are reflected in the abstract and introduction. The toy experiments are included only for the sake of completeness. |
| Researcher Affiliation | Collaboration | Navdeep Kumar (Technion); Adarsh Gupta (Finsynth.ai); Maxence Mohamed Elfatihi (École Polytechnique); Giorgia Ramponi (University of Zurich); Kfir Levy (Technion); Shie Mannor (Technion) |
| Pseudocode | Yes | Algorithm 1: Binary Search for Robust Policy Evaluation; Algorithm 2: CPI (Algorithm 3.2 of [21]) for Robust Policy Evaluation; Algorithm 3: Robust Policy Gradient Algorithm; Algorithm 4: Second Order Spectral Approximation for max_{‖x‖₂ ≤ 1, x ≥ 0} ‖Ax‖₂; Algorithm 5: Spectral method for computing max_{x ∈ B} ‖Ax‖₂ |
| Open Source Code | Yes | The code, detailed explanations, and additional experiments are available at https://anonymous.4open.science/r/non-rectangular-rmdp-77B8. The code is publicly available. |
| Open Datasets | No | The experiments are performed using a randomly generated nominal kernel P̂, reward function R, and policy π. An uncertainty set U₁ is constructed using the nominal kernel with a fixed uncertainty radius β. |
| Dataset Splits | No | No training or test sets. |
| Hardware Specification | Yes | System details for the experiments are as follows: Operating System: macOS Sequoia (Version 15.4.1), Chip: Apple M2, Cores: 8 (4 performance and 4 efficiency), Memory: 16 GB (LPDDR5). Hardware and Software Specifications: The experiments were conducted on the following hardware and software setup: Model Name: MacBook Pro (2023 model). Model Identifier: Mac14,7. Chip: Apple M2 with 8 cores (4 performance and 4 efficiency cores). Memory: 16 GB Unified Memory. Operating System: macOS Ventura. |
| Software Dependencies | No | Programming Language: Python 3.9. Libraries Used: numpy for numerical computations. scipy for numerical optimization. matplotlib for generating plots. time for recording computational times. |
| Experiment Setup | Yes | The experiments are performed using a randomly generated nominal kernel P̂, reward function R, and policy π. An uncertainty set U₁ is constructed using the nominal kernel with a fixed uncertainty radius β. S = 12, A = 8, γ = 0.9, and a convergence tolerance of 10⁻⁴. The discount factor is γ = 0.9. Algorithms are run until convergence (tolerance of 10⁻⁶) or a maximum iteration limit (100). |
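
The Experiment Setup row can be illustrated with a minimal sketch. It is not the authors' code: the function names `random_mdp` and `evaluate_policy`, the uniform sampling, and the seed are assumptions made for illustration, and only the sizes (S = 12, A = 8), the discount factor (γ = 0.9), the tolerance (10⁻⁶) and the iteration limit (100) come from the table. The sketch evaluates the policy on the nominal kernel only; it does not construct the uncertainty set or perform robust evaluation.

```python
import numpy as np


def random_mdp(n_states=12, n_actions=8, seed=0):
    """Sample a nominal kernel, reward and policy (uniform sampling is an assumption)."""
    rng = np.random.default_rng(seed)
    # Nominal kernel: P[s, a, :] is a probability distribution over next states.
    P = rng.random((n_states, n_actions, n_states))
    P /= P.sum(axis=2, keepdims=True)
    # Reward function R[s, a] in [0, 1).
    R = rng.random((n_states, n_actions))
    # Stochastic policy: pi[s, :] is a probability distribution over actions.
    pi = rng.random((n_states, n_actions))
    pi /= pi.sum(axis=1, keepdims=True)
    return P, R, pi


def evaluate_policy(P, R, pi, gamma=0.9, tol=1e-6, max_iter=100):
    """Iterative (non-robust) policy evaluation on the nominal kernel."""
    # State-to-state kernel and expected reward induced by the policy.
    P_pi = np.einsum("sa,sat->st", pi, P)
    r_pi = (pi * R).sum(axis=1)
    v = np.zeros(P.shape[0])
    for it in range(1, max_iter + 1):
        v_new = r_pi + gamma * P_pi @ v
        # Stop when successive iterates differ by less than the tolerance.
        if np.max(np.abs(v_new - v)) < tol:
            return v_new, it
        v = v_new
    # Iteration limit reached before the tolerance was met.
    return v, max_iter


P, R, pi = random_mdp()
v, iters = evaluate_policy(P, R, pi)
print("iterations used:", iters)
print("nominal value estimate:", np.round(v, 3))
```

With γ = 0.9, successive iterates shrink by roughly a factor of 0.9 per step, so this loop may stop at the iteration limit rather than at the tolerance; the returned value is then an approximation of the nominal value function.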