Sub-optimal Experts mitigate Ambiguity in Inverse Reinforcement Learning
Authors: Riccardo Poiani, Gabriele Curti, Alberto Maria Metelli, Marcello Restelli
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We designed an experiment that aims at visualizing the reduction of the feasible reward set. ... We have then run our algorithm for 20 times using ϵ = 0.1 and δ = 0.1, and computed the (empirical) theoretical upper bound on . |
| Researcher Affiliation | Academia | Riccardo Poiani DEIB, Politecnico di Milano riccardo.poiani@polimi.it Gabriele Curti DEIB, Politecnico di Milano gabriele.curti@mail.polimi.it Alberto Maria Metelli DEIB, Politecnico di Milano albertomaria.metelli@polimi.it Marcello Restelli DEIB, Politecnico di Milano marcello.restelli@polimi.it |
| Pseudocode | Yes | Algorithm 1 US-IRL-SE Algorithm |
| Open Source Code | No | The codebase to reproduce the results will be public. |
| Open Datasets | Yes | We considered as environment the forest management scenarios with 10 states and 2 actions that is available in the 'pymdptoolbox' library. ... Code can be found at https://github.com/sawcordwell/pymdptoolbox. |
| Dataset Splits | No | The paper does not provide explicit training/validation/test dataset splits. It describes experimental settings for the algorithm (e.g., epsilon and delta for PAC framework), but not data partitioning for model validation. |
| Hardware Specification | Yes | The experiments have been run on a laptop with 8 Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz and 8GB of RAM. |
| Software Dependencies | Yes | We considered as environment the forest management scenarios with 10 states and 2 actions that is available in the 'pymdptoolbox' library. ... Master branch at commit 7c96789cc80e280437005c12065cf70266c11636 was used. |
| Experiment Setup | Yes | We considered a discount factor γ = 0.9. ... We have then run our algorithm for 20 times using ϵ = 0.1 and δ = 0.1 |
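
The setup reported in the table (forest management with 10 states and 2 actions, γ = 0.9, ϵ = 0.1, δ = 0.1, 20 runs) can be sketched in plain Python. This is only an illustrative reconstruction, not the authors' code: the function `forest_mdp` is a hypothetical helper that mirrors the transition and reward layout documented for `mdptoolbox.example.forest`, and the values `p = 0.1`, `r1 = 4`, `r2 = 2` are that library's documented defaults, which the paper excerpt does not confirm.

```python
# Settings quoted in the table above.
GAMMA = 0.9      # discount factor
EPSILON = 0.1    # PAC accuracy parameter
DELTA = 0.1      # PAC confidence parameter
N_RUNS = 20      # number of repetitions of the algorithm


def forest_mdp(n_states=10, r1=4.0, r2=2.0, p=0.1):
    """Build a forest-management MDP (hypothetical helper).

    Assumption: p, r1, r2 are the pymdptoolbox defaults; the source
    does not state which values were used.
    Returns P[action][state][next_state] and R[state][action],
    with action 0 = wait and action 1 = cut.
    """
    S = n_states
    # Wait: a fire (prob. p) resets the forest to state 0,
    # otherwise the forest ages by one state (capped at the oldest).
    wait = [[0.0] * S for _ in range(S)]
    for s in range(S):
        wait[s][0] = p
        wait[s][min(s + 1, S - 1)] = 1.0 - p
    # Cut: the forest always returns to state 0.
    cut = [[1.0 if j == 0 else 0.0 for j in range(S)] for _ in range(S)]
    P = [wait, cut]

    # Rewards: waiting pays r1 only in the oldest state; cutting pays
    # 0 in the youngest state, r2 in the oldest state, 1 elsewhere.
    R = [[0.0, 1.0] for _ in range(S)]
    R[0][1] = 0.0
    R[S - 1][0] = r1
    R[S - 1][1] = r2
    return P, R


P, R = forest_mdp(n_states=10)
```

With `pymdptoolbox` installed, the equivalent environment should be obtainable via `mdptoolbox.example.forest(S=10)`; the stand-alone version above is given only so the structure of the 10-state, 2-action problem is visible without the dependency.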