Uncertainty Estimation Using Riemannian Model Dynamics for Offline Reinforcement Learning
Authors: Guy Tennenholtz, Shie Mannor
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We leverage our method for uncertainty estimation in a pessimistic model-based framework, showing a significant improvement upon contemporary model-based offline approaches on continuous control and autonomous driving benchmarks. |
| Researcher Affiliation | Collaboration | Guy Tennenholtz, Technion Institute of Technology; Shie Mannor, Technion Institute of Technology & Nvidia Research |
| Pseudocode | Yes | Algorithm 1 GELATO: Geometrically Enriched LATent model for Offline reinforcement learning |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] (checklist item 1a) |
| Open Datasets | Yes | We used D4RL [Fu et al., 2020] and the autonomous vehicle environments highway-env [Leurent, 2018] as benchmarks for all of our experiments. (A minimal loading sketch for both benchmarks appears after this table.) |
| Dataset Splits | No | The paper uses the D4RL and highway-env datasets and mentions training with 1M or 2M samples, but it does not specify explicit train/validation/test splits or percentages. |
| Hardware Specification | Yes | All agents were trained... using a single GPU (RTX 2080)... |
| Software Dependencies | No | No software dependencies with version numbers are provided. The paper mentions 'FAISS', 'Soft Learning', and 'PPO', but without version details. (A hedged FAISS usage sketch appears after this table.) |
| Experiment Setup | Yes | We set k = 5 neighbors for the penalized reward (Equation (3)). All agents were trained for 1M steps (for continuous control benchmarks) and 350K steps (for the driving benchmarks)... and averaged over 5 seeds. (See the k-NN penalty sketch after this table.) |
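
As a pointer for reproduction, the sketch below shows the standard way to load a D4RL dataset and instantiate a highway-env task. The specific task names (`halfcheetah-medium-v2`, `highway-v0`) are illustrative assumptions; the paper's exact environment IDs are not quoted in this table.

```python
import gym
import d4rl         # registers the D4RL offline-RL environments with gym
import highway_env  # registers the highway-env driving environments with gym

# Illustrative task choices, not necessarily the ones used in the paper.
env = gym.make("halfcheetah-medium-v2")
dataset = d4rl.qlearning_dataset(env)  # dict with observations, actions, rewards, ...
print(dataset["observations"].shape)

driving_env = gym.make("highway-v0")
obs = driving_env.reset()
```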
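
The k = 5 neighbor setting and the FAISS dependency suggest a nearest-neighbor reward penalty of roughly the following shape. This is a minimal sketch under strong assumptions: it uses plain Euclidean distance and a hypothetical penalty weight `beta`, whereas the paper computes distances under a learned Riemannian metric on the latent space (its Equation (3)).

```python
import numpy as np
import faiss

def penalized_reward(rewards, queries, data, k=5, beta=1.0):
    """Sketch of a k-NN uncertainty penalty on rewards.

    Assumptions: Euclidean distance stands in for the paper's learned
    Riemannian metric, and `beta` is a hypothetical penalty weight.
    """
    index = faiss.IndexFlatL2(data.shape[1])  # exact L2 search over dataset points
    index.add(np.ascontiguousarray(data, dtype=np.float32))
    sq_dists, _ = index.search(np.ascontiguousarray(queries, dtype=np.float32), k)
    penalty = np.sqrt(sq_dists).mean(axis=1)  # mean distance to the k nearest neighbors
    return rewards - beta * penalty
```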