Uncertainty Estimation Using Riemannian Model Dynamics for Offline Reinforcement Learning

Authors: Guy Tennenholtz, Shie Mannor

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We leverage our method for uncertainty estimation in a pessimistic model-based framework, showing a significant improvement upon contemporary model-based offline approaches on continuous control and autonomous driving benchmarks."
Researcher Affiliation | Collaboration | Guy Tennenholtz (Technion - Israel Institute of Technology); Shie Mannor (Technion - Israel Institute of Technology & Nvidia Research)
Pseudocode | Yes | "Algorithm 1 GELATO: Geometrically Enriched LATent model for Offline reinforcement learning"
Open Source Code | Yes | "Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes]" (checklist item 1a)
Open Datasets | Yes | "We used D4RL [Fu et al., 2020] and the autonomous vehicle environments highway-env [Leurent, 2018] as benchmarks for all of our experiments." (see the dataset-loading sketch after this table)
Dataset Splits | No | The paper uses the D4RL and highway-env datasets and mentions training with 1M or 2M samples, but it does not specify explicit train/validation/test splits or percentages.
Hardware Specification | Yes | "All agents were trained... using a single GPU (RTX 2080)..."
Software Dependencies | No | No software dependencies with version numbers are provided. The paper mentions 'FAISS', 'Soft Learning', and 'PPO', but without version details.
Experiment Setup | Yes | "We set k = 5 neighbors for the penalized reward (Equation (3)). All agents were trained for 1M steps (for continuous control benchmarks) and 350K steps (for the driving benchmarks)... and averaged over 5 seeds." (see the reward-penalty sketch after this table)
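
For context on the "Open Datasets" row, the following is a minimal sketch of loading one of the D4RL benchmarks the paper evaluates on. The task id "halfcheetah-medium-v2" and the use of d4rl.qlearning_dataset are illustrative assumptions; the authors' own data-loading code is not reproduced here.

```python
# Minimal sketch (not the authors' code): load a D4RL offline dataset.
# The task id "halfcheetah-medium-v2" is an assumed example; the paper
# evaluates on several D4RL continuous-control tasks.
import gym
import d4rl  # importing registers the D4RL environments with gym

env = gym.make("halfcheetah-medium-v2")
dataset = d4rl.qlearning_dataset(env)  # dict of observations, actions, rewards, next_observations, terminals

# The full offline buffer is used for training; no explicit
# train/validation/test split is reported in the paper.
print({k: v.shape for k, v in dataset.items()})
```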
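
The "Experiment Setup" row quotes a k = 5 nearest-neighbor penalized reward (Equation (3)). Below is a hedged sketch of such a penalty using FAISS, one of the libraries the paper mentions. It substitutes plain Euclidean latent distances for the paper's Riemannian metric, and the penalty weight lam and the mean aggregation over neighbors are assumptions, not the authors' exact formulation.

```python
# Hedged sketch (not the authors' implementation) of a k-nearest-neighbor
# reward penalty in the spirit of Equation (3): penalize the model reward by
# the distance from the current latent state to its k = 5 nearest neighbors
# in the offline data. FAISS gives Euclidean distances here, whereas the
# paper uses distances induced by a learned Riemannian metric; `lam` and the
# mean aggregation over neighbors are assumptions.
import numpy as np
import faiss

K = 5  # number of neighbors, as reported in the experiment setup


def build_index(latent_data: np.ndarray) -> faiss.IndexFlatL2:
    """Index the latent encodings of every transition in the offline dataset."""
    index = faiss.IndexFlatL2(latent_data.shape[1])
    index.add(np.ascontiguousarray(latent_data, dtype=np.float32))
    return index


def penalized_reward(index: faiss.IndexFlatL2, z: np.ndarray, reward: float,
                     lam: float = 1.0) -> float:
    """Return the model reward minus a distance-based uncertainty penalty."""
    query = np.ascontiguousarray(z, dtype=np.float32).reshape(1, -1)
    sq_dists, _ = index.search(query, K)           # squared L2 distances to the K neighbors
    uncertainty = float(np.sqrt(sq_dists).mean())  # mean distance serves as the uncertainty estimate
    return reward - lam * uncertainty
```

In a pessimistic model-based loop, a penalty of this form would replace the raw model reward on imagined rollouts; the benchmark results quoted above were produced with the paper's metric-based penalty, not this Euclidean stand-in.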