ReDS: Offline RL With Heteroskedastic Datasets via Support Constraints

Authors: Anikait Singh, Aviral Kumar, Quan Vuong, Yevgen Chebotar, Sergey Levine

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We conduct experiments on the D4RL benchmark suite (Fu et al., 2020) and demonstrate that ReDS significantly outperforms existing offline RL baselines on several benchmarks across different domains and difficulties."
Researcher Affiliation | Industry & Academia | The authors listed above are affiliated with UC Berkeley (Anikait Singh, Aviral Kumar, Sergey Levine) and Google DeepMind (Quan Vuong, Yevgen Chebotar).
Pseudocode | Yes | Algorithm 1 presents the ReDS procedure as pseudocode.
Open Source Code | Yes | Code is available at github.com/google-research/reds.
Open Datasets | Yes | "We conduct experiments on the D4RL benchmark suite (Fu et al., 2020), which consists of a set of locomotion and AntMaze tasks from continuous control, as well as Adroit and FrankaKitchen tasks with challenging robot manipulation datasets."
Dataset Splits | Yes | "We use the standard D4RL splits for training, validation, and evaluation."
Hardware Specification | No | The paper does not describe the hardware used for its experiments, such as specific GPU/CPU models or cloud instance types.
Software Dependencies | No | "We implement ReDS using the JAX and Flax libraries." The paper does not pin version numbers for these dependencies.
Experiment Setup | Yes | "We train all models with a batch size of 256, a discount factor of 0.99, and learning rates of 10^-4 for the policy and Q-functions, and 10^-5 for the support function and β."
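For context on the Pseudocode row: ReDS instantiates its support constraint on top of a CQL-style conservative critic. The sketch below shows only the generic conservative penalty that such critics use; the ReDS-specific reweighting of the push-down distribution is omitted, and all names here are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def conservative_penalty(q_data, q_sampled):
    """CQL-style conservative penalty: push Q-values down on sampled
    (potentially out-of-support) actions and up on dataset actions.
    ReDS modifies *which* action distribution the push-down expectation
    uses; that reweighting is not shown in this sketch.

    q_data:    array of shape (batch,), Q(s, a) on dataset actions
    q_sampled: array of shape (batch, n_actions), Q(s, a') on sampled actions
    """
    # log-mean-exp over sampled actions is a soft approximation of max_a Q(s, a)
    push_down = np.log(np.mean(np.exp(q_sampled), axis=-1))
    # minimizing (push_down - q_data) lowers OOD Q-values relative to in-data ones
    return np.mean(push_down - q_data)
```

Minimizing this penalty alongside the usual Bellman error keeps the critic pessimistic outside the dataset's support.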
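For context on the Open Datasets row: D4RL results are conventionally reported as normalized scores, where 0 corresponds to a random policy and 100 to an expert policy. A minimal sketch of that normalization follows; the halfcheetah reference returns are assumed from D4RL's published reference values and should be double-checked against `d4rl.infos`.

```python
def d4rl_normalized_score(episode_return, random_return, expert_return):
    """Map a raw episode return onto D4RL's 0-100 normalized scale."""
    return 100.0 * (episode_return - random_return) / (expert_return - random_return)

# Assumed reference returns for the halfcheetah environment (from d4rl.infos):
HALFCHEETAH_RANDOM = -280.18
HALFCHEETAH_EXPERT = 12135.0
```

A return equal to the expert reference maps to 100.0, and the random reference maps to 0.0, which is how the benchmark tables in the paper are scaled.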
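The hyperparameters in the Experiment Setup row can be collected into a single configuration for reimplementation. The values below are exactly those reported in the row; the key names and dictionary structure are illustrative assumptions.

```python
# Hyperparameters as reported for ReDS; key names are assumptions.
REDS_CONFIG = {
    "batch_size": 256,
    "discount": 0.99,
    "policy_lr": 1e-4,    # learning rate for the policy
    "q_lr": 1e-4,         # learning rate for the Q-functions
    "support_lr": 1e-5,   # learning rate for the support function
    "beta_lr": 1e-5,      # learning rate for the coefficient beta
}
```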