ReDS: Offline RL With Heteroskedastic Datasets via Support Constraints
Authors: Anikait Singh, Aviral Kumar, Quan Vuong, Yevgen Chebotar, Sergey Levine
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on the D4RL benchmark suite (Fu et al., 2020) and demonstrate that REDS significantly outperforms existing offline RL baselines on several benchmarks across different domains and difficulties. |
| Researcher Affiliation | Industry | Yuanfu Liao Google Research yuanfuliao@google.com George Tucker Google Research tucker@google.com Ofir Nachum Google Research ofirnachum@google.com |
| Pseudocode | Yes | Algorithm 1 Robust Exploration with Dataset Heteroskedasticity via Support Constraints (REDS) |
| Open Source Code | Yes | Code is available at github.com/google-research/reds |
| Open Datasets | Yes | We conduct experiments on the D4RL benchmark suite (Fu et al., 2020), which consists of a set of locomotion and AntMaze tasks from continuous control, as well as Adroit and FrankaKitchen tasks with challenging robot manipulation datasets. |
| Dataset Splits | Yes | We use the standard D4RL splits for training, validation, and evaluation. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for running its experiments, such as specific GPU/CPU models or cloud instance types. |
| Software Dependencies | No | We implement REDS using the Jax and Flax libraries. The paper does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | We train all models with a batch size of 256, a discount factor of 0.99, and learning rates of 10^-4 for the policy and Q-functions, and 10^-5 for the support function and β. |
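
The Open Datasets and Dataset Splits rows above point to the D4RL benchmark suite. As a minimal sketch of how these datasets are typically loaded, assuming the public `d4rl` and `gym` packages (the `hopper-medium-v2` task name is an illustrative choice, not one taken from the table):

```python
# Minimal sketch of loading a D4RL dataset; the task name is illustrative.
import gym
import d4rl  # importing d4rl registers the D4RL environments with gym

env = gym.make("hopper-medium-v2")     # example locomotion task
dataset = d4rl.qlearning_dataset(env)  # dict of transition arrays

# Expected keys: observations, actions, rewards, next_observations, terminals
print({key: value.shape for key, value in dataset.items()})

# D4RL environments also expose normalized scores for evaluation,
# mapping a raw return onto the standard 0-100 scale.
print(env.get_normalized_score(3000.0) * 100.0)
```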
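
The Experiment Setup row above reports a batch size of 256, a discount factor of 0.99, and learning rates of 10^-4 (policy and Q-functions) and 10^-5 (support function and β). A hedged sketch of wiring these values into `optax` optimizers follows; the choice of Adam and all variable names are assumptions for illustration, not details confirmed by the quote.

```python
# Sketch of the reported hyperparameters wired into optax optimizers.
# Adam is an assumption; the table row only specifies batch size,
# discount factor, and learning rates.
import optax

BATCH_SIZE = 256    # reported batch size
DISCOUNT = 0.99     # reported discount factor
POLICY_LR = 1e-4    # policy learning rate (10^-4)
Q_LR = 1e-4         # Q-function learning rate (10^-4)
SUPPORT_LR = 1e-5   # support-function and beta learning rate (10^-5)

policy_opt = optax.adam(POLICY_LR)
q_opt = optax.adam(Q_LR)
support_opt = optax.adam(SUPPORT_LR)
beta_opt = optax.adam(SUPPORT_LR)

# In a Flax training loop, each optimizer state would be created from the
# corresponding parameter pytree, e.g. policy_opt.init(policy_params).
```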