Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

ReDS: Offline RL With Heteroskedastic Datasets via Support Constraints

Authors: Anikait Singh, Aviral Kumar, Quan Vuong, Yevgen Chebotar, Sergey Levine

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments on the D4RL benchmark suite (Fu et al., 2020) and demonstrate that REDS significantly outperforms existing offline RL baselines on several benchmarks across different domains and difficulties.
Researcher Affiliation | Industry | Yuanfu Liao Google Research EMAIL George Tucker Google Research EMAIL Ofir Nachum Google Research EMAIL
Pseudocode | Yes | Algorithm 1 Robust Exploration with Dataset Heteroskedasticity via Support Constraints (REDS)
Open Source Code | Yes | Code is available at github.com/google-research/reds
Open Datasets | Yes | We conduct experiments on the D4RL benchmark suite (Fu et al., 2020), which consists of a set of locomotion and AntMaze tasks from continuous control, as well as Adroit and FrankaKitchen tasks with challenging robot manipulation datasets.
Dataset Splits | Yes | We use the standard D4RL splits for training, validation, and evaluation.
Hardware Specification | No | The paper does not explicitly describe the hardware used for running its experiments, such as specific GPU/CPU models or cloud instance types.
Software Dependencies | No | We implement REDS using the Jax and Flax libraries. The paper does not provide specific version numbers for these software dependencies.
Experiment Setup | Yes | We train all models with a batch size of 256, a discount factor of 0.99, and learning rates of 10^-4 for the policy and Q-functions, and 10^-5 for the support function and β.
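The hyperparameters quoted in the Experiment Setup row can be collected into a single training config. The sketch below is illustrative only: the key names are assumptions for readability, not identifiers from the ReDS codebase.

```python
# Hypothetical config mirroring the hyperparameters quoted above
# (batch size, discount, and the four learning rates). Key names
# are illustrative, not taken from the authors' implementation.
reds_config = {
    "batch_size": 256,     # transitions per gradient step
    "discount": 0.99,      # RL discount factor (gamma)
    "policy_lr": 1e-4,     # policy learning rate (10^-4)
    "q_lr": 1e-4,          # Q-function learning rate (10^-4)
    "support_lr": 1e-5,    # support-function learning rate (10^-5)
    "beta_lr": 1e-5,       # learning rate for the beta coefficient (10^-5)
}
```

Grouping the rates this way makes the asymmetry explicit: the policy and Q-networks train an order of magnitude faster than the support function and β.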