Data-Driven Offline Decision-Making via Invariant Representation Learning

Authors: Han Qi, Yi Su, Aviral Kumar, Sergey Levine

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, we evaluate IOM on several tasks from Design-Bench [43] and find that it outperforms the best prior methods, and additionally, admits appealing offline tuning strategies unlike the prior methods. We evaluate on four tasks with continuous-valued input space from the Design-Bench [43] benchmark for offline model-based optimization (MBO). In Figure 2 (left), we show the median and interquartile mean (IQM) [1] for the aggregated scores across all tasks. (A short sketch of the median/IQM computation appears after this table.)
Researcher Affiliation | Collaboration | Han Qi, Yi Su, Aviral Kumar, Sergey Levine. Department of Electrical Engineering and Computer Sciences, UC Berkeley. {han2019, aviralk}@berkeley.edu, yisumtv@google.com
Pseudocode | Yes | Pseudocode for IOM is shown in Algorithm 1. (See Algorithm 1 on page 4.)
Open Source Code | No | The paper thanks the authors of Design-Bench and COMs for help in setting up their codebases, but does not state that code for the method proposed in this paper (IOM) is open source or provide a link to it.
Open Datasets | Yes | We evaluate on four tasks with continuous-valued input space from the Design-Bench [43] benchmark for offline model-based optimization (MBO). [43] Brandon Trabucco, Xinyang Geng, Aviral Kumar, and Sergey Levine. Design-Bench: Benchmarks for data-driven offline model-based optimization, 2021. URL https://github.com/brandontrabucco/design-bench.
Dataset Splits | No | Then for each run, we record the validation in-distribution error and the value of the invariance regularizer on a validation set. Second, we now pick models that attain good performance within the training distribution by selecting λ values that attain the smallest validation prediction error: (f(φ(x)) − y)², in addition to picking the early stopping point based on the smallest validation error. While a validation set is mentioned and used for tuning, the paper does not specify the explicit split percentages or sample counts used to create the validation set from the full dataset. (A sketch of this λ-selection procedure appears after this table.)
Hardware Specification | No | The paper mentions "compute resources from Google cloud" but does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments.
Software Dependencies | Yes | All experiments were run with PyTorch 1.8.1 and Python 3.8.5.
Experiment Setup | Yes | We model the representation φ(x) and the learned function f(·) each as two-hidden-layer ReLU networks with sizes 2048 and 1024, respectively. Input: training data D, number of gradient steps T = 50 to optimize µ_OPT starting from the training distribution µ, training iteration K, batch size n. λ denotes a weighting hyperparameter. (The PyTorch sketch after this table illustrates these settings.)
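
The Research Type row quotes the evaluation protocol: scores aggregated across tasks are summarized by the median and the interquartile mean (IQM) [1]. Below is a minimal NumPy sketch of that computation; the score values are placeholders, not results from the paper.

```python
import numpy as np

def interquartile_mean(scores):
    """Mean of the middle 50% of the scores (drop the bottom and top quartiles)."""
    s = np.sort(np.asarray(scores).ravel())
    lo, hi = int(np.floor(0.25 * s.size)), int(np.ceil(0.75 * s.size))
    return s[lo:hi].mean()

# Placeholder normalized scores aggregated across tasks and seeds (not values from the paper).
scores = [0.62, 0.71, 0.68, 0.90, 0.55, 0.74, 0.81, 0.66]
print("median:", np.median(scores))
print("IQM:", interquartile_mean(scores))
```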
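
The Pseudocode and Experiment Setup rows quote the network sizes (two-hidden-layer ReLU networks of width 2048 for φ and 1024 for f) and the hyperparameters D, T = 50, K, n, and λ. The PyTorch sketch below assembles these pieces into a simplified training loop under several assumptions: the input and representation dimensions, the learning rate, the design step size, and the `sample_batch` loader are hypothetical, and the mean-matching penalty is only a stand-in for the paper's invariance regularizer, whose exact form is not quoted here. It is an illustration of the overall structure, not the authors' implementation.

```python
import torch
import torch.nn as nn

def mlp(d_in, width, d_out):
    """Two-hidden-layer ReLU network, matching the widths quoted above."""
    return nn.Sequential(
        nn.Linear(d_in, width), nn.ReLU(),
        nn.Linear(width, width), nn.ReLU(),
        nn.Linear(width, d_out),
    )

d_x, d_z = 60, 64                  # hypothetical input / representation dims (task-dependent)
phi = mlp(d_x, 2048, d_z)          # representation phi(x), width 2048
f = mlp(d_z, 1024, 1)              # learned function f(.), width 1024
opt = torch.optim.Adam(list(phi.parameters()) + list(f.parameters()), lr=3e-4)

lam, T, K, n = 0.1, 50, 1000, 128  # lambda, design steps, training iterations, batch size

def optimize_designs(x_init, steps=T, step_size=0.01):
    """T gradient-ascent steps on f(phi(x)): samples from the optimized distribution mu_OPT."""
    x = x_init.clone().detach().requires_grad_(True)
    for _ in range(steps):
        (grad,) = torch.autograd.grad(f(phi(x)).sum(), x)
        x = (x + step_size * grad).detach().requires_grad_(True)
    return x.detach()

for k in range(K):
    x, y = sample_batch(n)                        # hypothetical loader over the training data D
    x_opt = optimize_designs(x)                   # designs drawn from mu_OPT
    pred_loss = ((f(phi(x)).squeeze(-1) - y) ** 2).mean()
    # Stand-in for the invariance regularizer: match the mean representation
    # under the data distribution and under the optimized design distribution.
    inv_reg = (phi(x).mean(0) - phi(x_opt).mean(0)).pow(2).sum()
    loss = pred_loss + lam * inv_reg
    opt.zero_grad()
    loss.backward()
    opt.step()
```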
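
The Dataset Splits row describes the paper's offline tuning recipe: for each λ, pick the early-stopping checkpoint with the smallest validation prediction error, then select the λ whose best checkpoint attains the smallest such error. A minimal sketch of that selection logic, with made-up error values, is given below.

```python
import numpy as np

# Hypothetical validation prediction errors, one list per lambda value,
# recorded at successive evaluation checkpoints during training.
val_error = {
    0.01: [0.42, 0.35, 0.33, 0.36],
    0.1:  [0.40, 0.31, 0.29, 0.30],
    1.0:  [0.45, 0.38, 0.37, 0.39],
}

# Early stopping: keep the checkpoint with the smallest validation error for each lambda.
best = {lam: (int(np.argmin(errs)), float(np.min(errs))) for lam, errs in val_error.items()}
# Offline tuning: choose the lambda whose best checkpoint attains the smallest validation error.
chosen = min(best, key=lambda lam: best[lam][1])
print("selected lambda:", chosen, "| checkpoint:", best[chosen][0], "| val error:", best[chosen][1])
```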