Conservative Objective Models for Effective Offline Model-Based Optimization

Authors: Brandon Trabucco, Aviral Kumar, Xinyang Geng, Sergey Levine

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In practice, COMs outperform a number of existing methods on a wide range of MBO problems, including optimizing controller parameters, robot morphologies, and superconducting materials. (from Abstract) To evaluate the efficacy of COMs for offline model-based optimization, we first perform a comparative evaluation of COMs on four continuous offline MBO tasks based on problems in physical sciences, neural network design, and robotics, proposed in the design-bench benchmark (Trabucco et al., 2021). In addition, we perform an empirical analysis on COMs that aims to answer the following questions: (1) Is conservative training essential for improved performance and stability of COMs? How do COMs compare to a naïve objective model in terms of stability? (2) How does the trust-region optimizer improve the stability of optimizing COMs? (3) Are COMs robust to hyperparameter choices and consistent across evaluation conditions? (from Section 6, Experimental Evaluation)
Researcher Affiliation | Academia | Brandon Trabucco, Aviral Kumar, Xinyang Geng, Sergey Levine; Department of Electrical Engineering and Computer Sciences, University of California, Berkeley. Correspondence to: Brandon Trabucco <btrabucco@berkeley.edu>, Aviral Kumar <aviralk@berkeley.edu>.
Pseudocode | Yes | Algorithm 1 (COM: Training Conservative Models) and Algorithm 2 (COM: Finding x*), both on page 4; a minimal sketch of both procedures appears below the table.
Open Source Code | Yes | Code for reproducing our experimental results is available at https://github.com/brandontrabucco/design-baselines. (from Section 6)
Open Datasets | Yes | To evaluate the efficacy of COMs for offline model-based optimization, we first perform a comparative evaluation of COMs on four continuous offline MBO tasks based on problems in physical sciences, neural network design, and robotics, proposed in the design-bench benchmark (Trabucco et al., 2021). (from Section 6)
Dataset Splits | No | The paper describes using static datasets from the design-bench benchmark but does not provide explicit details on training, validation, or test set splits, such as percentages or sample counts, within the paper itself.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running the experiments.
Software Dependencies | No | The paper mentions using the 'Adam optimizer (Kingma & Ba, 2015)' but does not specify software versions for libraries such as PyTorch, TensorFlow, or Python itself.
Experiment Setup | Yes | Briefly, for all of our experiments, the conservative objective model f̂_θ is modeled as a neural network with two hidden layers of size 2048 each and leaky ReLU activations... In order to train this conservative objective model, we use the Adam optimizer (Kingma & Ba, 2015) with a learning rate of 10^-3... During optimization, we utilized the trust-region gradient-ascent optimizer with β = 0.9... Finally, in order to choose the time step T in Equation 4 that is supposed to provide us with the final solution x* = x_T, we pick a large and universal time step of T = 450. (A hedged code sketch of this setup follows the table.)
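To make the Pseudocode row concrete, the following is a minimal PyTorch sketch of Algorithm 1 (conservative training): it mines designs on which the learned model f̂_θ overestimates by running gradient ascent on the model's own prediction, then penalizes their predicted scores relative to dataset points while fitting the usual regression loss. The function names, the fixed penalty weight alpha, and the inner ascent hyperparameters (steps, lr) are illustrative assumptions, not the paper's values; the paper also tunes α adaptively via a Lagrangian, which is omitted here.

```python
import torch

def adversarial_samples(model, x, steps=50, lr=0.05):
    # Inner loop of Algorithm 1: gradient ascent on the model's own
    # prediction to find designs mu(x) where it overestimates.
    # steps and lr are illustrative, not the paper's values.
    x_adv = x.clone().requires_grad_(True)
    for _ in range(steps):
        (grad,) = torch.autograd.grad(model(x_adv).sum(), x_adv)
        x_adv = (x_adv + lr * grad).detach().requires_grad_(True)
    return x_adv.detach()

def conservative_training_step(model, optimizer, x, y, alpha=1.0):
    # One step of Algorithm 1: standard regression loss plus a penalty
    # that pushes predictions down on adversarial designs and up on
    # dataset designs. alpha is held fixed here for simplicity.
    x_adv = adversarial_samples(model, x)
    regression = 0.5 * (model(x).squeeze(-1) - y).pow(2).mean()
    penalty = model(x_adv).mean() - model(x).mean()
    loss = regression + alpha * penalty
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```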
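Likewise, a sketch of the Experiment Setup row under stated assumptions: the two-hidden-layer, 2048-unit leaky-ReLU architecture, the Adam learning rate of 10^-3, and the T = 450 ascent steps are quoted from the paper, while plain momentum with coefficient β = 0.9 stands in for the paper's trust-region gradient-ascent optimizer, whose exact update rule is not reproduced in this table. The ascent learning rate and input dimension are illustrative.

```python
import torch
import torch.nn as nn

def build_objective_model(input_dim):
    # Two hidden layers of 2048 units each with leaky ReLU activations,
    # as quoted in the Experiment Setup row.
    return nn.Sequential(
        nn.Linear(input_dim, 2048), nn.LeakyReLU(),
        nn.Linear(2048, 2048), nn.LeakyReLU(),
        nn.Linear(2048, 1),
    )

model = build_objective_model(input_dim=60)  # input_dim is task-dependent (assumed)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # quoted learning rate

def find_solution(model, x_init, T=450, lr=0.05, beta=0.9):
    # Algorithm 2 under the quoted settings: T = 450 ascent steps on the
    # design. Plain momentum with coefficient beta is a stand-in for the
    # paper's trust-region optimizer; lr is an illustrative assumption.
    x = x_init.clone().requires_grad_(True)
    velocity = torch.zeros_like(x)
    for _ in range(T):
        (grad,) = torch.autograd.grad(model(x).sum(), x)
        velocity = beta * velocity + grad
        x = (x + lr * velocity).detach().requires_grad_(True)
    return x.detach()
```

In the paper, optimization is initialized from the dataset, so find_solution would typically be called with a high-scoring dataset design as x_init and its output taken as the final solution x* = x_T.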