Conservative Objective Models for Effective Offline Model-Based Optimization
Authors: Brandon Trabucco, Aviral Kumar, Xinyang Geng, Sergey Levine
ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In practice, COMs outperform a number of existing methods on a wide range of MBO problems, including optimizing controller parameters, robot morphologies, and superconducting materials. (from Abstract) and 6. Experimental Evaluation: To evaluate the efficacy of COMs for offline model-based optimization, we first perform a comparative evaluation of COMs on four continuous offline MBO tasks based on problems in physical sciences, neural network design, and robotics, proposed in the design-bench benchmark (Trabucco et al., 2021). In addition, we perform an empirical analysis on COMs that aims to answer the following questions: (1) Is conservative training essential for improved performance and stability of COMs? How do COMs compare to a naïve objective model in terms of stability? (2) How does the trust-region optimizer improve the stability of optimizing COMs? (3) Are COMs robust to hyperparameter choices and consistent across evaluation conditions? |
| Researcher Affiliation | Academia | Brandon Trabucco, Aviral Kumar, Xinyang Geng, Sergey Levine; Department of Electrical Engineering and Computer Sciences, University of California, Berkeley. Correspondence to: Brandon Trabucco <btrabucco@berkeley.edu>, Aviral Kumar <aviralk@berkeley.edu>. |
| Pseudocode | Yes | Algorithm 1 (COM: Training Conservative Models) and Algorithm 2 (COM: Finding x), both on page 4; a hedged sketch of the conservative training step appears below the table. |
| Open Source Code | Yes | Code for reproducing our experimental results is available at https://github.com/brandontrabucco/design-baselines. (from Section 6). |
| Open Datasets | Yes | To evaluate the efficacy of COMs for offline model-based optimization, we first perform a comparative evaluation of COMs on four continuous offline MBO tasks based on problems in physical sciences, neural network design, and robotics, proposed in the design-bench benchmark (Trabucco et al., 2021). (from Section 6). |
| Dataset Splits | No | The paper describes using static datasets from the `design-bench` benchmark but does not provide explicit details on training, validation, or test set splits, such as percentages or sample counts, within the paper itself. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'Adam optimizer (Kingma & Ba, 2015)' but does not specify any software versions for libraries like PyTorch, TensorFlow, or Python itself. |
| Experiment Setup | Yes | Briefly, for all of our experiments, the conservative objective model f̂θ is modeled as a neural network with two hidden layers of size 2048 each and leaky ReLU activations... In order to train this conservative objective model, we use the Adam optimizer (Kingma & Ba, 2015) with a learning rate of 10⁻³... During optimization, we utilized the trust-region gradient-ascent optimizer with β = 0.9... Finally, in order to choose the time step T in Equation 4 that is supposed to provide us with the final solution x⋆ = x_T, we pick a large and universal time step of T = 450. A sketch of this setup appears below the table. |
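
For readers reconstructing Algorithm 1 without consulting the released design-baselines code, here is a minimal PyTorch sketch of one conservative training step, assuming the loss form described in the paper: a standard regression term plus a penalty, weighted by a coefficient α, on the gap between the model's predictions at gradient-ascent ("adversarial") samples and at dataset points. The architecture follows the reported two hidden layers of 2048 units with leaky ReLU. The function and parameter names (`conservative_training_step`, `n_adv_steps`, `ascent_lr`, `alpha`) are illustrative choices, not identifiers from the authors' repository.

```python
import torch
import torch.nn as nn

def make_objective_model(input_dim):
    # Two hidden layers of 2048 units with leaky-ReLU activations, as reported.
    return nn.Sequential(
        nn.Linear(input_dim, 2048), nn.LeakyReLU(),
        nn.Linear(2048, 2048), nn.LeakyReLU(),
        nn.Linear(2048, 1),
    )

def conservative_training_step(model, optimizer, x_data, y_data,
                               alpha=1.0, n_adv_steps=50, ascent_lr=0.05):
    """One conservative update: regression on the offline data plus a penalty
    that lowers the model's value at points found by gradient ascent on the
    model itself, relative to its value at dataset points."""
    # Generate "adversarial" designs by ascending the current model from the data.
    x_adv = x_data.clone().detach().requires_grad_(True)
    for _ in range(n_adv_steps):
        grad = torch.autograd.grad(model(x_adv).sum(), x_adv)[0]
        x_adv = (x_adv + ascent_lr * grad).detach().requires_grad_(True)
    x_adv = x_adv.detach()

    # Supervised regression term on the offline dataset.
    mse = 0.5 * ((model(x_data).squeeze(-1) - y_data) ** 2).mean()
    # Conservatism term: penalize high predictions away from the data.
    gap = model(x_adv).mean() - model(x_data).mean()

    loss = mse + alpha * gap
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)
```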
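
Continuing the sketch above, the quoted experiment setup can be read as the following configuration: Adam with learning rate 10⁻³ for training the model, and T = 450 ascent steps to produce the final design x_T. The paper's trust-region ascent rule is not fully specified in the quoted text, so the β = 0.9 coefficient is used here as a plain momentum term, a stand-in rather than the authors' exact update; `input_dim` and `step_size` are illustrative values.

```python
import torch

# Reported setup: Adam at 1e-3 for model training, T = 450 ascent steps for x_T.
input_dim = 60                                    # illustrative; task-dependent
model = make_objective_model(input_dim)           # reuses the sketch above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def optimize_design(model, x_init, T=450, step_size=0.05, beta=0.9):
    """Ascend the trained conservative model for exactly T steps and return
    x_T. beta acts here as a momentum coefficient; the paper's actual
    trust-region rule may differ."""
    x = x_init.clone().detach()
    velocity = torch.zeros_like(x)
    for _ in range(T):
        x.requires_grad_(True)
        grad = torch.autograd.grad(model(x).sum(), x)[0]
        velocity = beta * velocity + (1.0 - beta) * grad
        x = (x + step_size * velocity).detach()
    return x
```

In practice, `x_init` would plausibly be drawn from high-scoring designs in the offline dataset, though the initialization is not stated in the text quoted above.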