RoMA: Robust Model Adaptation for Offline Model-based Optimization

Authors: Sihyun Yu, Sungsoo Ahn, Le Song, Jinwoo Shin

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on various tasks show the effectiveness of RoMA compared with previous methods, obtaining state-of-the-art results; e.g., RoMA outperforms all baselines on 4 out of 6 tasks and achieves runner-up results on the remaining tasks.
Researcher Affiliation | Collaboration | Sihyun Yu (1), Sungsoo Ahn (2), Le Song (2,3), Jinwoo Shin (1); (1) Korea Advanced Institute of Science and Technology (KAIST), (2) Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), (3) BioMap
Pseudocode | Yes | Algorithm 1: Robust model adaptation (RoMA). A hedged sketch of the solution-update loop appears below the table.
Open Source Code | No | The paper references 'Design-Bench [53]' with a GitHub link (https://github.com/brandontrabucco/design-bench) for the benchmark itself, but it provides no access link or explicit statement that the authors' own RoMA implementation is open-sourced.
Open Datasets | Yes | We verify the effectiveness of our framework on Design-Bench [53], an offline model-based optimization (MBO) benchmark consisting of 6 tasks from various domains. The objective of the green fluorescent protein (GFP) task, proposed by Sarkisyan et al. [41], is to find a protein with high fluorescence; the dataset consists of a total of 5,000 proteins with fluorescence values... In the Molecule task, one finds a substructure of a molecule... The dataset involves 4,216 data points... The Superconductor (Supercond.) task aims to find a superconducting material... the dataset is provided by Hamidieh [17] and consists of 21,263 data points in total. A loading sketch appears below the table.
Dataset Splits | No | The paper states 'we follow the same setup for all experiments in prior works for the evaluation [11, 52, 53]', which implies that predefined benchmark setups are used, but it does not explicitly provide percentages, sample counts, or direct citations for train/validation/test splits within the paper itself.
Hardware Specification | Yes | All the experiments are processed with 4 GPUs (NVIDIA RTX 2080 Ti) and 24 virtual CPU instances (Intel Xeon Silver 4214 CPU @ 2.20GHz), and it takes at most 4 hours to run each task over 16 runs.
Software Dependencies | No | The paper mentions the 'Adam optimizer [23]' and a 'multi-layer perceptron (MLP)' but does not specify version numbers for these or any other software dependencies, such as Python or PyTorch.
Experiment Setup | Yes | We use a 3-layer multi-layer perceptron (MLP) in all experiments, with width 64 and the softplus activation function. The Adam optimizer [23] with a learning rate of 0.001 is used to pre-train the proxy model on each task's dataset. Gradients are clipped to a norm of 1.0, and the mini-batch size is 128. For the main experiments, we set the number of solution updates large enough, i.e., T = 300. For the regularization coefficient α, we set α = 1 across all tasks. We choose the largest maximum weight-perturbation magnitude ε for which pre-training of the proxy model remains possible: ε = 0.0005 for the GFP, Molecule, and Superconductor tasks and ε = 0.005 for the other 3 tasks. See the configuration sketch below the table.
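
The pseudocode of Algorithm 1 is not reproduced on this page, so the following is only a minimal sketch of the loop structure it describes: a pre-trained proxy, a per-step model adaptation around the current solution, and gradient ascent on the input. The adaptation loss here is a simplified stand-in for the paper's robust-adaptation objective, and all names (`optimize_solution`, `proxy`, `step_size`) are illustrative, not the authors' code.

```python
# Minimal sketch of a RoMA-style optimization loop (cf. Algorithm 1) in
# PyTorch. The adaptation loss is a simplified stand-in for the paper's
# robust-adaptation objective; function and variable names are ours.
import copy
import torch

def optimize_solution(proxy, x_init, T=300, step_size=0.05, alpha=1.0):
    """Gradient-ascent updates of a solution against a per-step adapted proxy."""
    x = x_init.clone().requires_grad_(True)
    for _ in range(T):
        # Adapt a copy of the proxy around the current solution while staying
        # close to the pre-trained weights (weight-space proximity term,
        # scaled by the regularization coefficient alpha).
        adapted = copy.deepcopy(proxy)
        inner_opt = torch.optim.Adam(adapted.parameters(), lr=1e-3)
        proximity = sum((p - q.detach()).pow(2).sum()
                        for p, q in zip(adapted.parameters(), proxy.parameters()))
        loss = -adapted(x).sum() + alpha * proximity
        inner_opt.zero_grad()
        loss.backward()
        inner_opt.step()

        # Ascend the adapted proxy's prediction with respect to the input.
        grad = torch.autograd.grad(adapted(x).sum(), x)[0]
        with torch.no_grad():
            x += step_size * grad
    return x.detach()
```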
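
For reference, loading one of the cited datasets through the Design-Bench package looks roughly like the following. The `'Superconductor-v0'` task identifier is an assumption based on the design-bench repo's naming scheme and may differ across package versions.

```python
# Hedged example of pulling one Design-Bench task (the Superconductor dataset
# credited to Hamidieh [17]); the task name string is assumed, not verified.
import design_bench  # pip install design-bench

task = design_bench.make('Superconductor-v0')
print(task.x.shape, task.y.shape)  # 21,263 designs and their scores, per the paper
```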
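
The reported hyperparameters translate into roughly the PyTorch setup below. Reading "3-layer MLP" as three Linear layers and the per-task `input_dim` are assumptions; the paper does not spell out either.

```python
# Sketch of the reported proxy setup: 3-layer MLP of width 64 with softplus,
# Adam at lr 1e-3, gradient-norm clipping at 1.0, mini-batches of 128.
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_proxy(input_dim: int) -> nn.Module:
    # "3-layer" interpreted as three Linear layers (an assumption).
    return nn.Sequential(
        nn.Linear(input_dim, 64), nn.Softplus(),
        nn.Linear(64, 64), nn.Softplus(),
        nn.Linear(64, 1),
    )

proxy = make_proxy(input_dim=81)  # hypothetical feature count; task-dependent
optimizer = torch.optim.Adam(proxy.parameters(), lr=1e-3)

def pretrain_step(x_batch: torch.Tensor, y_batch: torch.Tensor) -> float:
    """One pre-training step on a mini-batch of 128 (design, score) pairs."""
    optimizer.zero_grad()
    loss = F.mse_loss(proxy(x_batch), y_batch)
    loss.backward()
    nn.utils.clip_grad_norm_(proxy.parameters(), max_norm=1.0)  # clip to norm 1.0
    optimizer.step()
    return loss.item()
```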