RoMA: Robust Model Adaptation for Offline Model-based Optimization
Authors: Sihyun Yu, Sungsoo Ahn, Le Song, Jinwoo Shin
NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments across various tasks show the effectiveness of RoMA compared with previous methods, obtaining state-of-the-art results; e.g., RoMA outperforms all baselines on 4 out of 6 tasks and achieves runner-up results on the remaining tasks. |
| Researcher Affiliation | Collaboration | Sihyun Yu (1), Sungsoo Ahn (2), Le Song (2,3), Jinwoo Shin (1); (1) Korea Advanced Institute of Science and Technology (KAIST), (2) Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), (3) BioMap |
| Pseudocode | Yes | Algorithm 1: Robust model adaptation (RoMA) |
| Open Source Code | No | The paper cites Design-Bench [53] with a GitHub link (https://github.com/brandontrabucco/design-bench) for the benchmark itself, but it does not provide an access link or an explicit release statement for the source code of the proposed method (RoMA). |
| Open Datasets | Yes | We verify the effectiveness of our framework on Design-Bench [53], an offline model-based optimization (MBO) benchmark consisting of 6 tasks across various domains. The objective of the green fluorescent protein (GFP) task is to find a protein with high fluorescence, proposed by Sarkisyan et al. [41]; the dataset consists of a total of 5,000 proteins with fluorescence values... In the Molecule task, one finds a substructure of a molecule... The dataset involves 4,216 data points... The Superconductor (Supercond.) task aims to find a superconducting material... the dataset is provided by Hamidieh [17] and consists of 21,263 data points in total. (A loading sketch is given after the table.) |
| Dataset Splits | No | The paper states 'we follow the same setup for all experiments in prior works for the evaluation [11, 52, 53]', implying predefined benchmark setups, but it does not explicitly report percentages, sample counts, or train/validation/test split definitions within the paper itself. |
| Hardware Specification | Yes | All the experiments are processed with 4 GPUs (NVIDIA RTX 2080 Ti) and 24 virtual CPU instances (Intel Xeon Silver 4214 CPU @ 2.20GHz), and it takes at most 4 hours to run each task over 16 runs. |
| Software Dependencies | No | The paper mentions 'Adam optimizer [23]' and 'multi-layer perceptron (MLP)' but does not provide specific version numbers for these or other software dependencies, such as Python or PyTorch versions. |
| Experiment Setup | Yes | We use a 3-layer multi-layer perceptron (MLP) in all experiments, with hidden width 64 and the softplus activation function. The Adam optimizer [23] with a learning rate of 0.001 is used to pre-train the proxy model on the dataset of each task. Gradients are clipped to a norm of 1.0, and we set the mini-batch size to 128. For the main experiments, we set the number of solution updates to be large enough, i.e., T = 300. For the regularization coefficient α, we set α = 1 across all tasks. We choose the largest maximum magnitude of a weight perturbation ε such that pre-training of the proxy model remains possible: specifically, ε = 0.0005 for the GFP, Molecule, and Superconductor tasks and ε = 0.005 for the other 3 tasks. (A configuration sketch is given after the table.) |
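As a minimal illustration of how the Design-Bench datasets referenced in the Open Datasets row are typically accessed, the sketch below loads one task and inspects the offline design/score arrays. This is not code from the paper; the task identifier string is an assumption, and exact names depend on the installed Design-Bench version.

```python
# Minimal sketch (not from the paper): loading a Design-Bench task.
# The task name below is a hypothetical identifier; available names depend
# on the installed design-bench version (pip install design-bench).
import design_bench

task = design_bench.make("Superconductor-RandomForest-v0")  # assumed identifier

x = task.x  # offline designs, e.g., material composition features
y = task.y  # corresponding scores, e.g., critical temperature

print(x.shape, y.shape)  # (num_points, num_features), (num_points, 1)
```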
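The Experiment Setup row specifies the proxy model and pre-training hyperparameters in enough detail for a small configuration sketch. The PyTorch code below is an illustrative reconstruction, not the authors' code: the input dimension is a placeholder, "3-layer" is interpreted as three linear layers, and the plain MSE loss is an assumption that omits RoMA's regularization term (coefficient α) and weight perturbations (magnitude ε).

```python
# Illustrative sketch (not the authors' code) of the reported proxy setup:
# 3-layer MLP, hidden width 64, softplus activations, Adam with lr 0.001,
# gradient-norm clipping at 1.0, mini-batch size 128.
import torch
import torch.nn as nn

class ProxyMLP(nn.Module):
    def __init__(self, input_dim: int, hidden: int = 64):
        super().__init__()
        # Three linear layers with softplus activations (one reading of "3-layer MLP").
        self.net = nn.Sequential(
            nn.Linear(input_dim, hidden), nn.Softplus(),
            nn.Linear(hidden, hidden), nn.Softplus(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = ProxyMLP(input_dim=86)  # placeholder input dimension; task-dependent
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def pretrain_step(x_batch: torch.Tensor, y_batch: torch.Tensor) -> float:
    """One pre-training step on a mini-batch of 128 (x, y) pairs.

    Plain MSE is an assumption; RoMA's actual objective also includes a
    smoothness regularization term (coefficient alpha in the paper).
    """
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x_batch), y_batch)
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    return loss.item()
```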