Robust Model-Based Optimization for Challenging Fitness Landscapes
Authors: Saba Ghaffari, Ehsan Saleh, Alex Schwing, Yu-Xiong Wang, Martin D. Burke, Saurabh Sinha
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our comprehensive benchmark on real and semi-synthetic protein datasets as well as solution design for physics-informed neural networks, showcases the generality of our approach in discrete and continuous design spaces. |
| Researcher Affiliation | Academia | ¹University of Illinois Urbana-Champaign, ²Georgia Institute of Technology {sabag2, ehsans2, aschwing, yxw, mdburke}@illinois.edu, saurabh.sinha@bme.gatech.edu |
| Pseudocode | No | The paper describes the proposed method using mathematical equations and textual explanations, but it does not include a dedicated pseudocode or algorithm block. |
| Open Source Code | Yes | Our implementation is available at https://github.com/sabagh1994/PGVAE. |
| Open Datasets | Yes | Our comprehensive benchmark on real and semi-synthetic protein datasets as well as solution design for physics-informed neural networks, showcases the generality of our approach in discrete and continuous design spaces. ... The dataset contains synthetic property values that are monotonically decreasing with the digit class. ... We chose the popular AAV (Adeno-associated virus) dataset (Bryant et al., 2021) ... The dataset was obtained from (Dallago et al., 2021). ... two popular protein datasets GB1 (Wu et al., 2016) and PhoQ (Podgornaia & Laub, 2015). |
| Dataset Splits | No | The paper describes how "train sets were generated" for various datasets and how imbalance and separation were varied, but it does not specify explicit training/validation/test dataset splits in percentages or by sample count for reproducibility of the partitioning of the original data. The evaluation metric Ymax focuses on the relative improvement of found properties, implying an iterative search process rather than a standard validation set. |
| Hardware Specification | No | The paper mentions that the work 'utilized resources supported by 1) the National Science Foundation's Major Research Instrumentation program, grant No. 1725729 (Kindratenko et al., 2020), and 2) the Delta advanced computing and data resource which is supported by the National Science Foundation (award OAC 2005572) and the State of Illinois.' However, these descriptions do not include specific hardware details such as GPU/CPU models or memory specifications. |
| Software Dependencies | No | The paper mentions using specific VAE architectures (Table A2) and relying on implementations by Brookes et al. (2019) for baseline methods. However, it does not provide specific software dependencies with version numbers, such as programming languages or deep learning frameworks (e.g., 'Python 3.8', 'PyTorch 1.9'). |
| Experiment Setup | Yes | We performed 10 rounds of MBO on the GMM benchmark and 20 rounds of MBO on the rest of the benchmark datasets. In all experiments temperature (τ) was set to five for PPGVAE with no further tuning. We used the implementation and hyper-parameters provided by (Brookes et al., 2019) for CbAS, Bombarelli, RWR, and CEM-PI methods. The architecture of VAE was the same for all methods (Table A2). |
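The Experiment Setup row describes an iterative model-based optimization (MBO) loop evaluated by Ymax, the best property value found so far. A minimal sketch of such a round-based loop is below; the function names (`oracle`, `run_mbo`), the "keep the top half, then perturb" proposal step, and all parameter defaults are illustrative assumptions, not the paper's PPGVAE method or its released code.

```python
def run_mbo(oracle, initial_designs, n_rounds=20, batch_size=10):
    """Hypothetical round-based MBO loop (illustrative, not PPGVAE).

    Each round re-ranks known designs, keeps an elite subset (a stand-in
    for refitting a weighted generative model), proposes perturbed
    candidates, and records Ymax, the running best property value.
    """
    designs = list(initial_designs)
    scores = [oracle(x) for x in designs]
    y_max_history = [max(scores)]  # Ymax after round 0 (initial data)
    for _ in range(n_rounds):
        # Rank all evaluated designs by property value, best first.
        ranked = sorted(zip(scores, designs), reverse=True)
        elite = [x for _, x in ranked[: max(1, len(ranked) // 2)]]
        # Illustrative proposal step: small perturbation of elite designs.
        candidates = [x + 0.1 for x in elite][:batch_size]
        designs.extend(candidates)
        scores.extend(oracle(x) for x in candidates)
        y_max_history.append(max(scores))
    return y_max_history
```

With a toy continuous oracle such as `lambda x: -abs(x - 3.0)`, the returned Ymax trajectory is non-decreasing across rounds, mirroring how the report says Ymax measures relative improvement over an iterative search rather than performance on a held-out validation split.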