Implicitly Guided Design with PropEn: Match your Data to Follow the Gradient
Authors: Nataša Tagasovska, Vladimir Gligorijevic, Kyunghyun Cho, Andreas Loukas
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive evaluations in toy problems and scientific applications, such as therapeutic protein design and airfoil optimization, demonstrate PropEn's advantages over common baselines. Notably, the protein design results are validated with wet lab experiments, confirming the competitiveness and effectiveness of our approach. |
| Researcher Affiliation | Collaboration | Prescient/MLDD, Genentech Research and Early Development; Department of Computer Science, Center for Data Science, New York University |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/prescient-design/propen. |
| Open Datasets | No | The paper mentions datasets like NACA airfoils and therapeutic antibody proteins, but it does not provide concrete access information (link, DOI, specific repository, or citation with author/year for public access) for these datasets. It also mentions synthetic toy datasets, which are generated rather than external. |
| Dataset Splits | No | The paper mentions "We randomly select 0.1% as holdout dataset for seeds, and use the rest for training." and discusses "wet lab validation" as an experimental outcome. However, it does not explicitly define a separate validation dataset split with percentages or counts for model tuning during training. |
| Hardware Specification | No | The paper does not provide specific details on the hardware used, such as GPU/CPU models, memory, or cloud instance types. It only generally refers to running experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies (e.g., Python 3.x, PyTorch 1.x, specific solver versions). It mentions using "NeuralFoil [41]" but without a specific version. |
| Experiment Setup | Yes | Ablation studies: N ∈ {100, 200}, d ∈ {2, 10, 50, 100}; matching thresholds: Δx = Δy = 1; number of epochs: 500; batch size: 64 ... ablation studies: N ∈ {200, 500, 1000}; matching thresholds: Δx = Δy ∈ {0.3, 0.5, 0.7, 1}; number of epochs: 1000; batch size: 100 ... Hyper-parameter choice. For optimizing the parameters of the baselines in the toy and engineering experiments, we conducted a grid search over the learning rate ([1e-2, 1e-5]), weight decay ([1e-2, 1e-5]), number of epochs ([300, 1000, 5000]), batch size (32, 64, 128), and number of neurons per layer ([30, 50, 100]). Therapeutic proteins: batch size: 32, training epochs: 300. (See the two sketches after this table.) |
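
The matching thresholds Δx and Δy quoted above govern how PropEn's matched dataset is constructed. Below is a minimal sketch of that construction, assuming the pairing rule described in the paper (keep a pair when the two designs are close in design space and the second improves the property by a bounded amount); the function and variable names are illustrative, not taken from the authors' repository.

```python
import numpy as np

def match_dataset(X, y, dx=1.0, dy=1.0):
    """Build matched pairs (x_i, x_j) such that
    ||x_i - x_j||^2 <= dx and 0 < y_j - y_i <= dy.
    Illustrative sketch; threshold values mirror the ablations above."""
    pairs = []
    n = len(X)
    for i in range(n):
        for j in range(n):
            close = np.sum((X[i] - X[j]) ** 2) <= dx   # similar designs
            better = 0.0 < (y[j] - y[i]) <= dy          # bounded improvement
            if close and better:
                pairs.append((X[i], X[j]))  # model learns to map x_i -> x_j
    return pairs

# Toy usage: N = 200 samples in d = 10 dimensions, as in the ablations
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = -np.linalg.norm(X, axis=1)  # toy property: higher is better near the origin
matched = match_dataset(X, y, dx=1.0, dy=1.0)
print(f"{len(matched)} matched pairs")
```

Training a model on such pairs is what lets PropEn follow the property gradient implicitly, without an explicit guidance model.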
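For the baseline hyper-parameter search quoted in the last row, here is a minimal sketch of the stated grid, reading the bracketed pairs (e.g., [1e-2, 1e-5]) as the two endpoint values that were tried. `train_and_evaluate` is a hypothetical placeholder for the baseline training loop, not a function from the released code.

```python
import random
from itertools import product

def train_and_evaluate(**config):
    """Hypothetical stand-in for the baseline training loop; replace
    with a real train/validate routine returning a validation score."""
    return random.random()  # dummy score for illustration only

# Grid values as reported in the paper's hyper-parameter search
learning_rates = [1e-2, 1e-5]
weight_decays  = [1e-2, 1e-5]
epoch_counts   = [300, 1000, 5000]
batch_sizes    = [32, 64, 128]
hidden_widths  = [30, 50, 100]

best = None
for lr, wd, epochs, bs, width in product(
    learning_rates, weight_decays, epoch_counts, batch_sizes, hidden_widths
):
    score = train_and_evaluate(lr=lr, weight_decay=wd, epochs=epochs,
                               batch_size=bs, hidden_units=width)
    if best is None or score > best[0]:
        best = (score, dict(lr=lr, weight_decay=wd, epochs=epochs,
                            batch_size=bs, hidden_units=width))

print("best config:", best[1])
```

This enumerates all 108 combinations; the paper does not state whether the search was exhaustive or early-stopped, so treat the loop structure as an assumption.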