Implicitly Guided Design with PropEn: Match your Data to Follow the Gradient

Authors: Nataša Tagasovska, Vladimir Gligorijevic, Kyunghyun Cho, Andreas Loukas

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive evaluations in toy problems and scientific applications, such as therapeutic protein design and airfoil optimization, demonstrate PropEn's advantages over common baselines. Notably, the protein design results are validated with wet lab experiments, confirming the competitiveness and effectiveness of our approach.
Researcher Affiliation | Collaboration | Prescient/MLDD, Genentech Research and Early Development; Department of Computer Science and Center for Data Science, New York University
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available at https://github.com/prescient-design/propen.
Open Datasets | No | The paper mentions datasets such as NACA airfoils and therapeutic antibody proteins, but it does not provide concrete access information (link, DOI, specific repository, or citation with author/year for public access) for them. It also mentions synthetic toy datasets, which are generated rather than sourced externally.
Dataset Splits | No | The paper states "We randomly select 0.1% as holdout dataset for seeds, and use the rest for training." and discusses "wet lab validation" as an experimental outcome, but it does not explicitly define a separate validation split (with percentages or counts) used for model tuning during training.
Hardware Specification | No | The paper does not provide specifics on the hardware used, such as GPU/CPU models, memory, or cloud instance types; it only refers to running experiments in general terms.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies (e.g., Python 3.x, PyTorch 1.x, or specific solver versions). It mentions using "NeuralFoil [41]", but without a specific version.
Experiment Setup | Yes | Ablation studies: N ∈ {100, 200}, d ∈ {2, 10, 50, 100}; matching thresholds: Δx = Δy = 1; number of epochs: 500; batch size: 64 ... Ablation studies: N ∈ {200, 500, 1000}; matching thresholds: Δx = Δy ∈ {0.3, 0.5, 0.7, 1}; number of epochs: 1000; batch size: 100 ... Hyper-parameter choice: for optimizing the parameters of the baselines in the toy and engineering experiments, a grid search was conducted over the learning rate ([1e-2, 1e-5]), weight decay ([1e-2, 1e-5]), number of epochs ([300, 1000, 5000]), batch size (32, 64, 128), and number of neurons per layer ([30, 50, 100]). Therapeutic proteins: batch size 32, training epochs 300.
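For readers trying to reproduce the setup, the "matching thresholds" quoted above refer to PropEn's matched-dataset construction: each training example is paired with nearby examples whose property value is strictly better, and a model is trained to map each example to its matched, improved partner. The sketch below is a minimal illustration under that reading; build_matched_dataset, dx, and dy are hypothetical names, not identifiers from the released code at https://github.com/prescient-design/propen.

```python
import numpy as np

def build_matched_dataset(X, y, dx=1.0, dy=1.0):
    """Minimal sketch of PropEn-style matching (hypothetical helper).

    Keeps a pair (x_i, x_j) when the designs are close,
    ||x_i - x_j|| <= dx, and the property improves by at most dy,
    0 < y_j - y_i <= dy. A model trained to map x_i -> x_j then
    takes one implicit step "up" the property gradient.
    """
    src, tgt = [], []
    for i in range(len(X)):
        for j in range(len(X)):
            if i == j:
                continue
            close = np.linalg.norm(X[i] - X[j]) <= dx
            better = 0.0 < (y[j] - y[i]) <= dy
            if close and better:
                src.append(X[i])
                tgt.append(X[j])
    return np.asarray(src), np.asarray(tgt)
```

At design time, the paper applies the trained model iteratively to a seed design, so each forward pass acts as one step of implicitly guided optimization; the thresholds trade off the size of the matched dataset against how local each step is.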
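The baseline hyper-parameter search quoted in the last row is a plain grid search. A short sketch follows, assuming the bracketed pairs denote candidate values (they could equally be read as ranges); GRID mirrors the values from the row above, while grid_search and train_and_eval are illustrative stand-ins for the actual training loop, not code from the paper.

```python
from itertools import product

# Candidate values taken from the Experiment Setup row; the rest is illustrative.
GRID = {
    "lr": [1e-2, 1e-5],
    "weight_decay": [1e-2, 1e-5],
    "epochs": [300, 1000, 5000],
    "batch_size": [32, 64, 128],
    "hidden_units": [30, 50, 100],
}

def grid_search(train_and_eval):
    """Return the configuration with the best score over the full grid."""
    best_cfg, best_score = None, float("-inf")
    for values in product(*GRID.values()):
        cfg = dict(zip(GRID.keys(), values))
        score = train_and_eval(**cfg)  # caller-supplied: trains a baseline, returns a score
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```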