Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Unified all-atom molecule generation with neural fields

Authors: Matthieu Kirchmeyer, Pedro O. Pinheiro, Emma Willett, Karolis Martinkus, Joseph Kleinhenz, Emily Makowski, Andrew Watkins, Vladimir Gligorijevic, Richard Bonneau, Saeed Saremi

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	FuncBind achieves competitive in silico performance in generating small molecules, macrocyclic peptides, and antibody complementarity-determining region loops, conditioned on target structures. FuncBind also generated in vitro novel antibody binders via de novo redesign of the complementarity-determining region H3 loop of two chosen co-crystal structures.
Researcher Affiliation	Industry	Matthieu Kirchmeyer¹, Pedro O. Pinheiro¹, Emma Willett¹, Karolis Martinkus¹, Joseph Kleinhenz¹, Emily K. Makowski², Andrew M. Watkins¹, Vladimir Gligorijevic¹, Richard Bonneau¹, Saeed Saremi¹; ¹Prescient Design, Genentech; ²Antibody Engineering, Genentech
Pseudocode	No	The paper describes the methodology in detail in Section 3, including equations and descriptions of the model architecture and training process, but does not present a distinct pseudocode block or algorithm figure.
Open Source Code	Yes	The code is available at https://github.com/prescient-design/funcbind. The checkpoints at https://huggingface.co/mkirchmeyer/funcbind/.
Open Datasets	Yes	The code is available at https://github.com/prescient-design/funcbind. The checkpoints at https://huggingface.co/mkirchmeyer/funcbind/. ... we introduce a new dataset and benchmark for structure-conditioned macrocyclic peptide generation. ... We train FuncBind on structures from three drug modalities: small molecules, macrocyclic peptides (MCPs), and antibody complementarity-determining region (CDR) loops in complex with a target protein. ... We also create a new dataset, containing 190,000 synthetic MCP/protein complexes derived from 641 RCSB PDB structures [19], particularly relevant for this work, as cyclic peptides exhibit chemistry and function that span small and large molecule modalities. Available at https://huggingface.co/datasets/Willete3/mcpp_dataset. ... Data. We consider the standard CrossDocked2020 [69] benchmark, with the pre-processing and splitting strategy of [70]. ... Data. We consider the SAbDab dataset [78], which comprises antibody-protein co-crystal structures and the data splits from DiffAb [39].
Dataset Splits	Yes	Data. We consider the standard CrossDocked2020 [69] benchmark, with the pre-processing and splitting strategy of [70]. Pockets are clustered at a sequence identity of < 30% using MMseqs2 and are split into 99,900 train ligand pockets pairs, 100 validation pairs and 100 test pairs. ... Data. We consider the SAbDab dataset [78], which comprises antibody-protein co-crystal structures and the data splits from DiffAb [39]. This non-i.i.d. split ensures that antibodies similar to those of the test set (i.e., more than 50% CDR H3 identity) are removed from the training set. The test split includes 19 targets, for which we redesign each CDR loop individually. ... We split the dataset into train, test and validation subsets using a clustering approach detailed in Section E that aims at creating a non-i.i.d. test set consisting of 85 protein pockets.
Hardware Specification	Yes	Batch size is 32 over 1 B200 GPU; we sample 15000 coordinates per batch.
Software Dependencies	Yes	We measure affinity with three metrics using AutoDock Vina [71]: Vina Score is the docking score of the generated molecule... All methods but DecompDiff and MolCraft rely on Open Babel [68] to assign bonds from generated atom coordinates. ... We also compute the drug-likeness, QED [72], and synthesizability, SA [73], score of the generated molecules with RDKit [74]. Reference [71] states "AutoDock Vina 1.2.0: New docking methods, expanded force field, and Python bindings."
Experiment Setup	Yes	The auto-encoder is trained with Adam Optimizer [86] with learning rate $10^{-2}$, $\beta_1 = 0.9$, $\beta_2 = 0.999$. We apply a KL regularization weight of $\lambda = 10^{-5}$. Batch size is 32 over 1 B200 GPU; we sample 15000 coordinates per batch. ... The parameters are optimized with Adam optimizer [86] with learning rate $\alpha_{\text{ref}} = 10^{-2}$, $\beta_1 = 0.9$, $\beta_2 = 0.95$ using an aggregated batch size of 768 over 8 B200 GPUs. We perform early stopping on the validation loss. We use the power function exponential moving average from EDM2 [62] with an EMA length of 5%. Moreover, we adopt the inverse square root decay schedule of [86], also used in [62], which sets $\alpha(t) = \alpha_{\text{ref}} / \sqrt{\max(t/t_{\text{ref}}, 1)}$, where we set $t_{\text{ref}} = 20040$. Finally, the networks are trained by randomly dropping the conditioning information 10% of the time.
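
The inverse square root decay schedule quoted in the Experiment Setup row can be illustrated with a minimal sketch. It assumes the schedule has the form alpha(t) = alpha_ref / sqrt(max(t / t_ref, 1)) and that the reference learning rate is 10^-2 (the exponent sign appears to have been lost in extraction, so this value is an assumption). The function name `inverse_sqrt_lr` and the constant names are hypothetical and are not taken from the authors' code.

```python
import math

# Values quoted in the Experiment Setup row; the negative exponent on the
# learning rate is an assumption, since the sign was lost in extraction.
ALPHA_REF = 1e-2
T_REF = 20040


def inverse_sqrt_lr(step: int, alpha_ref: float = ALPHA_REF, t_ref: float = T_REF) -> float:
    """Return alpha_ref / sqrt(max(step / t_ref, 1)).

    The rate stays at alpha_ref until step reaches t_ref, then decays
    proportionally to 1 / sqrt(step).
    """
    return alpha_ref / math.sqrt(max(step / t_ref, 1.0))


if __name__ == "__main__":
    # Print the rate before, at, and after the reference step.
    for step in (0, T_REF, 4 * T_REF):
        print(step, inverse_sqrt_lr(step))
```

Under these assumptions the rate is constant for the first t_ref steps and is halved each time the step count quadruples beyond that point.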