Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Improved motif-scaffolding with SE(3) flow matching
Authors: Jason Yim, Andrew Campbell, Emile Mathieu, Andrew Y. K. Foong, Michael Gastegger, Jose Jimenez-Luna, Sarah Lewis, Victor Garcia Satorras, Bastiaan S. Veeling, Frank Noe, Regina Barzilay, Tommi Jaakkola
TMLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On a benchmark of 24 biologically meaningful motifs, we show our method achieves 2.5 times more designable and unique motif-scaffolds compared to state-of-the-art. Code: https://github.com/microsoft/protein-frame-flow... In this section, we report the results of training Frame Flow for motif-scaffolding. Sec. 5.1 describes training, sampling, and metrics. Our main results on motif-scaffolding are reported in Sec. 5.2 on the benchmark introduced in RFdiffusion. Additional motif-scaffolding analysis is provided in App. G. |
| Researcher Affiliation | Collaboration | Jason Yim EMAIL Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology; Andrew Campbell EMAIL Department of Statistics University of Oxford; Emile Mathieu EMAIL Department of Engineering University of Cambridge; Andrew Y. K. Foong EMAIL Microsoft Research AI4Science |
| Pseudocode | Yes | Algorithm 1: Motif-scaffolding data augmentation. Require: protein backbone T; min and max motif percent γmin = 0.05, γmax = 0.5. 1: s ← Uniform{⌈Nγmin⌉, ..., ⌈Nγmax⌉} ▷ Sample maximum motif size. 2: m ← Uniform{1, ..., s} ▷ Sample maximum number of motifs. 3: T^M ← ∅. 4: for i ∈ {1, ..., m} do 5: j ← Uniform({1, ..., N} \ T^M) ▷ Sample location for each motif. 6: ℓ ← Uniform{1, ..., s − m + i − \|T^M\|} ▷ Sample length of each motif. 7: T^M ← T^M ∪ {T_j, ..., T_min(j+ℓ,N)} ▷ Append to existing motif. 8: end for 9: T^S ← {T_1, ..., T_N} \ T^M ▷ Assign rest of residues as the scaffold. 10: return T^M, T^S |
| Open Source Code | Yes | Code: https://github.com/microsoft/protein-frame-flow |
| Open Datasets | Yes | To provide a controlled comparison, we train unconditional and conditional versions of Frame Flow on a dataset of monomers from the Protein Data Bank (PDB) (Berman et al., 2000). Our results provide a clear comparison of the modeling choices made when performing motif-scaffolding with Frame Flow... Both models are trained using the filtered PDB monomer dataset introduced in Frame Diff. |
| Dataset Splits | No | The flow matching loss from Eq. (7) involves sampling from p(T^M) and p1(T^S_1 \| T^M), which we do not have access to, but can be approximated using unlabeled structures from the PDB. Our pseudo-labeled motifs and scaffolds are generated as follows (also depicted in Fig. 2). First, a protein structure is sampled from the PDB dataset. Second, a random number of residues are selected to be the starting locations of each motif. Third, additional residues are appended onto each motif thereby extending their lengths. The length of each motif is randomly sampled such that the total number of motif residues is between γmin and γmax percent of all the residues. We use γmin = 0.05 and γmax = 0.5 to ensure at least a few residues are used as the motif but not more than half the protein. Finally, the remaining residues are treated as the scaffold and corrupted. The motif and scaffold are treated as samples from p(T^M) and p1(T^S_1 \| T^M) respectively. Importantly, each protein will be re-used on subsequent epochs where new motifs and scaffolds will be sampled. |
| Hardware Specification | Yes | We train each model for 6 days on 2 A6000 NVIDIA GPUs with dynamic batch sizes depending on the length of the proteins in each batch, a technique from Frame Diff... In the last column we give the number of seconds to sample a length 100 protein on an A6000 NVIDIA GPU with each method. |
| Software Dependencies | No | We use the ADAM optimizer (Kingma & Ba, 2014) with learning rate 0.0001... Following RFdiffusion, we use Protein MPNN at temperature 0.1 to generate 8 sequences for each backbone... The predicted backbone of each sequence is obtained with the fourth model in the five model ensemble used in AF2 with 0 recycling, no relaxation, and no multiple sequence alignment (MSA). |
| Experiment Setup | Yes | We train each model for 6 days on 2 A6000 NVIDIA GPUs with dynamic batch sizes depending on the length of the proteins in each batch, a technique from Frame Diff. We use the ADAM optimizer (Kingma & Ba, 2014) with learning rate 0.0001... We use the Euler-Maruyama integrator with 500 timesteps for all sampling... For each motif, the method must sample novel scaffolds with different lengths and different motif locations along the sequence... Following RFdiffusion, we use Protein MPNN at temperature 0.1 to generate 8 sequences for each backbone in motif-scaffolding... The predicted backbone of each sequence is obtained with the fourth model in the five model ensemble used in AF2 with 0 recycling, no relaxation, and no multiple sequence alignment (MSA). |
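The data-augmentation procedure quoted under Pseudocode and Dataset Splits can be sketched in plain Python. This is a minimal illustration of Algorithm 1, not the authors' implementation: residues are 0-indexed, `sample_motif_scaffold` and its bound-clamping details are our own assumptions, and only the index partition (not the frame corruption) is modeled.

```python
import random

def sample_motif_scaffold(N, gamma_min=0.05, gamma_max=0.5, rng=random):
    """Partition residue indices {0, ..., N-1} into pseudo-labeled motif
    and scaffold sets, following the sampling scheme of Algorithm 1."""
    lo = max(1, int(N * gamma_min))
    hi = max(lo, int(N * gamma_max))
    s = rng.randint(lo, hi)       # maximum total motif size
    m = rng.randint(1, s)         # maximum number of motif segments
    motif = set()
    for i in range(1, m + 1):
        # Sample a start location not already assigned to the motif.
        candidates = [j for j in range(N) if j not in motif]
        if not candidates:
            break
        j = rng.choice(candidates)
        # Sample a segment length so the running total stays within budget s.
        max_len = s - m + i - len(motif)
        if max_len < 1:
            break
        ell = rng.randint(1, max_len)
        motif.update(range(j, min(j + ell, N)))
    scaffold = set(range(N)) - motif   # remaining residues become the scaffold
    return motif, scaffold
```

Because each epoch draws fresh motif/scaffold partitions, the same PDB structure yields different training examples on every pass, which is the re-use behavior described in the Dataset Splits quote.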