Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Flexible MOF Generation with Torsion-Aware Flow Matching

Authors: Nayoung Kim, Seongsu Kim, Sungsoo Ahn

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments demonstrate improved reconstruction accuracy, the generation of valid, novel, and unique MOFs, and the ability to create novel building blocks.
Researcher Affiliation | Academia | Nayoung Kim, Seongsu Kim, Sungsoo Ahn; Korea Advanced Institute of Science and Technology (KAIST)
Pseudocode | Yes | Algorithm 1: Canonicalization of rotation targets; Algorithm 2: Training algorithm; Algorithm 3: Inference algorithm; Algorithm 4: Interaction module (Transformer encoder); Algorithm 5: Block attention pooling module (Block Attention Pool); Algorithm 6: Rotation head; Algorithm 7: Translation head; Algorithm 8: Lattice head; Algorithm 9: Torsion head
Open Source Code | Yes | Our code is available at https://github.com/nayoung10/MOFFlow-2.
Open Datasets | Yes | Starting with the dataset from Boyd et al. [36], we apply metal-oxo decomposition algorithm from MOFid [37] and discard any structures containing more than 20 building blocks [11].
Dataset Splits | Yes | The resulting dataset is split into an 8:1:1 ratio for train/valid/test sets in the structure prediction task, and into a 9.5:0.5 train/valid split for MOF generation [11, 10].
Hardware Specification | Yes | We train our model on 8 80GB A100 GPUs for 200 epochs (about 4 days).
Software Dependencies | No | The paper mentions several software tools and libraries like RDKit [14], Open Babel [25], pymatgen [40], Zeo++ [41], MOFid [37], MOFChecker [38], CrySPY [39], CHGNet [50], and x-transformers [43]. However, specific version numbers for these software dependencies are not provided in the text.
Experiment Setup | Yes | We use AdamW optimizer [51] with a learning rate of 1e-5, betas (0.9, 0.98), and no weight decay. Inference is performed in 50 steps. Table 4 (hyperparameters for training the building block generator): max tokens 8000; number of layers 6; hidden dimension 1024; number of heads 8; rotary positional embedding True; flash attention True; scale normalization True; optimizer AdamW; learning rate 3e-4; betas (0.9, 0.999); weight decay 0.0; epochs 20.
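
The Open Datasets and Dataset Splits rows describe two steps: discarding structures with more than 20 building blocks, then splitting 8:1:1 (structure prediction) or 9.5:0.5 (MOF generation). The following is a minimal sketch of those two steps on toy data, not the authors' implementation. The function names, the `"blocks"` field, the shuffling, and the seed are all assumptions made for illustration; the quoted text does not say how the split was drawn.

```python
import random


def filter_by_block_count(structures, max_blocks=20):
    """Keep structures that have at most `max_blocks` building blocks.

    Assumes each structure is a dict with a "blocks" list (hypothetical layout).
    """
    return [s for s in structures if len(s["blocks"]) <= max_blocks]


def split_by_ratio(items, ratios, seed=0):
    """Shuffle `items` and cut them into consecutive parts proportional to `ratios`.

    The last part absorbs any rounding remainder, so no item is dropped.
    """
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded shuffle: an assumption, for repeatability
    total = sum(ratios)
    parts, start, cumulative = [], 0, 0.0
    for i, ratio in enumerate(ratios):
        cumulative += ratio
        is_last = i == len(ratios) - 1
        end = len(items) if is_last else round(len(items) * cumulative / total)
        parts.append(items[start:end])
        start = end
    return parts


# Toy data: 1000 fake structures with 1 to 25 building blocks each.
toy = [{"id": i, "blocks": [0] * (i % 25 + 1)} for i in range(1000)]
kept = filter_by_block_count(toy)

train, valid, test = split_by_ratio(kept, (8, 1, 1))      # structure prediction
gen_train, gen_valid = split_by_ratio(kept, (9.5, 0.5))   # MOF generation
print(len(kept), len(train), len(valid), len(test), len(gen_train), len(gen_valid))
```

Letting the last part take the remainder keeps the parts disjoint and exhaustive even when the dataset size is not divisible by the ratio total.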
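
The Software Dependencies row notes that the paper names its libraries without version numbers. One way to close that gap when re-running the code is to record the versions present in the local environment. The sketch below uses only the standard library's `importlib.metadata`. The distribution names in `PACKAGES` are assumed PyPI names for some of the listed tools; tools such as Zeo++ or Open Babel may be installed outside Python packaging and would need a separate check.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed PyPI distribution names for some of the libraries named in the row.
PACKAGES = ["rdkit", "pymatgen", "chgnet", "x-transformers"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}=={ver}" if ver else f"{name}: not installed")
```

Saving this output next to experiment logs gives a later reader the version information that the paper text itself does not provide.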
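
The Experiment Setup row quotes two distinct sets of optimizer values: one in the running text (learning rate 1e-5, betas (0.9, 0.98)) and one in Table 4 for the building block generator (learning rate 3e-4, betas (0.9, 0.999)). As a hedged illustration of keeping them apart, the sketch below records both as plain Python dataclasses. All class and field names are hypothetical, and the row does not state which model the first set applies to, so it is labeled generically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OptimizerConfig:
    name: str
    lr: float
    betas: tuple
    weight_decay: float

    def kwargs(self):
        """Keyword arguments in the shape an AdamW constructor such as
        torch.optim.AdamW(params, lr=..., betas=..., weight_decay=...) accepts.
        Using PyTorch here is an assumption, not stated in the row."""
        return {"lr": self.lr, "betas": self.betas, "weight_decay": self.weight_decay}


# Values quoted in the running text of the row (model not specified there).
TEXT_OPTIMIZER = OptimizerConfig("AdamW", 1e-5, (0.9, 0.98), 0.0)
INFERENCE_STEPS = 50

# Values quoted from Table 4 (building block generator).
BLOCK_GENERATOR_OPTIMIZER = OptimizerConfig("AdamW", 3e-4, (0.9, 0.999), 0.0)


@dataclass(frozen=True)
class BlockGeneratorConfig:
    max_tokens: int = 8000
    num_layers: int = 6
    hidden_dim: int = 1024
    num_heads: int = 8
    rotary_pos_emb: bool = True
    flash_attention: bool = True
    scale_norm: bool = True
    epochs: int = 20

    @property
    def head_dim(self):
        """Per-head width, assuming the hidden dimension is split evenly across heads."""
        assert self.hidden_dim % self.num_heads == 0
        return self.hidden_dim // self.num_heads


cfg = BlockGeneratorConfig()
print(cfg.head_dim, BLOCK_GENERATOR_OPTIMIZER.kwargs(), TEXT_OPTIMIZER.kwargs())
```

Freezing the dataclasses prevents one configuration from being silently edited into the other during a run.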