Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Flexible MOF Generation with Torsion-Aware Flow Matching
Authors: Nayoung Kim, Seongsu Kim, Sungsoo Ahn
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate improved reconstruction accuracy, the generation of valid, novel, and unique MOFs, and the ability to create novel building blocks. |
| Researcher Affiliation | Academia | Nayoung Kim, Seongsu Kim, Sungsoo Ahn; Korea Advanced Institute of Science and Technology (KAIST); EMAIL |
| Pseudocode | Yes | Algorithm 1: Canonicalization of rotation targets; Algorithm 2: Training algorithm; Algorithm 3: Inference algorithm; Algorithm 4: Interaction module (Transformer encoder); Algorithm 5: Block attention pooling module (Block Attention Pool); Algorithm 6: Rotation head; Algorithm 7: Translation head; Algorithm 8: Lattice head; Algorithm 9: Torsion head |
| Open Source Code | Yes | Our code is available at https://github.com/nayoung10/MOFFlow-2. |
| Open Datasets | Yes | Starting with the dataset from Boyd et al. [36], we apply metal-oxo decomposition algorithm from MOFid [37] and discard any structures containing more than 20 building blocks [11]. |
| Dataset Splits | Yes | The resulting dataset is split into an 8:1:1 ratio for train/valid/test sets in the structure prediction task, and into a 9.5:0.5 train/valid split for MOF generation [11, 10]. |
| Hardware Specification | Yes | We train our model on 8 80GB A100 GPUs for 200 epochs (about 4 days). |
| Software Dependencies | No | The paper mentions several software tools and libraries like RDKit [14], Open Babel [25], pymatgen [40], Zeo++ [41], MOFid [37], MOFChecker [38], CrySPY [39], CHGNet [50], and x-transformers [43]. However, specific version numbers for these software dependencies are not provided in the text. |
| Experiment Setup | Yes | We use AdamW optimizer [51] with a learning rate of 1e-5, betas (0.9, 0.98), and no weight decay. Inference is performed in 50 steps. Table 4 (hyperparameters for training the building block generator): max tokens 8000; number of layers 6; hidden dimension 1024; number of heads 8; rotary positional embedding True; flash attention True; scale normalization True; optimizer AdamW; learning rate 3e-4; betas (0.9, 0.999); weight decay 0.0; epochs 20. |
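
The Dataset Splits row reports an 8:1:1 train/valid/test split for structure prediction and a 9.5:0.5 train/valid split for MOF generation. The following is a minimal sketch of how such ratio-based splits could be produced. It is not the authors' code: the function name `split_indices`, the seed, the random shuffling, and the dataset size of 1000 are all assumptions made for illustration, and the excerpt above does not state how the structures were actually assigned to each split.

```python
import random


def split_indices(n_items, ratios, seed=0):
    """Shuffle range(n_items) and cut it into consecutive parts sized by `ratios`.

    Hypothetical helper for illustration only; the seed and the use of a
    random shuffle are assumptions, not details taken from the paper.
    """
    if n_items < 0 or not ratios or any(r <= 0 for r in ratios):
        raise ValueError("n_items must be >= 0 and all ratios must be positive")

    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # local RNG, so global state is untouched

    total = sum(ratios)
    parts, start, cumulative = [], 0, 0.0
    for ratio in ratios[:-1]:
        cumulative += ratio
        end = round(n_items * cumulative / total)  # boundary from cumulative share
        parts.append(indices[start:end])
        start = end
    parts.append(indices[start:])  # last part takes whatever remains
    return parts


# Structure prediction: 8:1:1 train/valid/test (dataset size is a placeholder).
train, valid, test = split_indices(1000, (8, 1, 1))

# MOF generation: 9.5:0.5 train/valid.
gen_train, gen_valid = split_indices(1000, (9.5, 0.5))

print(len(train), len(valid), len(test))
print(len(gen_train), len(gen_valid))
```

Computing each boundary from the cumulative ratio, rather than rounding each part separately, keeps the parts disjoint and guarantees that together they cover every index exactly once.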