Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

ET-Flow: Equivariant Flow-Matching for Molecular Conformer Generation

Authors: Majdi Hassan, Nikhil Shenoy, Jungyoon Lee, Hannes Stärk, Stephan Thaler, Dominique Beaini

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically evaluate ET-Flow by comparing the generated and ground-truth conformers in terms of distance-based RMSD (Section 4.2) and chemical property based metrics (Section 4.4). We present the general experimental setups in Section 4.1.
Researcher Affiliation | Collaboration | (1) Mila & Université de Montréal; (2) University of British-Columbia; (3) Massachusetts Institute of Technology; (4) Valence Labs
Pseudocode | Yes | Algorithm 1: Training procedure, Algorithm 2: Inference procedure, Algorithm 3: Stochastic Sampler
Open Source Code | Yes | Code is available at https://github.com/shenoynikhil/ETFlow
Open Datasets | Yes | We conduct our experiments on the GEOM dataset (Axelrod and Gomez-Bombarelli, 2022), which offers curated conformer ensembles produced through meta-dynamics in CREST (Pracht et al., 2024).
Dataset Splits | Yes | We use a train/validation/test (243473/30433/1000) split as provided in (Ganea et al., 2021)
Hardware Specification | Yes | For GEOM-DRUGS, we train ET-Flow for a fixed 250 epochs with a batch size of 64 and 5000 training batches per epoch per GPU on 8 A100 GPUs. For GEOM-QM9, we train ET-Flow for 200 epochs with a batch size of 128, and use all of the training dataset per epoch on 4 A100 GPUs.
Software Dependencies | No | No specific software dependencies with version numbers were listed in the paper.
Experiment Setup | Yes | For GEOM-DRUGS, we train ET-Flow for a fixed 250 epochs with a batch size of 64 and 5000 training batches per epoch per GPU on 8 A100 GPUs. For the learning rate, we use the Adam Optimizer with a cosine annealing learning rate which goes from a maximum of 10^-3 to a minimum of 10^-7 over 250 epochs with a weight decay of 10^-10. For GEOM-QM9, we train ET-Flow for 200 epochs with a batch size of 128, and use all of the training dataset per epoch on 4 A100 GPUs. We use the cosine annealing learning rate schedule with a maximum of 8×10^-4 to a minimum of 10^-7 over 100 epochs, post which the maximum is reduced by a factor of 0.05. We select checkpoints based on the lowest validation error.
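
The GEOM-DRUGS learning-rate schedule quoted under Experiment Setup can be sketched as follows. This is a minimal illustration only, assuming the standard cosine annealing formula, lr(t) = lr_min + 0.5 * (lr_max - lr_min) * (1 + cos(pi * t / T)), stepped once per epoch. The quoted text does not state the exact formula or the step granularity, and the function name cosine_annealing_lr is hypothetical. The numeric values (maximum 10^-3, minimum 10^-7, 250 epochs) are taken from the quoted text.

```python
import math


def cosine_annealing_lr(epoch, total_epochs=250, lr_max=1e-3, lr_min=1e-7):
    """Learning rate at a given epoch under standard cosine annealing.

    Assumption: the schedule is stepped once per epoch and follows the
    usual half-cosine decay from lr_max down to lr_min.
    """
    # Clamp the epoch into [0, total_epochs] so out-of-range inputs stay bounded.
    progress = min(max(epoch, 0), total_epochs) / total_epochs
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


# One learning-rate value per epoch boundary for the GEOM-DRUGS run (0..250).
schedule = [cosine_annealing_lr(epoch) for epoch in range(251)]
```

Under these assumptions the rate starts at the stated maximum, decays monotonically, and reaches the stated minimum at epoch 250. The GEOM-QM9 schedule is not sketched here because the quoted description of how the maximum is reduced after 100 epochs admits more than one reading.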