ET-Flow: Equivariant Flow-Matching for Molecular Conformer Generation
Authors: Majdi Hassan, Nikhil Shenoy, Jungyoon Lee, Hannes Stärk, Stephan Thaler, Dominique Beaini
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically evaluate ET-Flow by comparing the generated and ground-truth conformers in terms of distance-based RMSD (Section 4.2) and chemical-property-based metrics (Section 4.4). We present the general experimental setup in Section 4.1. |
| Researcher Affiliation | Collaboration | (1) Mila & Université de Montréal, (2) University of British Columbia, (3) Massachusetts Institute of Technology, (4) Valence Labs |
| Pseudocode | Yes | Algorithm 1: Training procedure, Algorithm 2: Inference procedure, Algorithm 3: Stochastic Sampler |
| Open Source Code | Yes | Code is available at https://github.com/shenoynikhil/ETFlow. |
| Open Datasets | Yes | We conduct our experiments on the GEOM dataset (Axelrod and Gomez-Bombarelli, 2022), which offers curated conformer ensembles produced through meta-dynamics in CREST (Pracht et al., 2024). |
| Dataset Splits | Yes | We use a train/validation/test (243,473/30,433/1,000) split as provided in Ganea et al. (2021). |
| Hardware Specification | Yes | For GEOM-DRUGS, we train ET-Flow for a fixed 250 epochs with a batch size of 64 and 5000 training batches per epoch per GPU on 8 A100 GPUs. For GEOM-QM9, we train ET-Flow for 200 epochs with a batch size of 128, and use all of the training dataset per epoch on 4 A100 GPUs. |
| Software Dependencies | No | No specific software dependencies with version numbers were listed in the paper. |
| Experiment Setup | Yes | For GEOM-DRUGS, we train ET-Flow for a fixed 250 epochs with a batch size of 64 and 5000 training batches per epoch per GPU on 8 A100 GPUs. For the learning rate, we use the Adam optimizer with a cosine annealing schedule that goes from a maximum of 10⁻³ to a minimum of 10⁻⁷ over 250 epochs, with a weight decay of 10⁻¹⁰. For GEOM-QM9, we train ET-Flow for 200 epochs with a batch size of 128, and use all of the training dataset per epoch on 4 A100 GPUs. We use a cosine annealing learning rate schedule from a maximum of 8×10⁻⁴ to a minimum of 10⁻⁷ over 100 epochs, after which the maximum is reduced by a factor of 0.05. We select checkpoints based on the lowest validation error. (See the optimizer/scheduler sketch below the table.) |
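The GEOM-DRUGS setup quoted above maps directly onto standard PyTorch optimizer and scheduler calls. The following is a minimal sketch under that assumption: `Adam` and `CosineAnnealingLR` are real PyTorch APIs, but the tiny linear model and MSE loss are hypothetical stand-ins for ET-Flow's network and flow-matching objective, which this table does not reproduce.

```python
# Minimal sketch of the GEOM-DRUGS optimization setup, assuming PyTorch.
# Only the hyperparameters (lr, weight decay, epochs, eta_min) come from the
# paper's description; the model and loss below are hypothetical stand-ins.
import torch
from torch import nn
from torch.optim import Adam
from torch.optim.lr_scheduler import CosineAnnealingLR

EPOCHS = 250       # fixed 250 epochs for GEOM-DRUGS
BATCH_SIZE = 64    # reported batch size

model = nn.Linear(16, 3)  # stand-in for the ET-Flow network

# Adam with the reported weight decay of 1e-10.
optimizer = Adam(model.parameters(), lr=1e-3, weight_decay=1e-10)
# Cosine annealing from the 1e-3 maximum down to the 1e-7 minimum over 250 epochs.
scheduler = CosineAnnealingLR(optimizer, T_max=EPOCHS, eta_min=1e-7)

for epoch in range(EPOCHS):
    # One synthetic batch per epoch here; the paper uses 5000 batches per
    # epoch per GPU on 8 A100 GPUs.
    x = torch.randn(BATCH_SIZE, 16)
    target = torch.randn(BATCH_SIZE, 3)
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), target)  # stand-in for the flow-matching loss
    loss.backward()
    optimizer.step()
    scheduler.step()  # one scheduler step per epoch
```

The QM9 schedule described in the same row (a reduced maximum after the first 100-epoch cycle) resembles cosine annealing with warm restarts plus a decayed peak, which stock `CosineAnnealingLR` does not express; reproducing it would require a custom schedule.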