3D Structure Prediction of Atomic Systems with Flow-based Direct Preference Optimization
Authors: Rui Jiao, Xiangzhe Kong, Wenbing Huang, Yang Liu
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results on antibodies and crystals demonstrate substantial benefits of our FlowDPO, highlighting its potential to advance the field of 3D structure prediction with generative models. (Section 4, Experiments) We validate our method on two distinct domains: antibody structure prediction (§4.1) and crystal structure prediction (§4.2). |
| Researcher Affiliation | Academia | Rui Jiao (1,2), Xiangzhe Kong (1,2), Wenbing Huang (3,4), Yang Liu (1,2). (1) Dept. of Comp. Sci. & Tech., Institute for AI, Tsinghua University; (2) Institute for AIR, Tsinghua University; (3) Gaoling School of Artificial Intelligence, Renmin University of China; (4) Beijing Key Laboratory of Big Data Management and Analysis Methods, Beijing, China |
| Pseudocode | Yes | Algorithm 1 Candidate Generation |
| Open Source Code | Yes | Our codes are available at https://github.com/jiaor17/FlowDPO. |
| Open Datasets | Yes | Following previous literature [20], we extract antibody structures from the SAbDab database [8] for training and utilize the manually curated test set from DiffAb [20]. We conduct the crystal structure prediction task on three datasets in line with previous works [33, 14]. Perov-5 [4] includes 18,928 perovskite crystals... MP-20 [13] comprises 45,231 materials from the Materials Project... |
| Dataset Splits | Yes | The dataset is then split into training and validation sets at a 9:1 ratio based on the clusters. For Perov-5 and MP-20, we maintain the conventional 60-20-20 split for training, validation, and testing. For the MPTS-52 dataset, we use a chronological split, assigning 27,380 crystals for training, 5,000 for validation, and 8,096 for testing. |
| Hardware Specification | Yes | All experiments can be run on one GeForce RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions software like 'mmseqs2' [29] and 'pymatgen' [22] but does not specify their version numbers or other software dependencies with specific version information necessary for replication. |
| Experiment Setup | Yes | Detailed hyperparameters for our FlowDPO are presented in Table 4. The detailed hyperparameters for the FlowDPO pipeline on each crystal dataset are provided in Table 5. |
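
The chronological split quoted in the Dataset Splits row can be illustrated with a minimal Python sketch. It is not taken from the authors' repository. The function name `chronological_split`, the record layout (identifier, date), and the oldest-first ordering are assumptions made for illustration; only the three MPTS-52 split sizes come from the table above.

```python
from datetime import date

# Split sizes quoted above for MPTS-52 (train, validation, test).
MPTS52_SIZES = (27_380, 5_000, 8_096)


def chronological_split(records, sizes, key):
    """Sort records from oldest to newest, then cut them into train/val/test.

    Hypothetical helper: assumes the earliest records go to training and the
    most recent ones to testing.
    """
    n_train, n_val, n_test = sizes
    if n_train + n_val + n_test != len(records):
        raise ValueError("split sizes must sum to the number of records")
    ordered = sorted(records, key=key)
    train = ordered[:n_train]
    val = ordered[n_train:n_train + n_val]
    test = ordered[n_train + n_val:]
    return train, val, test


# Toy usage with made-up records of the form (identifier, date).
toy = [
    ("c", date(2019, 5, 1)),
    ("a", date(2001, 1, 1)),
    ("e", date(2021, 3, 9)),
    ("b", date(2010, 7, 4)),
    ("d", date(2020, 2, 2)),
]
train, val, test = chronological_split(toy, sizes=(3, 1, 1), key=lambda r: r[1])
```

On the real dataset one would pass `MPTS52_SIZES` as `sizes`, together with a key that extracts whatever date field the data provides. The three quoted sizes sum to 40,476 crystals.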