Crystal Structure Prediction by Joint Equivariant Diffusion

Authors: Rui Jiao, Wenbing Huang, Peijia Lin, Jiaqi Han, Pin Chen, Yutong Lu, Yang Liu

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental (5 experiments) | In this section, we evaluate the efficacy of DiffCSP on a diverse range of tasks, showing that it can generate high-quality structures of different crystals in 5.1, with a lower time cost compared with DFT-based optimization methods in 5.2. Ablations in 5.3 exhibit the necessity of each designed component. We further showcase the capability of DiffCSP in the ab initio generation task in 5.4.
Researcher Affiliation | Academia | Rui Jiao (1,2), Wenbing Huang (3,4), Peijia Lin (5), Jiaqi Han (6), Pin Chen (5), Yutong Lu (5), Yang Liu (1,2). 1: Dept. of Comp. Sci. & Tech., Institute for AI, Tsinghua University; 2: Institute for AIR, Tsinghua University; 3: Gaoling School of Artificial Intelligence, Renmin University of China; 4: Beijing Key Laboratory of Big Data Management and Analysis Methods, Beijing, China; 5: National Supercomputer Center in Guangzhou, School of Computer Science and Engineering, Sun Yat-sen University; 6: Stanford University
Pseudocode | Yes | The detailed flowcharts are summarized in Algorithms 1 and 2 in Appendix B.3. Algorithm 1: Training Procedure of DiffCSP. Algorithm 2: Sampling Procedure of DiffCSP. (A minimal sketch of the training step is given after this table.)
Open Source Code | Yes | Code is available at https://github.com/jiaor17/DiffCSP.
Open Datasets | Yes | We conduct experiments on four datasets with distinct levels of difficulty. Perov-5 [54, 55] contains 18,928 perovskite materials with similar structures. ... Carbon-24 [56] includes 10,153 carbon materials... MP-20 [57] selects 45,231 stable inorganic materials from the Materials Project [57]... MPTS-52 is a more challenging extension of MP-20...
Dataset Splits | Yes | For Perov-5, Carbon-24 and MP-20, we apply the 60-20-20 split in line with Xie et al. [9]. For MPTS-52, we split 27,380/5,000/8,096 for training/validation/testing in chronological order. (A split sketch is given after this table.)
Hardware Specification | Yes | All models are trained on a GeForce RTX 3090 GPU.
Software Dependencies | No | The paper mentions various software tools and libraries such as MEGNet [52], Hyperopt [62], scikit-opt, SchNet [29], DimeNet++ [63], GemNet-T [64], Transformer [51], pymatgen [58], USPEX [59], and VASP [68]. However, it does not provide specific version numbers for these components, which limits reproducibility of the deep learning experimental setup.
Experiment Setup | Yes | For our DiffCSP, we use 4 layers with 256 hidden states for Perov-5 and 6 layers with 512 hidden states for the other datasets. The dimension of the Fourier embedding is set to k = 256. We apply the cosine scheduler with s = 0.008 to control the variance of the DDPM process on Lt, and an exponential scheduler with σ1 = 0.005, σT = 0.5 to control the noise scale of the score-matching process on Ft. The diffusion step is set to T = 1000. Our model is trained for 3500, 4000, 1000, and 1000 epochs for Perov-5, Carbon-24, MP-20 and MPTS-52, with the same optimizer and learning-rate scheduler as CDVAE. (Both schedulers are sketched after this table.)
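The Pseudocode row points to Algorithms 1 and 2, which jointly corrupt the lattice matrix L via DDPM and the fractional coordinates F via denoising score matching (atom types are given in the CSP setting). Below is a minimal, illustrative sketch of one training step under that reading; the denoiser signature `model(Lt, Ft, A, t)` is a hypothetical placeholder, and the unwrapped score target is a simplification of the paper's wrapped-normal score on periodic coordinates.

```python
import torch

def diffcsp_training_step(model, L0, F0, A, betas, sigmas):
    """Sketch of one training step in the spirit of Algorithm 1.

    L0: (3, 3) lattice matrix; F0: (N, 3) fractional coordinates in [0, 1);
    A:  (N,) atom types; betas/sigmas: schedules of length T.
    The model signature is assumed, not the authors' API.
    """
    T = betas.shape[0]
    t = torch.randint(1, T + 1, (1,))                      # uniform timestep
    alpha_bar = torch.cumprod(1.0 - betas, dim=0)[t - 1]   # cumulative alpha

    # DDPM forward noising of the lattice: q(L_t | L_0).
    eps_L = torch.randn_like(L0)
    Lt = alpha_bar.sqrt() * L0 + (1.0 - alpha_bar).sqrt() * eps_L

    # Score-matching corruption of fractional coordinates with periodic wrap.
    sigma = sigmas[t - 1]
    eps_F = torch.randn_like(F0)
    Ft = (F0 + sigma * eps_F) % 1.0

    eps_hat_L, score_hat_F = model(Lt, Ft, A, t)
    loss_L = ((eps_hat_L - eps_L) ** 2).mean()
    # -eps_F / sigma is the unwrapped score target, used here purely for
    # illustration; the paper matches the wrapped-normal score instead.
    loss_F = ((score_hat_F + eps_F / sigma) ** 2).mean()
    return loss_L + loss_F
```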
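For the Dataset Splits row, a hypothetical sketch of the two split schemes: a random 60-20-20 split (Perov-5, Carbon-24, MP-20) and the chronological 27,380/5,000/8,096 split for MPTS-52. The `date` column name is an assumption for illustration; the paper does not say how timestamps are stored.

```python
import pandas as pd

def random_60_20_20(df: pd.DataFrame, seed: int = 42):
    """Random 60/20/20 train/val/test split (Perov-5, Carbon-24, MP-20)."""
    df = df.sample(frac=1.0, random_state=seed).reset_index(drop=True)
    n_train, n_val = int(0.6 * len(df)), int(0.2 * len(df))
    return df[:n_train], df[n_train:n_train + n_val], df[n_train + n_val:]

def chronological_split(df: pd.DataFrame, n_train: int = 27380, n_val: int = 5000):
    """Chronological split for MPTS-52: 27,380 / 5,000 / 8,096 in time order.
    Assumes a `date` column; that column name is hypothetical."""
    df = df.sort_values("date").reset_index(drop=True)
    return df[:n_train], df[n_train:n_train + n_val], df[n_train + n_val:]
```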
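The Experiment Setup row names two standard schedules: a cosine DDPM variance schedule with s = 0.008 for the lattice Lt, and an exponential (geometric) noise-scale schedule from σ1 = 0.005 to σT = 0.5 for Ft, with T = 1000. The sketch below follows the standard formulations of these schedules; indexing conventions (0- vs 1-based timesteps) are assumptions and may differ from the authors' code.

```python
import math
import torch

def cosine_beta_schedule(T: int = 1000, s: float = 0.008) -> torch.Tensor:
    """Cosine variance schedule (Nichol & Dhariwal, 2021) for the DDPM on Lt."""
    steps = torch.arange(T + 1, dtype=torch.float64)
    f = torch.cos(((steps / T) + s) / (1 + s) * math.pi / 2) ** 2
    alpha_bar = f / f[0]
    betas = 1 - alpha_bar[1:] / alpha_bar[:-1]
    return betas.clamp(max=0.999).float()

def exponential_sigma_schedule(T: int = 1000, sigma_1: float = 0.005,
                               sigma_T: float = 0.5) -> torch.Tensor:
    """Geometric interpolation of noise scales for score matching on Ft."""
    t = torch.arange(1, T + 1, dtype=torch.float64)
    return (sigma_1 * (sigma_T / sigma_1) ** ((t - 1) / (T - 1))).float()
```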