Smooth, exact rotational symmetrization for deep learning on point clouds

Authors: Sergey Pozdnyakov, Michele Ceriotti

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We benchmark PET and the ECSE scheme over six different datasets, which have been previously used in the literature and which allow us to showcase the performance of our framework, and the ease with which it can be adapted to different use cases. The main results are compared with the state of the art in Figure 3 and Table 1, while in-depth analyses can be found in Appendix C.
Researcher Affiliation | Academia | Sergey N. Pozdnyakov and Michele Ceriotti, Laboratory of Computational Science and Modelling, Institute of Materials, Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland; sergey.pozdnyakov@epfl.ch, michele.ceriotti@epfl.ch
Pseudocode | No | The paper includes architectural diagrams and mathematical formulations but no explicit pseudocode or algorithm blocks.
Open Source Code | Yes | In order to facilitate the reproducibility of the results we present, we have released the following assets: 1) the source code for the PET model, 2) the source code for our proof-of-principle implementation of the ECSE protocol, 3) a complete set of hyperparameters for each training procedure, organized as a collection of YAML files, 4) similarly organized hyperparameters for the ECSE, 5) the Singularity container used for most numerical experiments, 6) all the checkpoints, including those obtained at intermediate stages of the training procedure, and 7) the datasets we used. This should suffice for the reproducibility of all experiments reported in this manuscript. All these files are available at: https://doi.org/10.5281/zenodo.7967079
Open Datasets | Yes | We benchmark PET and the ECSE scheme over six different datasets, which have been previously used in the literature and which allow us to showcase the performance of our framework, and the ease with which it can be adapted to different use cases. The main results are compared with the state of the art in Figure 3 and Table 1, while in-depth analyses can be found in Appendix C. As a first benchmark, we conduct several experiments with the liquid-water configurations from Ref. 85. This dataset is representative of those used in the construction of interatomic potentials, and presents interesting challenges in that it contains distorted structures from path integral molecular dynamics and involves long-range contributions from dipolar electrostatic interactions. We then move to the realm of small molecules with the COLL dataset[86], which contains distorted configurations of molecules undergoing a collision. In order to assess the accuracy of PET for extreme distortions, and a very wide energy range, we consider the database of CH4 configurations first introduced in Ref. 80. The MnO dataset of Eckhoff and Behler[88] includes information on the colinear spin of individual atomic sites. Finally, the QM9 dipole dataset[82] allows us to demonstrate how to extend the ECSE to targets that are covariant, rather than invariant.
Dataset Splits | Yes | We randomly split the dataset into a training subset of 2608 structures, a validation subset of 200 structures, and a testing subset of 293 structures. (A minimal split sketch follows after the table.)
Hardware Specification | Yes | For instance, fitting the PET model with d_PET = 128 on the CH4 dataset using 1000 energy-only training samples (the very first point of the learning curve in Fig. 3d) takes about 40 minutes on a V100 GPU. In contrast, fitting the PET model with d_PET = 256 on energies and forces using 300,000 samples is estimated to take about 26 GPU-days on a V100 (in practice the model was fitted partially on a V100 and partially on an RTX-4090).
Software Dependencies | No | The paper mentions 'zero-dimensional PyTorch[77] tensors' and the use of 'Scipy's implementation[98]' but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup | Yes | Unless otherwise specified, we use d_PET = 128, n_GNN = n_TL = 3, multi-head attention with 4 heads, SiLU activation, and the dimension of the feedforward network model in the transformer is set to 512. We use a rotational augmentation strategy during fitting, which involves randomly rotating all samples at each epoch. This is accomplished using Scipy's implementation[98] of a generator that provides uniformly distributed random rotations. We employ the Adam[99] optimizer for all cases. For all models with d_PET = 128, we set the initial learning rate at 10^-4. However, some models with d_PET = 256 were unstable at this learning rate. Consequently, we adjusted the initial learning rate to 5 x 10^-5 for two cases: 1) CH4 E+F, 100k samples, and 2) the COLL dataset.
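
The "Dataset Splits" row above quotes a random 2608/200/293 train/validation/test split. The snippet below is a minimal sketch of how such a split can be generated with NumPy; the fixed seed and the index-based bookkeeping are illustrative choices, not details taken from the released code.

    import numpy as np

    # Sizes quoted in the paper: 2608 train, 200 validation, 293 test (3101 structures in total).
    n_train, n_val, n_test = 2608, 200, 293
    n_total = n_train + n_val + n_test

    rng = np.random.default_rng(seed=0)  # seed chosen here only to make the example deterministic
    permutation = rng.permutation(n_total)

    train_idx = permutation[:n_train]
    val_idx = permutation[n_train:n_train + n_val]
    test_idx = permutation[n_train + n_val:]

    print(len(train_idx), len(val_idx), len(test_idx))  # 2608 200 293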
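
The "Experiment Setup" row describes two reproducible ingredients: per-epoch rotational augmentation built on SciPy's uniform random-rotation generator, and the Adam optimizer with an initial learning rate of 10^-4. The sketch below illustrates both under simplifying assumptions; the placeholder model, the dummy data, and the training-loop skeleton are stand-ins and do not reproduce the actual PET architecture.

    import numpy as np
    import torch
    from scipy.spatial.transform import Rotation

    # Quoted architecture defaults (not instantiated here): d_PET = 128, n_GNN = n_TL = 3,
    # 4 attention heads, SiLU activations, transformer feedforward width 512.

    def rotate_structures(positions_list, seed=None):
        """Apply an independent, uniformly distributed random rotation to each (N, 3) array."""
        rotations = Rotation.random(num=len(positions_list), random_state=seed)
        return [rot.apply(pos) for rot, pos in zip(rotations, positions_list)]

    # Toy stand-in for the model, used only so the optimizer has parameters to manage.
    model = torch.nn.Linear(128, 1)

    # Initial learning rate 1e-4 for d_PET = 128; the paper lowers it to 5e-5 for the two
    # unstable d_PET = 256 cases (CH4 E+F with 100k samples, and the COLL dataset).
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

    # Per-epoch augmentation: every structure is freshly rotated before each pass over the data.
    train_positions = [np.random.default_rng(i).normal(size=(5, 3)) for i in range(10)]  # dummy data
    for epoch in range(3):
        augmented = rotate_structures(train_positions, seed=epoch)
        # ... featurization, forward pass, loss on energies/forces, and optimizer.step() go here

Sampling the rotations anew at every epoch is what makes the augmentation rotational in practice: over the course of fitting, the network sees each training structure in many different orientations.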