Transfer learning for atomistic simulations using GNNs and kernel mean embeddings

Authors: John Falk, Luigi Bonati, Pietro Novelli, Michele Parrinello, Massimiliano Pontil

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We test our approach on a series of realistic datasets of increasing complexity, showing excellent generalization and transferability performance, and improving on methods that rely on GNNs or ridge regression alone, as well as similar fine-tuning approaches.
Researcher Affiliation | Collaboration | John I. Falk (CSML, Istituto Italiano di Tecnologia, Genova, Italy; me@isakfalk.com); Luigi Bonati (Atomistic Simulations, Istituto Italiano di Tecnologia, Genova, Italy; luigi.bonati@iit.it); Pietro Novelli (CSML, Istituto Italiano di Tecnologia, Genova, Italy; pietro.novelli@iit.it); Michele Parrinello (Atomistic Simulations, Istituto Italiano di Tecnologia, Genova, Italy; michele.parrinello@iit.it); Massimiliano Pontil (CSML, Istituto Italiano di Tecnologia, Genova, Italy, and University College London, U.K.; massimiliano.pontil@iit.it)
Pseudocode | Yes | In Algorithm 1 we report the pseudo-code describing our implementation of the training and prediction steps of MEKRR. (An illustrative sketch of such training and prediction steps follows the table.)
Open Source Code | Yes | We make the code repository available at https://github.com/IsakFalk/atomistic_transfer_mekrr.
Open Datasets | Yes | OC20: The Open Catalyst 2020 (OC20) dataset is a large dataset of ab initio calculations aimed at estimating adsorption energies on catalytic surfaces. It comprises 250 million DFT calculations, generated from over 1.2 million relaxation trajectories of different combinations of molecules and surfaces.
Dataset Splits | Yes | We split all the below datasets into train, validation, and test sets using a random 60/20/20 split. (A sketch of such a split follows the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU models, CPU types, memory).
Software Dependencies | No | The paper mentions software used, such as the SchNet and SCN codebase from [22] and the QUIP/quippy codebase [61, 62] for GAP, but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | The length-scale of the Gaussian kernel is chosen according to the median heuristic [63]. We denote by MEKRR-(SchNet) and MEKRR-(SCN) the variants using SchNet and SCN node features as inputs, respectively. [...] To initially fit the regularization parameter λ we set α = 0 and cross-validate λ ∈ {10⁻³, …, 10⁻⁹} using the same datasets. (Sketches of the median heuristic and the λ grid search follow the table.)
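
The paper's Algorithm 1 is the authoritative description of MEKRR. As a rough illustration only, here is a minimal NumPy sketch of how training and prediction with kernel mean embeddings and ridge regression can be organized over per-structure matrices of GNN node features; the function names, the uniform averaging over atoms, and the plain linear solve with λ·n ridge scaling are assumptions of this sketch, not the authors' exact implementation.

```python
import numpy as np

def gaussian_kernel(X, Y, lengthscale):
    """Pairwise Gaussian kernel between atomic feature vectors (rows of X and Y)."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq_dists / (2.0 * lengthscale ** 2))

def mean_embedding_kernel(atoms_a, atoms_b, lengthscale):
    """Kernel between two structures: the inner product of their kernel mean
    embeddings, i.e. the average of the atom-pair Gaussian kernel values."""
    return gaussian_kernel(atoms_a, atoms_b, lengthscale).mean()

def fit_mekrr(train_structs, energies, lengthscale, lam):
    """Training step: assemble the Gram matrix of mean-embedding kernels and
    solve the regularized kernel ridge regression linear system."""
    n = len(train_structs)
    K = np.array([[mean_embedding_kernel(a, b, lengthscale)
                   for b in train_structs] for a in train_structs])
    return np.linalg.solve(K + lam * n * np.eye(n), np.asarray(energies))

def predict_mekrr(train_structs, weights, test_structs, lengthscale):
    """Prediction step: cross kernel between test and train structures times the weights."""
    K_cross = np.array([[mean_embedding_kernel(t, a, lengthscale)
                         for a in train_structs] for t in test_structs])
    return K_cross @ weights
```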
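
The dataset split is only stated as proportions; one way to realize a random 60/20/20 split over structure indices is sketched below (the seed handling and integer rounding are assumptions, not details from the paper).

```python
import numpy as np

def random_split_60_20_20(n_samples, seed=0):
    """Shuffle sample indices and cut them into 60% train, 20% validation, 20% test."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = int(0.6 * n_samples)
    n_val = int(0.2 * n_samples)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]
```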
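
The median heuristic and the λ grid search are standard procedures; a hedged sketch of both follows. The fit_fn / predict_fn callables stand in for MEKRR fit and prediction routines (e.g., the sketch above), and the use of validation mean squared error as the selection criterion is an assumption of this sketch.

```python
import numpy as np

def median_heuristic(atom_features):
    """Median heuristic: set the Gaussian length-scale to the median pairwise
    Euclidean distance between (a subsample of) atomic feature vectors."""
    sq_dists = ((atom_features[:, None, :] - atom_features[None, :, :]) ** 2).sum(-1)
    upper = sq_dists[np.triu_indices_from(sq_dists, k=1)]
    return float(np.median(np.sqrt(upper)))

def select_lambda(fit_fn, predict_fn, train, val, lengthscale,
                  grid=tuple(10.0 ** -p for p in range(3, 10))):
    """Cross-validate the ridge parameter over a log grid (1e-3 down to 1e-9),
    keeping the value with the lowest validation mean squared error."""
    X_tr, y_tr = train
    X_va, y_va = val
    best_lam, best_mse = None, np.inf
    for lam in grid:
        weights = fit_fn(X_tr, y_tr, lengthscale, lam)
        preds = predict_fn(X_tr, weights, X_va, lengthscale)
        mse = float(np.mean((preds - np.asarray(y_va)) ** 2))
        if mse < best_mse:
            best_lam, best_mse = lam, mse
    return best_lam
```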