CrysGNN: Distilling Pre-trained Knowledge to Enhance Property Prediction for Crystalline Materials

Authors: Kishalay Das, Bidisha Samanta, Pawan Goyal, Seung-Cheol Lee, Satadeep Bhattacharjee, Niloy Ganguly

AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We conduct extensive experiments to show that with distilled knowledge from the pre-trained model, all the SOTA algorithms are able to outperform their own vanilla version with good margins. |
| Researcher Affiliation | Collaboration | Kishalay Das (1), Bidisha Samanta (1), Pawan Goyal (1), Seung-Cheol Lee (2), Satadeep Bhattacharjee (2), Niloy Ganguly (1,3). Affiliations: (1) Indian Institute of Technology Kharagpur, India; (2) Indo Korea Science and Technology Center, Bangalore, India; (3) L3S, Leibniz University of Hannover, Germany. |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | We have released the pre-trained model along with the large dataset of 800K crystal graphs which we carefully curated; so that the pretrained model can be plugged into any existing and upcoming models to enhance their prediction accuracy. ... Source code, pre-trained model, and dataset of CrysGNN is made available at https://github.com/kdmsit/crysgnn |
| Open Datasets | Yes | To this effect, we curate a new large untagged crystal dataset with 800K crystal graphs and undertake a pre-training framework (named CrysGNN) with the dataset. ... We have released the pre-trained model along with the large dataset of 800K crystal graphs which we carefully curated; so that the pretrained model can be plugged into any existing and upcoming models to enhance their prediction accuracy. ... Source code, pre-trained model, and dataset of CrysGNN is made available at https://github.com/kdmsit/crysgnn |
| Dataset Splits | Yes | For each property, we trained on 80% data, validated on 10% and evaluated on 10% of the data. (A minimal split sketch is given below the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9) needed to replicate the experiment. |
| Experiment Setup | Yes | We train P_ψ using dataset D_t to optimize the following multitask loss: L_prop = δ·L_MSE + (1 − δ)·L_KD (Eq. 3) ... Finally, δ signifies relative weightage between two losses, which is a hyper-parameter to be tuned on validation data. (A hedged sketch of this loss is given below the table.) |
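
The Dataset Splits row quotes an 80/10/10 per-property protocol but this section reproduces no code for it. A minimal sketch of such a split, assuming an in-memory list of crystal graphs and a fixed shuffle seed (both our assumptions, not details taken from the paper):

```python
import random

def split_80_10_10(samples, seed=0):
    """Shuffle and split a dataset 80% train / 10% validation / 10% test,
    mirroring the per-property protocol quoted above. The seed and the
    uniform shuffling strategy are illustrative assumptions."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    n_train = int(0.8 * len(samples))
    n_val = int(0.1 * len(samples))
    train = [samples[i] for i in indices[:n_train]]
    val = [samples[i] for i in indices[n_train:n_train + n_val]]
    test = [samples[i] for i in indices[n_train + n_val:]]
    return train, val, test
```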
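
The Experiment Setup row reproduces Equation 3, which mixes the supervised regression loss L_MSE with a distillation loss L_KD under weight δ. Below is a minimal PyTorch sketch of that combination; since this section does not define L_KD, we assume it is a mean-squared error between the property predictor's representations and frozen pre-trained CrysGNN representations, and all names here (multitask_loss, student_emb, teacher_emb, delta) are hypothetical rather than the authors' implementation:

```python
import torch.nn.functional as F

def multitask_loss(pred, target, student_emb, teacher_emb, delta=0.5):
    """Eq. 3 sketch: L_prop = delta * L_MSE + (1 - delta) * L_KD.

    pred/target: property predictions vs. ground-truth labels (supervised term).
    student_emb/teacher_emb: property-predictor vs. frozen pre-trained CrysGNN
    representations (distillation term); the teacher is detached so gradients
    only update the student. The exact form of L_KD is an assumption.
    """
    l_mse = F.mse_loss(pred, target)
    l_kd = F.mse_loss(student_emb, teacher_emb.detach())
    return delta * l_mse + (1.0 - delta) * l_kd
```

Per the quoted setup, δ is a hyper-parameter, so a caller would typically sweep delta and keep the value that minimizes error on the 10% validation split.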