A probability contrastive learning framework for 3D molecular representation learning

Authors: Jiayu Qin, Jian Chen, Rohan Sharma, Jingchen Sun, Changyou Chen

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results indicate that our method outperforms existing approaches on 13 out of 15 molecular property prediction benchmarks in the MoleculeNet dataset and on 8 out of 12 benchmarks in the QM9 benchmark, achieving new state-of-the-art results on average. We evaluate our method on molecular property prediction tasks.
Researcher Affiliation | Academia | Jiayu Qin (University at Buffalo, jiayuqin@buffalo.edu); Jian Chen (University at Buffalo, jchen378@buffalo.edu); Rohan Sharma (University at Buffalo, rohanjag@buffalo.edu); Jingchen Sun (University at Buffalo, jsun39@buffalo.edu); Changyou Chen (University at Buffalo, changyou@buffalo.edu)
Pseudocode | Yes | Algorithm 1: Contrastive Learning with Stochastic EM. (An illustrative contrastive-loss sketch is given after the table.)
Open Source Code | No | The paper states "The model implementation is based on PyTorch Geometric." but does not provide a link to, or an explicit statement about, an open-source release of the code for its method.
Open Datasets | Yes | MoleculeNet [35] is a popular benchmark for molecular property prediction... The QM9 dataset [28] is another popular dataset in molecular property prediction... [35] Zhenqin Wu, Bharath Ramsundar, Evan N. Feinberg, et al. "MoleculeNet: a benchmark for molecular machine learning." Chemical Science 9.2 (2018), pp. 513–530. [28] Raghunathan Ramakrishnan, Pavlo O. Dral, Matthias Rupp, et al. "Quantum chemistry structures and properties of 134 kilo molecules." Scientific Data 1.1 (2014), pp. 1–7.
Dataset Splits | Yes | The data partition we use has 110k, 10k, and 11k molecules in the training, validation, and testing sets. We split all the datasets with scaffold split, which splits molecules according to their molecular substructure. For most datasets, we use a scaffold-based data split; however, the QM9 subtask follows a random split in line with the MolCLR methodology. ...hyperparameters tuned via random search on validation sets, and results reported on test sets. (A minimal scaffold-split sketch is given after the table.)
Hardware Specification | Yes | Molecular pretraining runs on 4 A6000 GPUs, and the training time is about 48 hours. All models were trained on a single A6000 GPU, with mixed-precision tasks requiring 81 GPU-hours and single-precision tasks requiring 151 GPU-hours.
Software Dependencies | No | The paper states "The model implementation is based on PyTorch Geometric." but does not give a version number for PyTorch Geometric or any other software dependency.
Experiment Setup | Yes | For all experiments, we provide detailed experiment settings in Appendix C. Following Uni-Mol, we report the detailed hyperparameter setup used during pretraining in Table 7. Molecular pretraining runs on 4 A6000 GPUs, and the training time is about 48 hours. We split all the datasets with scaffold split, which splits molecules according to their molecular substructure. The model undergoes pre-training for 50 epochs with a batch size of 512, optimized via the Adam optimizer with an initial learning rate of 5 × 10⁻⁴ and a weight decay of 1 × 10⁻⁵. A cosine learning rate decay schedule is applied throughout pre-training. For tasks predicting µ, α, εHOMO, εLUMO, ε, and Cv, the configuration uses a batch size of 64, 300 training epochs, a learning rate of 5 × 10⁻⁴, and Gaussian radial basis functions with 128 bases. The architecture comprises six Transformer blocks, a weight decay of 5 × 10⁻³, and a dropout rate of 0.2. Mixed-precision training is employed for these tasks. (A minimal optimizer-and-schedule sketch is given after the table.)
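The Pseudocode row refers to the paper's Algorithm 1 (Contrastive Learning with Stochastic EM), which is not reproduced here. For orientation only, the snippet below is a minimal, generic InfoNCE-style contrastive loss in PyTorch. The temperature, embedding sizes, and the assumption that matching rows of the two batches are positive views are illustrative choices, not details taken from the paper, whose probability-contrastive / stochastic-EM objective differs from this plain baseline.

```python
import torch
import torch.nn.functional as F


def info_nce_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """Generic InfoNCE contrastive loss for two batches of paired embeddings.

    z1[i] and z2[i] are assumed to be two views of the same molecule.
    Illustrative baseline only; not the paper's probability-contrastive objective.
    """
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature                      # (B, B) similarity matrix
    targets = torch.arange(z1.size(0), device=z1.device)    # positives sit on the diagonal
    # Symmetrized cross-entropy over rows and columns
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))


if __name__ == "__main__":
    z_a, z_b = torch.randn(512, 256), torch.randn(512, 256)
    print(info_nce_loss(z_a, z_b).item())
```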
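The Dataset Splits row mentions a scaffold split. Below is a minimal sketch of a deterministic Bemis-Murcko scaffold split using RDKit; the split fractions, the greedy largest-group-first assignment, and the helper name scaffold_split are illustrative assumptions and may not match the paper's exact procedure (and the QM9 subtask uses a random split instead).

```python
from collections import defaultdict
from rdkit.Chem.Scaffolds import MurckoScaffold


def scaffold_split(smiles_list, frac_train=0.8, frac_valid=0.1):
    """Group molecules by Bemis-Murcko scaffold and fill train/valid/test
    with the largest scaffold groups first (a common, simple heuristic)."""
    groups = defaultdict(list)
    for idx, smi in enumerate(smiles_list):
        scaffold = MurckoScaffold.MurckoScaffoldSmiles(smiles=smi, includeChirality=False)
        groups[scaffold].append(idx)

    # Larger scaffold groups first, so rarer scaffolds tend to land in valid/test
    ordered = sorted(groups.values(), key=len, reverse=True)
    n = len(smiles_list)
    n_train, n_valid = int(frac_train * n), int(frac_valid * n)

    train, valid, test = [], [], []
    for group in ordered:
        if len(train) + len(group) <= n_train:
            train.extend(group)
        elif len(valid) + len(group) <= n_valid:
            valid.extend(group)
        else:
            test.extend(group)
    return train, valid, test


if __name__ == "__main__":
    smiles = ["CCO", "c1ccccc1O", "c1ccccc1N", "CC(=O)O"]
    print(scaffold_split(smiles, frac_train=0.5, frac_valid=0.25))
```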
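The Experiment Setup row reports pre-training with Adam (initial learning rate 5 × 10⁻⁴, weight decay 1 × 10⁻⁵), a batch size of 512, 50 epochs, and cosine learning-rate decay. The following is a minimal PyTorch sketch of that optimizer-and-schedule configuration; the stand-in model, the one-step-per-epoch loop, per-epoch scheduler stepping, and the absence of warmup are assumptions for illustration, not details confirmed by the paper.

```python
import torch

EPOCHS = 50  # pre-training epochs reported in the table

model = torch.nn.Linear(128, 128)  # stand-in for the 3D molecular encoder

optimizer = torch.optim.Adam(
    model.parameters(),
    lr=5e-4,            # initial learning rate from the reported setup
    weight_decay=1e-5,  # weight decay from the reported setup
)
# Cosine learning-rate decay applied over the full pre-training run
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=EPOCHS)

for epoch in range(EPOCHS):
    # One illustrative optimization step per epoch; a real loop would iterate
    # over batches of size 512 and minimize the pre-training objective.
    loss = model(torch.randn(512, 128)).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()  # assumed per-epoch decay step
```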