Learning to Group Auxiliary Datasets for Molecule

Authors: Tinglin Huang, Ziniu Hu, Rex Ying

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our extensive experiments demonstrate the efficiency and effectiveness of MolGroup, showing an average improvement of 4.41%/3.47% for GIN/Graphormer trained with the group of molecule datasets selected by MolGroup on 11 target molecule datasets.
Researcher Affiliation | Academia | Tinglin Huang¹, Ziniu Hu², Rex Ying¹ (¹Yale University, ²University of California, Los Angeles)
Pseudocode | Yes | The pseudo-code is presented in Algo. 1.
Open Source Code | Yes | Source code is available at https://github.com/Graph-and-Geometric-Learning/MolGroup.
Open Datasets | Yes | Our study utilizes 15 molecule datasets of varying sizes obtained from MoleculeNet [51, 18] and ChEMBL [33], which can be categorized into three groups: medication, quantum mechanics, and chemical analysis. All the involved datasets can be accessed and downloaded from the OGB repository (https://ogb.stanford.edu/) or the MoleculeNet repository (https://moleculenet.org/).
Dataset Splits | Yes | We follow the original split setting, where qm8 and qm9 are randomly split and scaffold splitting is used for the others. (A scaffold-split sketch is given after this table.)
Hardware Specification | Yes | The experiments are conducted on a single Linux server with an Intel Xeon Gold 6240 36-core processor, 361 GB of RAM, and 4 NVIDIA A100-40GB GPUs.
Software Dependencies | Yes | Our method is implemented on PyTorch 1.10.0 and Python 3.9.13.
Experiment Setup | Yes | For GIN [53], we fix the batch size at 128 and train the model for 50 epochs, using Adam [24] with a learning rate of 0.001. The hidden size and number of layers are set to 300 and 5, respectively; the dropout rate is 0.5 and batch normalization [21] is applied in each layer. All results are reported over 5 different random seeds. For Graphormer [57], we fix the batch size at 128 and train the model for 30 epochs, using AdamW [31] with a learning rate of 0.0001. The hidden size, number of layers, and number of attention heads are set to 512, 5, and 8, respectively; the dropout and attention dropout rates are 0.1 and 0.3. (A configuration sketch for the GIN setting is given after this table.)
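
Scaffold-split sketch. The Dataset Splits row states that scaffold splitting is used for most datasets. As a rough illustration only, the Python sketch below follows the common Bemis-Murcko scaffold recipe used by MoleculeNet/OGB-style benchmarks (group molecules by scaffold, assign the largest groups to train first). The function name scaffold_split, the split fractions, and the RDKit-based grouping are assumptions for illustration; the authors' actual split code may differ.

# Minimal sketch of a Bemis-Murcko scaffold split (assumed recipe, not the paper's code).
from collections import defaultdict
from rdkit import Chem
from rdkit.Chem.Scaffolds import MurckoScaffold

def scaffold_split(smiles_list, frac_train=0.8, frac_valid=0.1):
    """Group molecules by Bemis-Murcko scaffold and fill train/valid/test in order."""
    scaffold_to_idx = defaultdict(list)
    for idx, smiles in enumerate(smiles_list):
        mol = Chem.MolFromSmiles(smiles)
        if mol is None:
            continue  # skip molecules RDKit cannot parse
        scaffold = MurckoScaffold.MurckoScaffoldSmiles(mol=mol, includeChirality=False)
        scaffold_to_idx[scaffold].append(idx)

    # Larger scaffold groups go to train first, so rare scaffolds land in valid/test.
    groups = sorted(scaffold_to_idx.values(), key=len, reverse=True)

    n = len(smiles_list)
    train_cut, valid_cut = frac_train * n, (frac_train + frac_valid) * n
    train_idx, valid_idx, test_idx = [], [], []
    for group in groups:
        if len(train_idx) + len(group) <= train_cut:
            train_idx.extend(group)
        elif len(train_idx) + len(valid_idx) + len(group) <= valid_cut:
            valid_idx.extend(group)
        else:
            test_idx.extend(group)
    return train_idx, valid_idx, test_idx

Because whole scaffold groups are assigned to a single partition, test molecules are structurally dissimilar from training molecules, which is why scaffold splits are the harder, standard evaluation for the non-quantum-mechanics datasets.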
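
GIN configuration sketch. The Experiment Setup row lists the GIN hyperparameters (hidden size 300, 5 layers, dropout 0.5, batch norm in each layer, Adam with learning rate 0.001, batch size 128, 50 epochs). The sketch below wires those values into a minimal PyTorch Geometric GIN. The class name GINEncoder, the 9-dimensional float input features, and the mean-pooling prediction head are assumptions for illustration and are not the authors' released implementation.

# Minimal GIN sketch matching the quoted hyperparameters (assumed wiring, not the paper's code).
import torch
from torch import nn
from torch_geometric.nn import GINConv, global_mean_pool

class GINEncoder(nn.Module):
    def __init__(self, in_dim, hidden=300, num_layers=5, dropout=0.5, num_tasks=1):
        super().__init__()
        self.convs, self.norms = nn.ModuleList(), nn.ModuleList()
        for layer in range(num_layers):
            mlp = nn.Sequential(
                nn.Linear(in_dim if layer == 0 else hidden, hidden),
                nn.ReLU(),
                nn.Linear(hidden, hidden),
            )
            self.convs.append(GINConv(mlp))
            self.norms.append(nn.BatchNorm1d(hidden))  # batch norm in each layer
        self.dropout = nn.Dropout(dropout)
        self.head = nn.Linear(hidden, num_tasks)

    def forward(self, x, edge_index, batch):
        # x: float node features [num_nodes, in_dim]; batch: graph assignment vector.
        for conv, norm in zip(self.convs, self.norms):
            x = self.dropout(torch.relu(norm(conv(x, edge_index))))
        return self.head(global_mean_pool(x, batch))  # graph-level prediction

model = GINEncoder(in_dim=9)  # 9 is an assumed input feature dimension, not from the paper
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# Training would iterate a PyG DataLoader with batch_size=128 for 50 epochs, per the quoted setup.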