Efficient Sharpness-Aware Minimization for Molecular Graph Transformer Models

Authors: Yili Wang, Kaixiong Zhou, Ninghao Liu, Ying Wang, Xin Wang

ICLR 2024

Reproducibility Variable Result LLM Response
Research Type Experimental The extensive experiments on six datasets with different tasks demonstrate the superiority of GraphSAM, especially in optimizing the model update process.
Researcher Affiliation Academia 1School of Artificial Intelligence, Jilin University, China 2Institute for Medical Engineering & Science, Massachusetts Institute of Technology, USA 3School of Computing, University of Georgia, USA 4College of Computer Science and Technology, Jilin University, China
Pseudocode Yes Algorithm 1: GraphSAM
Open Source Code Yes The code is at: https://github.com/YL-wang/GraphSAM/tree/graphsam.
Open Datasets Yes We consider six public benchmark datasets: BBBP, Tox21, Sider, and ClinTox for the classification task, and ESOL and Lipophilicity for the regression task. We evaluate all models on a random split as suggested by MoleculeNet (Wu et al., 2018).
Dataset Splits Yes We evaluate all models on a random split as suggested by MoleculeNet (Wu et al., 2018), and split the datasets into training, validation, and testing with a 0.8/0.1/0.1 ratio.
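The 0.8/0.1/0.1 random split described above can be sketched as follows. This is a minimal illustration of such a split, not the authors' code; the function name `random_split` and the fixed seed are assumptions for the example.

```python
import random

def random_split(n_samples, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle sample indices and partition them into train/valid/test
    subsets according to the given ratios (here 0.8/0.1/0.1)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = int(ratios[0] * n_samples)
    n_valid = int(ratios[1] * n_samples)
    train = indices[:n_train]
    valid = indices[n_train:n_train + n_valid]
    test = indices[n_train + n_valid:]   # remainder goes to the test set
    return train, valid, test

train, valid, test = random_split(2000)
print(len(train), len(valid), len(test))  # 1600 200 200
```

Any rounding remainder falls into the test split, so the three subsets always cover every sample exactly once.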
Hardware Specification Yes All the experiments are implemented in PyTorch, and run on an NVIDIA TITAN-RTX (24G) GPU.
Software Dependencies No The paper mentions 'All the experiments are implemented in PyTorch' but does not specify a version number for PyTorch or any other software dependencies.
Experiment Setup Yes We only adjust the specific hyperparameters introduced by GraphSAM: (1) the smoothing parameter of the moving average β is tuned within {0.9, 0.99, 0.999}, (2) the initial size of the gradient ball ρ is selected from {0.05, 0.01, 0.005, 0.001}, (3) ρ's update rate λ is searched over {1, 3, 5}, (4) and the scheduler's modification scale γ = {0.5, 0.2, 0.1}. Due to space limitations, we place our experiments on hyperparameters in Appendix A.6. All the experiments are implemented in PyTorch, and run on an NVIDIA TITAN-RTX (24G) GPU.
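The four hyperparameter ranges above define a small grid. The sketch below enumerates it with `itertools.product`; the grid values come from the quoted setup, but the enumeration itself is illustrative and not the authors' tuning code.

```python
import itertools

# Hyperparameter ranges quoted for GraphSAM; the dict keys are
# illustrative names for the paper's symbols beta, rho, lambda, gamma.
grid = {
    "beta":  [0.9, 0.99, 0.999],          # moving-average smoothing
    "rho":   [0.05, 0.01, 0.005, 0.001],  # initial gradient-ball size
    "lam":   [1, 3, 5],                   # rho update rate
    "gamma": [0.5, 0.2, 0.1],             # scheduler modification scale
}

# Cartesian product of all ranges: one dict per candidate configuration.
configs = [dict(zip(grid, values)) for values in itertools.product(*grid.values())]
print(len(configs))  # 3 * 4 * 3 * 3 = 108
```

Enumerating the full grid shows the search space is modest (108 configurations), which is consistent with tuning only the parameters GraphSAM introduces rather than the base model.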