Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
A Canonicalization Perspective on Invariant and Equivariant Learning
Authors: George Ma, Yifei Wang, Derek Lim, Stefanie Jegelka, Yisen Wang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 Experiments: In this section, we evaluate the expressive power and efficiency of CA on the EXP dataset; we apply our canonicalization to the n-body problem for orthogonal equivariance; we also evaluate OAP for LapPE on graph regression tasks. |
| Researcher Affiliation | Academia | George Ma¹, Yifei Wang², Derek Lim², Stefanie Jegelka³, Yisen Wang⁴,⁵. ¹ School of EECS, Peking University; ² MIT CSAIL; ³ TUM CIT/MCML/MDSI & MIT EECS/CSAIL; ⁴ State Key Lab of General Artificial Intelligence, School of Intelligence Science and Technology, Peking University; ⁵ Institute for Artificial Intelligence, Peking University |
| Pseudocode | Yes | Algorithm 1 Canonicalization for eliminating sign ambiguity of eigenvectors |
| Open Source Code | Yes | Code is available at https://github.com/PKU-ML/canonicalization. |
| Open Datasets | Yes | ZINC [24] (MIT License) consists of 12K molecular graphs from the ZINC database of commercially available chemical compounds. EXP [1] (GPL-3.0 License) is a dataset designed to explicitly evaluate the expressiveness of GNN models |
| Dataset Splits | Yes | The dataset comes with a predefined 10K/1K/1K train/validation/test split. |
| Hardware Specification | Yes | All (preliminary, failed and main) experiments are run on NVIDIA 3090 GPUs with 24GB memory. |
| Software Dependencies | No | The Gram-Schmidt process can be implemented in PyTorch [44] using QR decomposition. |
| Experiment Setup | Yes | The main hyper-parameters in our experiments are listed as follows. k: the number of eigenvectors used in the PE. L1: the number of layers of the base model. h1: the hidden dimension of the base model. h2: the output dimension of the base model. λ: the initial learning rate. t: the patience of the learning rate scheduler. r: the factor of the learning rate scheduler. λmin: the minimum learning rate of the learning rate scheduler. L2: the number of layers of SignNet or the normal GNN (when using canonicalization as PE). h3: the hidden dimension of SignNet or the normal GNN (when using canonicalization as PE). The values of these hyper-parameters in our experiments are listed in Table 9. |
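
The scheduler hyper-parameters quoted in the Experiment Setup row (initial learning rate λ, patience t, factor r, minimum learning rate λmin) match the usual parameters of a reduce-on-plateau schedule. The excerpt does not name the scheduler, so the following is only a minimal, dependency-free sketch under that assumption. The class name `PlateauScheduler`, its field names, and the numeric values are hypothetical and are not taken from the paper or its Table 9.

```python
from dataclasses import dataclass


@dataclass
class PlateauScheduler:
    """Illustrative reduce-on-plateau schedule (assumed, not the authors' code)."""

    lr: float        # λ: initial learning rate
    patience: int    # t: non-improving epochs tolerated before a decay
    factor: float    # r: multiplier applied to the learning rate on decay
    min_lr: float    # λmin: floor below which the learning rate never drops
    best: float = float("inf")  # best validation loss seen so far
    bad_epochs: int = 0         # consecutive epochs without improvement

    def step(self, val_loss: float) -> float:
        """Record one epoch's validation loss and return the updated learning rate."""
        if val_loss < self.best:
            # Improvement: remember the new best loss and reset the counter.
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs > self.patience:
                # Patience exceeded: decay the rate, clamped at the floor.
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.bad_epochs = 0
        return self.lr


# Placeholder values chosen for illustration only.
scheduler = PlateauScheduler(lr=1e-3, patience=2, factor=0.5, min_lr=1e-5)
history = [scheduler.step(loss) for loss in [1.0, 0.9, 0.9, 0.9, 0.9, 0.9]]
```

In this run the loss stops improving after the second epoch, so the rate is halved once the third consecutive non-improving epoch is recorded. PyTorch's `torch.optim.lr_scheduler.ReduceLROnPlateau` exposes the analogous `patience`, `factor`, and `min_lr` arguments, but whether the authors used it is not stated in the excerpt above.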