Scaling Spherical CNNs

Authors: Carlos Esteves, Jean-Jacques Slotine, Ameesh Makadia

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experiments show our larger spherical CNNs reach state-of-the-art on several targets of the QM9 molecular benchmark, which was previously dominated by equivariant graph neural networks, and achieve competitive performance on multiple weather forecasting tasks. |
| Researcher Affiliation | Collaboration | ¹Google Research, New York, NY, USA; ²Nonlinear Systems Laboratory, MIT, Cambridge, MA, USA. |
| Pseudocode | No | No pseudocode or algorithm blocks were explicitly labeled or presented in a structured format. |
| Open Source Code | Yes | Our code is available at https://github.com/google-research/spherical-cnn. |
| Open Datasets | Yes | QM9 (Ramakrishnan et al., 2014), a current standard benchmark for this problem, contains 134K molecules... ERA5 reanalysis data (Hersbach et al., 2020) |
| Dataset Splits | Yes | There are two different splits used in the literature, the major difference being that Split 1 uses a training set of 110 000 elements while Split 2 uses 100 000. |
| Hardware Specification | Yes | We train for 2000 epochs on 16 TPUv4 with batch size 16; training runs at around 37 steps/s. We evaluated our model for molecules (Section 5.1) on 8 V100 GPUs, with a batch size of 1 per device; it trains at 13.1 steps/s. |
| Software Dependencies | No | The paper mentions TensorFlow (Abadi et al., 2016) and JAX (Bradbury et al., 2018) but does not specify their version numbers. |
| Experiment Setup | Yes | We train for 2000 epochs on 16 TPUv4 with batch size 16; training runs at around 37 steps/s. We use the Adam (Kingma & Ba, 2014) optimizer and a cosine decay on the learning rate with one epoch linear warmup in all experiments. (A sketch of this schedule appears below the table.) |
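For readers reconstructing the optimizer configuration quoted in the Experiment Setup row, here is a minimal sketch using optax, the standard optimizer library for JAX (the paper's stack). Only the Adam-plus-cosine-decay-with-one-epoch-linear-warmup structure, the 2000 epochs, and the batch size of 16 come from the paper; the use of optax itself, the peak learning rate, and the per-epoch step count are illustrative assumptions, not the authors' exact code.

```python
# Minimal sketch (assumptions noted below), not the authors' implementation.
import optax

# Illustrative: QM9 Split 1 train set (110 000 elements) at batch size 16.
STEPS_PER_EPOCH = 110_000 // 16
NUM_EPOCHS = 2_000            # from the paper's setup
PEAK_LR = 1e-3                # placeholder; the paper does not quote this value here

# Cosine decay with a one-epoch linear warmup, as described in the paper.
schedule = optax.warmup_cosine_decay_schedule(
    init_value=0.0,                            # warmup starts from zero
    peak_value=PEAK_LR,                        # reached after one epoch
    warmup_steps=STEPS_PER_EPOCH,              # "one epoch linear warmup"
    decay_steps=NUM_EPOCHS * STEPS_PER_EPOCH,  # cosine decay over all of training
)

# Adam driven by the schedule; optax accepts a schedule as the learning rate.
optimizer = optax.adam(learning_rate=schedule)
```

Passing the schedule directly to `optax.adam` lets the optimizer query the current learning rate from the step count at each update, so no manual learning-rate bookkeeping is needed in the training loop.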