Scaling Spherical CNNs
Authors: Carlos Esteves, Jean-Jacques Slotine, Ameesh Makadia
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show our larger spherical CNNs reach state-of-the-art on several targets of the QM9 molecular benchmark, which was previously dominated by equivariant graph neural networks, and achieve competitive performance on multiple weather forecasting tasks. |
| Researcher Affiliation | Collaboration | 1Google Research, New York, NY, USA 2Nonlinear Systems Laboratory, MIT, Cambridge, MA, USA. |
| Pseudocode | No | No pseudocode or algorithm blocks were explicitly labeled or presented in a structured format. |
| Open Source Code | Yes | Our code is available at https://github.com/google-research/spherical-cnn. |
| Open Datasets | Yes | QM9 (Ramakrishnan et al., 2014), a current standard benchmark for this problem, contains 134K molecules... ERA5 reanalysis data (Hersbach et al., 2020) |
| Dataset Splits | Yes | There are two different splits used in the literature, the major difference being that Split 1 uses a training set of 110 000 elements while Split 2 uses 100 000. |
| Hardware Specification | Yes | We train for 2000 epochs on 16 TPUv4 with batch size 16; training runs at around 37 steps/s. We evaluated our model for molecules (Section 5.1) on 8 V100 GPUs, with a batch size of 1 per device, and it trains at 13.1 steps/s. |
| Software Dependencies | No | The paper mentions 'TensorFlow (Abadi et al., 2016)' and 'JAX (Bradbury et al., 2018)' but does not specify their version numbers. |
| Experiment Setup | Yes | We train for 2000 epochs on 16 TPUv4 with batch size 16; training runs at around 37 steps/s. We use the Adam (Kingma & Ba, 2014) optimizer and a cosine decay on the learning rate with one epoch linear warmup in all experiments. (A sketch of this schedule follows the table.) |
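
For readers reproducing the setup, the following is a minimal JAX/Optax sketch of the optimizer configuration quoted in the Experiment Setup row (Adam, cosine learning-rate decay, one epoch of linear warmup). Optax itself, the peak learning rate, and treating "batch size 16" as a global batch are assumptions not stated in the excerpt.

```python
import optax  # assumed: the paper uses JAX but does not name its optimizer library

# Assumed values for illustration; the excerpt gives epochs and batch size but
# not the peak learning rate or whether batch size 16 is global or per device.
TRAIN_SET_SIZE = 110_000          # Split 1 training set (see Dataset Splits row)
GLOBAL_BATCH_SIZE = 16
STEPS_PER_EPOCH = TRAIN_SET_SIZE // GLOBAL_BATCH_SIZE
NUM_EPOCHS = 2000
PEAK_LR = 1e-3                    # placeholder value, not reported in the excerpt

# Adam with one epoch of linear warmup followed by cosine decay over training.
schedule = optax.warmup_cosine_decay_schedule(
    init_value=0.0,
    peak_value=PEAK_LR,
    warmup_steps=STEPS_PER_EPOCH,
    decay_steps=STEPS_PER_EPOCH * NUM_EPOCHS,
)
optimizer = optax.adam(learning_rate=schedule)
```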