On the hardness of learning under symmetries
Authors: Bobak Kiani, Thien Le, Hannah Lawrence, Stefanie Jegelka, Melanie Weber
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, Sec. 7 provides a few experiments verifying that the hard classes of invariant functions we propose are indeed difficult to learn. ... We train overparameterized GNNs and CNNs on the hard functions from Sec. 4 and Sec. 5, respectively. ... Fig. 1a and 1b plot the performance of the GNN and CNN respectively. |
| Researcher Affiliation | Academia | Bobak T. Kiani¹,², Thien Le¹, Hannah Lawrence¹, Stefanie Jegelka¹,³, Melanie Weber² (¹ MIT EECS, ² Harvard SEAS, ³ TU Munich) |
| Pseudocode | No | The paper contains mathematical formulations and proofs, but no explicitly labeled 'Algorithm' or 'Pseudocode' blocks. |
| Open Source Code | No | The paper does not include an unambiguous statement or link indicating that the authors' implementation code for the described methodology is publicly available. |
| Open Datasets | Yes | The GNN is unable to fit even the training data consisting of 225 graphs with n = 15 nodes drawn uniformly from the Erdős-Rényi model (i.e., p = 0.5). (See the data-generation sketch after the table.) |
| Dataset Splits | No | The paper mentions 'training data' and a 'test set' in its experimental details, but does not explicitly provide information about a separate validation split or how it was used. |
| Hardware Specification | No | The paper states, 'All experiments were run using Pytorch on a single GPU (Paszke et al., 2019),' which lacks specific hardware details such as the GPU model, CPU, or memory. |
| Software Dependencies | No | The paper mentions 'Pytorch Geometric' and 'Pytorch' but does not specify their version numbers, which are necessary for reproducible software dependencies. |
| Experiment Setup | Yes | The overparameterized GNN used during training consisted of 3 layers of graph convolution followed by a node aggregation average pooling layer and a two layer ReLU MLP with width 64. The graph convolution layers used 32 channels. ... The network was given 10n = 500 training samples and was trained with the Adam optimizer with batch size 32. ... In our experiments, we used the Adam optimizer and tuned the learning rate in the range [0.0001, 0.003]. For CNN experiments, to increase stability of training in later stages, we added a scheduler that divided the learning rate by two every 200 epochs. (See the model and training sketch after the table.) |
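The Open Datasets row quotes training data of 225 Erdős-Rényi graphs with n = 15 nodes and edge probability p = 0.5. Below is a minimal sketch of how such a training set could be sampled with PyTorch Geometric; the constant node features, the placeholder labels, and the helper name `erdos_renyi_graph_data` are assumptions, not details taken from the paper.

```python
import torch
from torch_geometric.data import Data

def erdos_renyi_graph_data(n: int = 15, p: float = 0.5) -> Data:
    """Sample one undirected Erdos-Renyi G(n, p) graph as a PyG Data object."""
    # Coin-flip each potential edge in the strict upper triangle, then symmetrize.
    probs = torch.rand(n, n)
    mask = torch.triu(torch.ones(n, n, dtype=torch.bool), diagonal=1)
    upper = (probs < p) & mask
    adj = upper | upper.t()
    edge_index = adj.nonzero().t().contiguous()  # shape [2, num_edges], both directions
    x = torch.ones(n, 1)                         # constant node features (assumption)
    y = torch.zeros(1)                           # placeholder; real labels come from the
                                                 # paper's hard invariant function construction
    return Data(x=x, edge_index=edge_index, y=y, num_nodes=n)

# 225 training graphs with n = 15 nodes and p = 0.5, as quoted in the table.
train_set = [erdos_renyi_graph_data(n=15, p=0.5) for _ in range(225)]
```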
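The Experiment Setup row describes the overparameterized GNN: three graph-convolution layers with 32 channels, node-aggregation average pooling, a two-layer ReLU MLP of width 64, and Adam with batch size 32 and a learning rate tuned in [0.0001, 0.003] (the step scheduler is quoted only for the CNN runs). The sketch below follows those settings; the choice of `GCNConv` as the graph convolution, the ReLU between conv layers, the MSE loss, the epoch count, and the class name `HardFunctionGNN` are assumptions not stated in the quoted text.

```python
import torch
from torch import nn
from torch_geometric.nn import GCNConv, global_mean_pool
from torch_geometric.loader import DataLoader

class HardFunctionGNN(nn.Module):
    """3 graph-conv layers (32 channels) -> average pooling -> 2-layer ReLU MLP (width 64)."""
    def __init__(self, in_dim: int = 1, channels: int = 32, mlp_width: int = 64):
        super().__init__()
        self.convs = nn.ModuleList([
            GCNConv(in_dim, channels),
            GCNConv(channels, channels),
            GCNConv(channels, channels),
        ])
        self.mlp = nn.Sequential(
            nn.Linear(channels, mlp_width),
            nn.ReLU(),
            nn.Linear(mlp_width, 1),
        )

    def forward(self, x, edge_index, batch):
        for conv in self.convs:
            x = torch.relu(conv(x, edge_index))  # activation between conv layers (assumption)
        x = global_mean_pool(x, batch)           # node-aggregation average pooling
        return self.mlp(x).squeeze(-1)

model = HardFunctionGNN()
loader = DataLoader(train_set, batch_size=32, shuffle=True)  # train_set from the sketch above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)    # tune lr within [0.0001, 0.003]
loss_fn = nn.MSELoss()                                       # assumption: regression loss
num_epochs = 1000                                            # assumption: not given in the quote

for epoch in range(num_epochs):
    for batch in loader:
        optimizer.zero_grad()
        pred = model(batch.x, batch.edge_index, batch.batch)
        loss = loss_fn(pred, batch.y)
        loss.backward()
        optimizer.step()
```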