FEW-SHOT LEARNING ON GRAPHS VIA SUPER-CLASSES BASED ON GRAPH SPECTRAL MEASURES
Authors: Jatin Chauhan, Deepak Nathani, Manohar Kaul
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct exhaustive empirical evaluations of our proposed method and show that it outperforms both the adaptation of state-of-the-art graph classification methods to the few-shot scenario and our naive baseline GNNs. Additionally, we extend our method to semi-supervised and active learning scenarios and study its behavior there. |
| Researcher Affiliation | Academia | Jatin Chauhan, Deepak Nathani, Manohar Kaul Department of Computer Science Indian Institute of Technology Hyderabad {chauhanjatin100,deepakn1019,manohar.kaul}@gmail.com |
| Pseudocode | No | The paper describes the approach textually and mathematically but does not include any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our source-code and dataset splits have been made public in an attempt to attract more attention to the context of few-shot learning on graphs. |
| Open Datasets | Yes | We use four datasets, namely Reddit-12K, ENZYMES, Letter-High and TRIANGLES, to perform an exhaustive empirical evaluation of our model on real-world data, ranging from the small average graph sizes of Letter-High to the large graphs of Reddit-12K. These datasets can be downloaded at https://ls11-www.cs.tu-dortmund.de/staff/morris/graphkerneldatasets |
| Dataset Splits | Yes | The validation graphs are used to assess model performance on the training classes themselves to check for overfitting, as well as for grid search over hyperparameters. The actual train/test class splits used for this paper are provided with the code. Since the TRIANGLES dataset has a large number of samples, which makes it infeasible to run many baselines (both DL and non-DL methods), we sample 200 graphs from each class, making the total sample size 2000. Similarly, Reddit-12K is downsampled from 11929 to 1111 graphs (roughly 101 graphs per class), given its extremely large graph sizes, which make the graph kernels as well as some deep learning baselines extremely slow. Table 7 (Dataset Splits) lists, per dataset: # Train Classes, # Test Classes, # Training Graphs, # Validation Graphs and # Test Graphs. |
| Hardware Specification | No | The paper does not specify any hardware details such as GPU/CPU models or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions the 'Python Optimal Transport (POT) library' but does not provide a version number. It also mentions Adam (Kingma & Ba (2014)) as the optimizer, but that is an algorithm reference rather than a versioned software dependency. |
| Experiment Setup | Yes | The number of super-classes is selected from the set {1, 2, 3, 4, 5} using grid search. The k-value for construction of the super-graph is selected from the set {2, 4, 6, 8}. The feature-extractor model uses batch normalization between subsequent message-passing layers. We use dropout of 0.5 in the C_sup layers. The CGAT layers normalize inputs between subsequent layers, also with dropout of 0.5; however, the normalization mechanism in the classifier layers differs from batch norm: we normalize each feature embedding to unit Euclidean norm. We train our models with Adam (Kingma & Ba (2014)) with an initial learning rate of 10^-3 for 50 epochs. Each epoch has 10 iterations, in which we randomly select a mini-batch from the training data G_B. The fine-tuning stage consists of 20 epochs with 10 iterations per epoch. We use a two-layer MLP over the final attention layer of CGAT for classification. The attention layers use multi-head attention with 2 heads and a LeakyReLU slope of 0.1; the embeddings from both attention heads are concatenated. For 20-shot, we set k to 2, the number of super-classes to 3 and the batch size to 128 on the Letter-High dataset, while k is set to 2 with batch size 64 on the Reddit, ENZYMES and TRIANGLES datasets. The number of super-classes is set to 2 for Reddit, 1 for ENZYMES and 3 for TRIANGLES. |
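
The Open Datasets row points to the TU graph-kernel collection. Below is a minimal loading sketch, assuming PyTorch Geometric is installed; the dataset names ("Letter-high", "REDDIT-MULTI-12K") follow the TU collection's own spelling, which differs slightly from the paper's, and are an assumption here rather than something the paper specifies.

```python
# Sketch: fetch the four benchmark datasets from the TU collection.
# Assumes PyTorch Geometric; TUDataset downloads from the TU Dortmund site
# linked in the Open Datasets row.
from torch_geometric.datasets import TUDataset

for name in ["Letter-high", "ENZYMES", "TRIANGLES", "REDDIT-MULTI-12K"]:
    dataset = TUDataset(root="data/", name=name)
    print(name, len(dataset), "graphs,", dataset.num_classes, "classes")
```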
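The Dataset Splits row describes per-class downsampling (200 graphs per class for TRIANGLES, roughly 101 per class for Reddit-12K). A sketch of that step, assuming graphs and labels sit in plain Python lists; `downsample` and `per_class` are illustrative names, not identifiers from the paper's code.

```python
import random
from collections import defaultdict

def downsample(graphs, labels, per_class, seed=0):
    """Keep at most `per_class` graphs from each class, sampled uniformly."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    keep = []
    for indices in by_class.values():
        keep.extend(rng.sample(indices, min(per_class, len(indices))))
    keep.sort()
    return [graphs[i] for i in keep], [labels[i] for i in keep]

# TRIANGLES: 200 per class over 10 classes -> 2000 graphs.
# Reddit-12K: ~101 per class over 11 classes -> ~1111 graphs (from 11929).
```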
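The method's super-classes are built by comparing graph spectral measures, and the Software Dependencies row notes that the Python Optimal Transport (POT) library is used without a version. A minimal sketch of the distance computation, assuming the spectral measure is taken as the uniform distribution over the normalized-Laplacian eigenvalues; the paper's exact construction may differ in detail.

```python
import numpy as np
import networkx as nx
import ot  # Python Optimal Transport (POT); version unspecified in the paper

def spectral_support(G):
    """Eigenvalues of the normalized Laplacian, all lying in [0, 2]."""
    L = nx.normalized_laplacian_matrix(G).toarray()
    return np.linalg.eigvalsh(L)

def wasserstein_spectral(G1, G2):
    """Wasserstein distance between the two graphs' spectral measures,
    modeled here as uniform distributions over the eigenvalues."""
    x, y = spectral_support(G1), spectral_support(G2)
    a = np.full(len(x), 1.0 / len(x))  # uniform weights on G1's spectrum
    b = np.full(len(y), 1.0 / len(y))  # uniform weights on G2's spectrum
    M = ot.dist(x.reshape(-1, 1), y.reshape(-1, 1), metric="euclidean")
    return ot.emd2(a, b, M)  # exact earth mover's distance

print(wasserstein_spectral(nx.cycle_graph(6), nx.path_graph(8)))
```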
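The Experiment Setup row fixes the optimization schedule: Adam at 10^-3, 50 training epochs of 10 iterations each, then a 20-epoch fine-tuning stage with the same per-epoch iteration count. A skeleton of that loop, assuming a PyTorch model; `sample_minibatch` is a hypothetical helper standing in for the paper's random mini-batch draw from the training pool G_B.

```python
import torch

def train(model, sample_minibatch, loss_fn,
          epochs=50, iters_per_epoch=10, lr=1e-3):
    """Schedule from the setup row: Adam, lr 1e-3,
    `epochs` epochs of `iters_per_epoch` random mini-batches each."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for _ in range(iters_per_epoch):
            batch, targets = sample_minibatch()  # hypothetical sampler over G_B
            optimizer.zero_grad()
            loss = loss_fn(model(batch), targets)
            loss.backward()
            optimizer.step()

# Initial stage: 50 epochs x 10 iterations; fine-tuning: 20 epochs x 10.
# Batch size: 128 on Letter-High; 64 on Reddit-12K, ENZYMES and TRIANGLES.
```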