Learning to Propagate for Graph Meta-Learning
Authors: Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments, under different training-test discrepancy and test task generation settings, GPN outperforms recent meta-learning methods on two benchmark datasets. |
| Researcher Affiliation | Academia | (1) Center for Artificial Intelligence, University of Technology Sydney; (2) Paul G. Allen School of Computer Science & Engineering, University of Washington |
| Pseudocode | Yes | Algorithm 1 GPN Training |
| Open Source Code | Yes | The code of GPN and dataset generation is available at https://github.com/liulu112601/Gated-Propagation-Net. |
| Open Datasets | Yes | We built two datasets with different distance/dissimilarity between test classes and training classes, i.e., tieredImageNet-Close and tieredImageNet-Far. ... We extract two datasets from tieredImageNet [22]... The code of GPN and dataset generation is available at https://github.com/liulu112601/Gated-Propagation-Net. |
| Dataset Splits | Yes | The two datasets share the same training tasks, and we make sure that there is no overlap between training and test classes. Their difference lies in the test classes: in tieredImageNet-Close, the minimal distance between each test class and the training classes is 1~4, while it goes up to 5~10 in tieredImageNet-Far. The statistics for tieredImageNet-Close and tieredImageNet-Far are reported in Table 2. (A distance-check sketch is given after the table.) |
| Hardware Specification | Yes | Our model took approximately 27 hours on one TITAN XP for the 5-way-1-shot learning. |
| Software Dependencies | No | The paper mentions using Adam optimizer and ResNet-08 backbone, but does not provide specific version numbers for software libraries or frameworks (e.g., PyTorch, TensorFlow, Python version) that would be needed for reproduction. |
| Experiment Setup | Yes | The training took τ_total = 350k episodes using Adam [12] with an initial learning rate of 10^-3 and weight decay 10^-5. We reduced the learning rate by a factor of 0.9 every 10k episodes starting from the 20k-th episode. The batch size for the auxiliary task was 128. For simplicity, the propagation steps T = 2; more steps may result in higher performance at the price of more computation. The interval for memory update is m = 3 and the number of heads is 5 in GPN. (A training-schedule sketch is given after the table.) |
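
The split property quoted in the Dataset Splits row can be sanity-checked programmatically. The sketch below is a minimal illustration, assuming the class hierarchy is available as a `networkx` graph; `hierarchy`, `train_classes`, and the test-class lists are hypothetical placeholders, not the authors' actual data structures (the real splits are generated by the code in the linked repository).

```python
# Hedged sketch: verify that every test class's minimal hop distance to the
# training classes falls in the reported range (1~4 for tieredImageNet-Close,
# 5~10 for tieredImageNet-Far). All inputs here are illustrative placeholders.
import networkx as nx


def min_distance_to_train(hierarchy: nx.Graph, test_class, train_classes) -> int:
    """Shortest hop count from one test class to its closest training class."""
    lengths = nx.single_source_shortest_path_length(hierarchy, test_class)
    return min(lengths[c] for c in train_classes if c in lengths)


def check_split(hierarchy: nx.Graph, train_classes, test_classes, low: int, high: int) -> bool:
    """True if every test class's minimal distance to the training classes lies in [low, high]."""
    return all(
        low <= min_distance_to_train(hierarchy, c, train_classes) <= high
        for c in test_classes
    )


# Example usage (placeholders):
#   check_split(hierarchy, train_classes, close_test_classes, 1, 4)
#   check_split(hierarchy, train_classes, far_test_classes, 5, 10)
```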
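
The optimization hyperparameters quoted in the Experiment Setup row can be written out as a short training-loop skeleton. This is an assumption-laden PyTorch sketch, not the authors' implementation: the model is a stand-in `nn.Linear`, the episode sampling is elided, and only the optimizer, weight-decay, and learning-rate-schedule values come from the quoted text.

```python
# Minimal sketch of the reported schedule: Adam, lr 1e-3, weight decay 1e-5,
# lr multiplied by 0.9 every 10k episodes starting from episode 20k, 350k episodes total.
import torch
import torch.nn as nn

model = nn.Linear(512, 64)  # placeholder for GPN (T = 2 propagation steps, 5 heads, m = 3 memory-update interval)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-5)


def lr_factor(episode: int) -> float:
    """Multiplicative factor applied to the base lr at a given episode."""
    if episode < 20_000:
        return 1.0
    return 0.9 ** ((episode - 20_000) // 10_000 + 1)


scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lr_factor)

TOTAL_EPISODES = 350_000
for episode in range(TOTAL_EPISODES):
    # ... sample a few-shot episode plus the auxiliary task (batch size 128),
    # compute the loss, call loss.backward(), then:
    optimizer.step()
    optimizer.zero_grad()
    scheduler.step()
```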