Pointer Graph Networks
Authors: Petar Veličković, Lars Buesing, Matthew Overlan, Razvan Pascanu, Oriol Vinyals, Charles Blundell
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Our results, summarised in Table 1, clearly indicate outperformance and generalisation of our PGN model, especially on the larger-scale test sets." and "Experimental setup: As in [47, 55], we evaluate out-of-distribution generalisation: training on operation sequences for small input sets (n = 20 entities with ops = 30 operations), then testing on up to 5× larger inputs (n = 50, ops = 75 and n = 100, ops = 150)." |
| Researcher Affiliation | Industry | Petar Veličković, Lars Buesing, Matthew C. Overlan, Razvan Pascanu, Oriol Vinyals and Charles Blundell, DeepMind {petarv,lbuesing,moverlan,razp,vinyals,cblundell}@google.com |
| Pseudocode | Yes | Figure 2: Pseudocode of DSU operations; initialisation and find(u) (Left), union(u, v) (Middle) and query-union(u, v), giving ground-truth values of ŷ(t) (Right). A runnable sketch of these operations follows the table. |
| Open Source Code | Yes | for brevity, we delegate further descriptions of their operations to Appendix C, and provide our C++ implementation of the LCT in the supplementary material. |
| Open Datasets | No | The paper describes generating operations by sampling input node pairs uniformly at random, which forms a custom dataset. No concrete access information (specific link, DOI, repository, or formal citation with authors/year) for a publicly available or open dataset was provided. |
| Dataset Splits | Yes | Experimental setup: As in [47, 55], we evaluate out-of-distribution generalisation: training on operation sequences for small input sets (n = 20 entities with ops = 30 operations), then testing on up to 5× larger inputs (n = 50, ops = 75 and n = 100, ops = 150). In line with [47], we generate 70 sequences for training, and 35 sequences across each test size category for testing. We perform early stopping, retrieving the model which achieved the best query F1 score on a validation set of 35 small sequences (n = 20, ops = 30). A sketch of this generation and split recipe follows the table. |
| Hardware Specification | No | The paper provides no specific hardware details (exact GPU/CPU models, processor types and speeds, or memory amounts) for running its experiments. |
| Software Dependencies | No | The paper mentions JAX [2] and Haiku [18] but does not provide specific version numbers for these or any other software components. |
| Experiment Setup | Yes | All models compute k = 32 latent features in each layer, and are trained for 5,000 epochs using Adam [22] with learning rate of 0.005. We perform early stopping, retrieving the model which achieved the best query F1 score on a validation set of 35 small sequences (n = 20, ops = 30). A hedged training-loop sketch follows the table. |
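
The DSU operations named in the Pseudocode row are standard disjoint-set union primitives. Below is a minimal Python sketch of initialisation, find(u) with path compression, union(u, v), and query-union(u, v); the randomised linking rule and the convention that ŷ(t) flags whether u and v already share a set before the union are our reading of Figure 2, not verbatim from the paper.

```python
import random

class DSU:
    """Minimal disjoint-set union sketch of the Figure 2 operations."""

    def __init__(self, n):
        # initialisation: every node starts as its own set representative
        self.parent = list(range(n))

    def find(self, u):
        # find(u) with path compression: point u directly at its root
        if self.parent[u] != u:
            self.parent[u] = self.find(self.parent[u])
        return self.parent[u]

    def union(self, u, v):
        # union(u, v): merge the two sets by linking one root to the other
        x, y = self.find(u), self.find(v)
        if x != y:
            if random.random() < 0.5:  # randomised linking (assumption)
                x, y = y, x
            self.parent[x] = y

    def query_union(self, u, v):
        # query-union(u, v): report whether u and v already belong to the
        # same set (our reading of the ground-truth label y-hat(t)),
        # then perform the union
        same = self.find(u) == self.find(v)
        self.union(u, v)
        return same
```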
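The splits quoted in the Dataset Splits row come from operation sequences whose node pairs are sampled uniformly at random. A sketch of that recipe, under our assumptions (the seeding scheme and variable names are ours, not the paper's):

```python
import random

def generate_sequence(n, num_ops, seed=None):
    """Sample one operation sequence: each operation picks a node pair
    uniformly at random, per the paper's data-generation description."""
    rng = random.Random(seed)
    return [(rng.randrange(n), rng.randrange(n)) for _ in range(num_ops)]

# Splits as quoted: 70 training and 35 validation sequences at the small
# size (n = 20, ops = 30), and 35 test sequences per larger size category.
train    = [generate_sequence(20, 30, seed=i)        for i in range(70)]
valid    = [generate_sequence(20, 30, seed=100 + i)  for i in range(35)]
test_50  = [generate_sequence(50, 75, seed=200 + i)  for i in range(35)]
test_100 = [generate_sequence(100, 150, seed=300 + i) for i in range(35)]
```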
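The quoted hyperparameters map onto a conventional JAX training loop. The sketch below hard-codes only the values in the Experiment Setup row; the model, loss_fn, data handling, and the evaluate_query_f1 helper are hypothetical placeholders, and the choice of optax for the optimiser is an assumption (the paper names JAX and Haiku but not its optimiser library).

```python
import jax
import optax

K_LATENT = 32          # k = 32 latent features per layer (used in the
                       # model definition, which is omitted here)
LEARNING_RATE = 0.005  # Adam learning rate quoted in the paper
EPOCHS = 5_000

optimiser = optax.adam(LEARNING_RATE)

def train(params, loss_fn, train_data, valid_data):
    """Adam training with early stopping on validation query F1."""
    opt_state = optimiser.init(params)
    best_f1, best_params = -1.0, params
    for epoch in range(EPOCHS):
        grads = jax.grad(loss_fn)(params, train_data)
        updates, opt_state = optimiser.update(grads, opt_state)
        params = optax.apply_updates(params, updates)
        # early stopping: keep the params with the best validation F1
        f1 = evaluate_query_f1(params, valid_data)  # hypothetical helper
        if f1 > best_f1:
            best_f1, best_params = f1, params
    return best_params
```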