reproducibilityindex.ai

Pointer Graph Networks

Authors: Petar Veličković, Lars Buesing, Matthew Overlan, Razvan Pascanu, Oriol Vinyals, Charles Blundell

NeurIPS 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our results, summarised in Table 1, clearly indicate outperformance and generalisation of our PGN model, especially on the larger-scale test sets. Experimental setup: As in [47, 55], we evaluate out-of-distribution generalisation, training on operation sequences for small input sets (n = 20 entities with ops = 30 operations), then testing on up to 5× larger inputs (n = 50, ops = 75 and n = 100, ops = 150).
Researcher Affiliation	Industry	Petar Veličković, Lars Buesing, Matthew C. Overlan, Razvan Pascanu, Oriol Vinyals and Charles Blundell, DeepMind, {petarv,lbuesing,moverlan,razp,vinyals,cblundell}@google.com
Pseudocode	Yes	Figure 2: Pseudocode of DSU operations; initialisation and find(u) (Left), union(u, v) (Middle) and query-union(u, v), giving ground-truth values of ˆy(t) (Right).
Open Source Code	Yes	for brevity, we delegate further descriptions of their operations to Appendix C, and provide our C++ implementation of the LCT in the supplementary material.
Open Datasets	No	The paper describes generating operations by sampling input node pairs uniformly at random, which forms a custom dataset. No concrete access information (specific link, DOI, repository, or formal citation with authors/year) for a publicly available or open dataset was provided.
Dataset Splits	Yes	Experimental setup: As in [47, 55], we evaluate out-of-distribution generalisation, training on operation sequences for small input sets (n = 20 entities with ops = 30 operations), then testing on up to 5× larger inputs (n = 50, ops = 75 and n = 100, ops = 150). In line with [47], we generate 70 sequences for training, and 35 sequences across each test size category for testing. We perform early stopping, retrieving the model which achieved the best query F1 score on a validation set of 35 small sequences (n = 20, ops = 30).
Hardware Specification	No	No specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments were provided.
Software Dependencies	No	The paper mentions JAX [2] and Haiku [18] but does not provide specific version numbers for these or any other software components.
Experiment Setup	Yes	All models compute k = 32 latent features in each layer, and are trained for 5,000 epochs using Adam [22] with learning rate of 0.005. We perform early stopping, retrieving the model which achieved the best query F1 score on a validation set of 35 small sequences (n = 20, ops = 30).
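
The Pseudocode row refers to DSU initialisation, find(u), union(u, v) and query-union(u, v), and the Open Datasets and Dataset Splits rows describe operation sequences built from node pairs sampled uniformly at random. The sketch below shows one way such sequences might be generated in Python. It is not a reproduction of the paper's Figure 2. Union by size is a standard heuristic swapped in here for illustration. The label convention of query_union (True when the two nodes are already connected) is an assumption. The names DSU and sample_sequence, the seed, and the choice to allow u == v in a sampled pair are all hypothetical.

```python
import random


class DSU:
    """Disjoint-set union with path compression and union by size (illustrative)."""

    def __init__(self, n):
        self.parent = list(range(n))  # every entity starts as its own root
        self.size = [1] * n

    def find(self, u):
        # First pass: locate the root of u's set.
        root = u
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: path compression, pointing visited nodes at the root.
        while self.parent[u] != root:
            self.parent[u], u = root, self.parent[u]
        return root

    def union(self, u, v):
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return
        # Attach the smaller tree under the larger one (assumed heuristic).
        if self.size[ru] < self.size[rv]:
            ru, rv = rv, ru
        self.parent[rv] = ru
        self.size[ru] += self.size[rv]

    def query_union(self, u, v):
        # Assumed semantics: report connectivity first, then merge if needed.
        connected = self.find(u) == self.find(v)
        if not connected:
            self.union(u, v)
        return connected


def sample_sequence(n, ops, rng):
    """Sample `ops` uniformly random node pairs over `n` entities and label them."""
    dsu = DSU(n)
    pairs = [(rng.randrange(n), rng.randrange(n)) for _ in range(ops)]
    labels = [dsu.query_union(u, v) for u, v in pairs]
    return pairs, labels


rng = random.Random(0)  # hypothetical seed, for repeatability only
# Sizes taken from the Dataset Splits row: 70 training sequences, n = 20, ops = 30.
train = [sample_sequence(20, 30, rng) for _ in range(70)]
```

The same helper could be called with (n = 50, ops = 75) or (n = 100, ops = 150) to build the larger test categories described in the table.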