How to transfer algorithmic reasoning knowledge to learn new algorithms?
Authors: Louis-Pascal Xhonneux, Andreea-Ioana Deac, Petar Veličković, Jian Tang
NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To investigate this empirically we create a dataset including 9 algorithms and 3 different graph types. We validate this empirically and show how instead multi-task learning can be used to achieve the transfer of algorithmic reasoning knowledge. |
| Researcher Affiliation | Collaboration | Louis-Pascal A. C. Xhonneux, Université de Montréal, Mila, xhonneul@mila.quebec; Andreea Deac, Université de Montréal, Mila, deacandr@mila.quebec; Petar Veličković, DeepMind, London, UK, petarv@google.com; Jian Tang, HEC Montréal, Mila, jian.tang@hec.ca |
| Pseudocode | Yes | Algorithm 1 Parallel. Input: graph G, weights w, source index i. initialise_nodes(G.vertices, i); repeat: for (u, v) ∈ G.edges() do relax_edge(u, v, w) end for; until none of the nodes change. (A runnable sketch of this loop follows the table.) |
| Open Source Code | No | All code to generate data and train models will be released upon acceptance with an MIT license. |
| Open Datasets | No | To investigate this empirically we create a dataset including 9 algorithms and 3 different graph types. |
| Dataset Splits | No | We train using ADAM [28] with a learning rate of 0.0005, a batch size of 64, and use early stopping with a patience of 10 to prevent overfitting. We test on graphs of 20, 50, and 100 nodes. |
| Hardware Specification | Yes | Each experiment was executed on a V100 GPU in less than 5 hours for the longest experiment. |
| Software Dependencies | No | The paper mentions using ADAM [28] as the optimizer but does not specify version numbers for any software dependencies like programming languages or libraries. |
| Experiment Setup | Yes | For all experiments we use 5,000 graphs of each type (Erdős–Rényi (ER), Barabási–Albert (BA), 2d-Grids (2d-G)) with 20 nodes each. We train using ADAM [28] with a learning rate of 0.0005, a batch size of 64, and use early stopping with a patience of 10 to prevent overfitting. We test on graphs of 20, 50, and 100 nodes. The hidden embedding size is set to 32, except for NE++ in the multi-task experiments, where it is 16 to account for the additional expressivity of having several encoders. (See the configuration sketch after the table.) |
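
For readers who want to execute the quoted pseudocode, the following is a minimal Python sketch of the parallel relaxation loop (a synchronous, Bellman-Ford-style update). The function name `parallel_relaxation` and the data layout (an edge list plus a weight dictionary) are assumptions made for illustration; the paper's own implementation is not released here.

```python
def parallel_relaxation(num_nodes, edges, weights, source):
    """Relax every edge against the same snapshot until no node changes.

    edges   -- list of (u, v) index pairs
    weights -- dict mapping (u, v) -> edge weight
    source  -- index of the source node
    """
    INF = float("inf")
    # initialise_nodes: source distance 0, all other nodes infinity
    dist = [INF] * num_nodes
    dist[source] = 0.0

    changed = True
    while changed:  # "repeat ... until none of the nodes change"
        changed = False
        # Collect updates from the current state before applying them,
        # so all edges are relaxed in parallel against one snapshot.
        new_dist = list(dist)
        for (u, v) in edges:  # relax_edge(u, v, w)
            candidate = dist[u] + weights[(u, v)]
            if candidate < new_dist[v]:
                new_dist[v] = candidate
                changed = True
        dist = new_dist
    return dist


# Small usage example on a 3-node graph.
edges = [(0, 1), (1, 2), (0, 2)]
weights = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 5.0}
print(parallel_relaxation(3, edges, weights, source=0))  # [0.0, 1.0, 2.0]
```

The synchronous snapshot (`new_dist`) is what makes the loop "parallel": every edge relaxation in an iteration reads the same distances, mirroring the per-step supervision used when training neural executors.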
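
Likewise, the quoted experiment setup can be illustrated with a short configuration sketch. Only the graph families, node count, graph count, optimiser, learning rate, batch size, and patience come from the table above; the generator parameters (ER edge probability, BA attachment count, grid shape), the use of networkx and PyTorch, and the placeholder model are assumptions.

```python
import networkx as nx
import torch

NUM_GRAPHS, NUM_NODES = 5000, 20  # 5,000 graphs of each type, 20 nodes each

def sample_graph(family: str) -> nx.Graph:
    if family == "ER":    # Erdős–Rényi; edge probability is an assumption
        return nx.erdos_renyi_graph(NUM_NODES, p=0.3)
    if family == "BA":    # Barabási–Albert; attachment count is an assumption
        return nx.barabasi_albert_graph(NUM_NODES, m=2)
    if family == "2d-G":  # 2d grid; 4 x 5 = 20 nodes is an assumed shape
        return nx.convert_node_labels_to_integers(nx.grid_2d_graph(4, 5))
    raise ValueError(family)

datasets = {fam: [sample_graph(fam) for _ in range(NUM_GRAPHS)]
            for fam in ("ER", "BA", "2d-G")}

# Optimiser settings quoted from the paper; `model` is a stand-in module,
# not the authors' processor network.
model = torch.nn.Linear(32, 32)
optimizer = torch.optim.Adam(model.parameters(), lr=0.0005)
BATCH_SIZE, PATIENCE = 64, 10  # early stopping patience of 10
```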