Mathematical Reasoning in Latent Space

Authors: Dennis Lee, Christian Szegedy, Markus Rabe, Sarah Loos, Kshitij Bansal

ICLR 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We design and conduct a simple experiment to study whether neural networks can perform several steps of approximate reasoning in a fixed dimensional latent space. Our experiments show that graph neural networks can make non-trivial predictions about the rewrite-success of statements, even when they propagate predicted latent representations for several steps. (A sketch of this multi-step latent evaluation follows the table.)
Researcher Affiliation | Industry | Dennis Lee, Christian Szegedy, Markus N. Rabe, Kshitij Bansal, and Sarah M. Loos; Google Research, Mountain View, CA, USA; {ldennis,szegedy,mrabe,kbk,smloos}@google.com
Pseudocode | No | The paper describes methods textually and with a diagram (Figure 1 and the schema on page 4), but does not include formal pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We generate the training data from the theorem database of the HOList environment (Bansal et al., 2019b), which contains 19591 theorems and definitions, ordered such that later theorems use only earlier statements in their human proof.
Dataset Splits | Yes | The theorems are split into 11655 training, 3668 validation, and 3620 testing theorems. (An index-based split sketch follows the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU or CPU models.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., programming languages, libraries, or frameworks) that would be needed to replicate the experiment.
Experiment Setup | Yes | Our network has two independent towers, which are each 16-hop graph neural networks with internal node representations in R^128. The output of each of the two towers is fed into a layer that expands the dimension of the node representation to R^1024 with a fully connected layer with shared weights for each node. [...] We add Gaussian noise with β = 10^-3 to the embedding of the goal theorem. [...] and are processed by a three-layer perceptron with 1024 hidden units and rectified linear activations between the layers. Our model is trained with g = 16 choices of T in each batch. (An architecture sketch follows the table.)
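
Below is a minimal sketch of how an index-based split over the chronologically ordered HOList theorem database could reproduce the counts quoted in the "Dataset Splits" row. The assumption that the split is a contiguous slice of the ordered database is ours, not a statement from the paper.

```python
# Sketch of an index-based split over the ordered HOList theorem database.
# `theorems` is assumed to be a list of statements in database order; the
# boundaries reproduce the counts quoted above (11655 / 3668 / 3620), but
# the paper does not say the split is a contiguous slice, so this is only
# illustrative.
def split_theorems(theorems, n_train=11655, n_valid=3668, n_test=3620):
    assert len(theorems) >= n_train + n_valid + n_test
    train = theorems[:n_train]
    valid = theorems[n_train:n_train + n_valid]
    test = theorems[n_train + n_valid:n_train + n_valid + n_test]
    return train, valid, test
```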
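
The setup quoted in the "Experiment Setup" row can be pictured with the following minimal PyTorch sketch. Only the quoted hyperparameters (two independent 16-hop towers, 128-dim node states, a shared per-node expansion to 1024 dims, Gaussian noise on the goal embedding, and a three-layer MLP head with 1024 hidden units) come from the paper; the message-passing rule, the max pooling, the exact noise placement, and all class and function names are assumptions, and the batching over g = 16 choices of T is omitted.

```python
# Minimal sketch, not the paper's implementation: two independent 16-hop
# graph-network towers with 128-dim node states, a shared fully connected
# expansion to 1024 dims per node, and a 3-layer MLP head with ReLU.
import torch
import torch.nn as nn


class GNNTower(nn.Module):
    def __init__(self, node_dim=128, hops=16, expand_dim=1024):
        super().__init__()
        self.hops = hops
        self.msg = nn.Linear(node_dim, node_dim)       # neighbour message (assumed rule)
        self.upd = nn.Linear(node_dim, node_dim)       # self update (assumed rule)
        self.expand = nn.Linear(node_dim, expand_dim)  # shared per-node expansion

    def forward(self, node_feats, adj):
        h = node_feats                                 # [num_nodes, 128]
        for _ in range(self.hops):
            h = torch.relu(self.upd(h) + adj @ self.msg(h))
        h = self.expand(h)                             # [num_nodes, 1024]
        return h.max(dim=0).values                     # pool nodes -> [1024] (assumed pooling)


class RewriteSuccessModel(nn.Module):
    def __init__(self, noise_scale=1e-3):
        super().__init__()
        self.goal_tower = GNNTower()                   # embeds the goal theorem
        self.param_tower = GNNTower()                  # embeds the rewrite parameter
        self.noise_scale = noise_scale
        self.head = nn.Sequential(                     # 3-layer MLP, 1024 hidden units
            nn.Linear(2048, 1024), nn.ReLU(),
            nn.Linear(1024, 1024), nn.ReLU(),
            nn.Linear(1024, 1),
        )

    def forward(self, goal_feats, goal_adj, param_feats, param_adj):
        g = self.goal_tower(goal_feats, goal_adj)
        g = g + self.noise_scale * torch.randn_like(g)  # noise on the goal embedding
        p = self.param_tower(param_feats, param_adj)
        return self.head(torch.cat([g, p], dim=-1))     # rewrite-success logit
```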
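
Finally, the multi-step experiment summarized in the "Research Type" row amounts to a simple loop: the latent embedding of the goal is propagated through a prediction network once per rewrite step, and rewrite success is classified from the propagated vector alone, without decoding back to a formula. The three callables below are hypothetical placeholders for the trained networks, not APIs from the paper's (unreleased) code.

```python
# Sketch of evaluating approximate reasoning purely in latent space.
# `embed`, `predict_next_embedding`, and `rewrite_success_logit` are
# placeholder callables standing in for the trained networks.
def propagate_and_classify(goal_graph, rewrite_params, steps,
                           embed, predict_next_embedding, rewrite_success_logit):
    z = embed(goal_graph)                  # latent embedding of the goal statement
    for params in rewrite_params[:steps]:  # one predicted latent rewrite per step
        z = predict_next_embedding(z, embed(params))
    return rewrite_success_logit(z)        # classify from the latent state only
```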