Graph Representations for Higher-Order Logic and Theorem Proving
Authors: Aditya Paliwal, Sarah Loos, Markus Rabe, Kshitij Bansal, Christian Szegedy
AAAI 2020, pp. 2967-2974 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This paper presents the first use of graph neural networks (GNNs) for higher-order proof search and demonstrates that GNNs can improve upon state-of-the-art results in this domain. Interactive, higher-order theorem provers allow for the formalization of most mathematical theories and have been shown to pose a significant challenge for deep learning. Higher-order logic is highly expressive and, even though it is well-structured with a clearly defined grammar and semantics, there still remains no well-established method to convert formulas into graph-based representations. In this paper, we consider several graphical representations of higher-order logic and evaluate them against the HOList benchmark for higher-order theorem proving. |
| Researcher Affiliation | Industry | Aditya Paliwal (Google AI Resident), Google Research, adipal@google.com; Sarah M. Loos, Google Research, smloos@google.com; Markus N. Rabe, Google Research, mrabe@google.com; Kshitij Bansal, Google Research, kbk@google.com; Christian Szegedy, Google Research, szegedy@google.com |
| Pseudocode | No | The paper describes the GNN process with mathematical equations and descriptive text, but it does not include a formal 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing their source code for the methodology described, nor does it provide a direct link to a code repository for their work. |
| Open Datasets | Yes | HOList (Bansal et al. 2019) is a recently published learning environment for mathematics consisting of a stateless theorem proving API, a benchmark consisting of over twenty thousand mathematical theorems and their proofs, and a neural theorem prover called DeepHOL. |
| Dataset Splits | Yes | The training set is derived from the proofs of 10,200 top-level theorems and has around 375,000 proof steps (subgoals) to learn from. We have evaluated the end-to-end prover performance on the standard validation split of HOList which consists of 3,225 held out theorems of the complex analysis corpus. |
| Hardware Specification | Yes | We used eight NVIDIA Tesla V100 GPUs for distributed training, an additional GPU was used purely for evaluation, and we maintained a separate parameter server on a CPU machine. |
| Software Dependencies | No | The paper mentions software components like 'Adam Optimizer' and 'ReLU activations' but does not provide specific version numbers for these or any other key software dependencies. |
| Experiment Setup | Yes | For training, we use an Adam Optimizer (Kingma and Ba 2014) with initial learning rate 0.0001 and decay rate 0.98. ... Excluding the two GNNs, dropout is added before every dense layer of the network with a keep probability of 0.7. ... We sample 16 positive (goal, premise) pairs from our training data. ... giving a total batch size of 4,096. ... The token features xt are trainable embedding vectors of size 128. ... The final node embedding has size 128. A dropout of 0.5 is applied to all MLPs. ... Using two Conv 1x1 layers, we expand the 128 dimensional node embeddings to 512, and then 1024, with ReLU activations and a dropout rate of 0.5. ... In our setting, we used k1 = 5, k2 = 20. |
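The Research Type row above quotes the paper's central idea: representing higher-order logic formulas as graphs and feeding them to a graph neural network. The sketch below illustrates one such conversion for a toy term. The nested-tuple term format, the `term_to_graph` helper, and the `share_subexpressions` switch are illustrative assumptions; the paper's actual representations handle the full HOList term syntax, including variables and abstractions.

```python
# A minimal sketch of turning a term into a graph, in the spirit of the
# representations the paper compares.  The term format and helper names are
# assumptions for illustration, not the authors' code.

def term_to_graph(term, share_subexpressions=True):
    """Return (node_labels, edges) for a term given as nested tuples.

    A leaf is a string token; an internal node is a tuple whose first element
    is the operator token and whose remaining elements are sub-terms.  With
    share_subexpressions=True, syntactically identical sub-terms are merged
    into a single node, turning the tree into a DAG.
    """
    node_labels = []  # node index -> token at that node
    edges = []        # (parent_index, child_index) pairs
    cache = {}        # sub-term -> node index, used for sharing

    def visit(t):
        if share_subexpressions and t in cache:
            return cache[t]
        if isinstance(t, tuple):
            label, children = t[0], t[1:]
        else:
            label, children = t, ()
        idx = len(node_labels)
        node_labels.append(label)
        for child in children:
            edges.append((idx, visit(child)))
        if share_subexpressions:
            cache[t] = idx
        return idx

    visit(term)
    return node_labels, edges


# Example: the term f(x, g(x)) -- the two occurrences of x collapse to one node.
labels, edges = term_to_graph(("f", "x", ("g", "x")))
print(labels)  # ['f', 'x', 'g']
print(edges)   # [(0, 1), (2, 1), (0, 2)]
```

Sharing identical sub-terms merges repeated occurrences into a single node, turning the syntax tree into a DAG; this kind of subexpression sharing is one of the representation variants the paper compares.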
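The Pseudocode row notes that the GNN is described with equations rather than an algorithm block. The NumPy sketch below shows one generic message-passing round over a (parent, child) edge list like the one produced by `term_to_graph` above. The single weight matrix per direction, the mean aggregation, and the ReLU update are simplifying assumptions, not the paper's exact update rule.

```python
import numpy as np

def message_passing_step(node_emb, edges, w_child, w_parent, w_update):
    """One generic round of message passing.

    node_emb: (num_nodes, d) array of current node embeddings.
    edges: list of (parent_index, child_index) pairs.
    """
    num_nodes, d = node_emb.shape
    from_children = np.zeros((num_nodes, d))
    from_parents = np.zeros((num_nodes, d))
    child_count = np.zeros((num_nodes, 1))
    parent_count = np.zeros((num_nodes, 1))
    for parent, child in edges:
        from_children[parent] += node_emb[child]   # information flowing up the term
        from_parents[child] += node_emb[parent]    # information flowing down the term
        child_count[parent] += 1
        parent_count[child] += 1
    from_children /= np.maximum(child_count, 1)    # mean over children
    from_parents /= np.maximum(parent_count, 1)    # mean over parents
    # Combine each node's own state with both aggregated neighbourhoods, then update.
    h = np.concatenate(
        [node_emb, from_children @ w_child, from_parents @ w_parent], axis=1)
    return np.maximum(h @ w_update, 0.0)           # ReLU

# Toy graph for f(x, g(x)) with the shared x node: 0='f', 1='x', 2='g'.
edges = [(0, 1), (2, 1), (0, 2)]
d = 128
rng = np.random.default_rng(0)
emb = rng.normal(size=(3, d))
out = message_passing_step(
    emb, edges,
    rng.normal(size=(d, d)), rng.normal(size=(d, d)), rng.normal(size=(3 * d, d)))
print(out.shape)  # (3, 128)
```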
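The Experiment Setup row quotes the training hyperparameters scattered through the paper. The dictionary below gathers the quoted values in one place; the key names and grouping are illustrative, and the quoted text does not explain the roles of k1 and k2.

```python
# Hyperparameters quoted in the Experiment Setup row, collected into a single
# configuration dict.  Only the numeric values come from the quoted text; the
# key names and structure are assumptions for illustration.
CONFIG = {
    "optimizer": {
        "name": "adam",
        "initial_learning_rate": 1e-4,
        "decay_rate": 0.98,
    },
    "batching": {
        "positive_goal_premise_pairs": 16,
        "total_batch_size": 4096,
    },
    "graph_neural_network": {
        "token_embedding_size": 128,
        "final_node_embedding_size": 128,
        "mlp_dropout_rate": 0.5,
        "k1": 5,    # quoted as "k1 = 5"; its exact role is not given in the row above
        "k2": 20,   # quoted as "k2 = 20"
    },
    "node_embedding_expansion": {
        # Two Conv 1x1 layers expand node embeddings 128 -> 512 -> 1024.
        "conv1x1_widths": [512, 1024],
        "activation": "relu",
        "dropout_rate": 0.5,
    },
    "dense_layers": {
        # Dropout before every dense layer outside the two GNNs.
        "dropout_keep_probability": 0.7,
    },
}
```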