Premise Selection for Theorem Proving by Deep Graph Embedding
Authors: Mingzhe Wang, Yihe Tang, Jian Wang, Jia Deng
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform experiments using the HolStep dataset [10], which consists of over two million conjecture-statement pairs that can be used to evaluate premise selection. The results show that our graph-embedding approach achieves large improvement over sequence-based models. In particular, our approach improves the state-of-the-art accuracy on HolStep by 7.3%. |
| Researcher Affiliation | Academia | Mingzhe Wang, Yihe Tang, Jian Wang, Jia Deng (University of Michigan, Ann Arbor) |
| Pseudocode | No | The paper describes methods and equations but does not contain a dedicated pseudocode or algorithm block. |
| Open Source Code | No | The paper does not provide any explicit statements or links indicating the availability of open-source code for the described methodology. |
| Open Datasets | Yes | We evaluate our approach on the HolStep dataset [10], a recently introduced benchmark for evaluating machine learning approaches for theorem proving. It was constructed from the proof trace files of the HOL Light theorem prover [7] on its multivariate analysis library [48] and the formal proof of the Kepler conjecture. |
| Dataset Splits | Yes | The dataset contains 11,410 conjectures, including 9,999 in the training set and 1,411 in the test set. There are 2,209,076 conjecture-statement pairs in total. We hold out 700 conjectures from the training set as the validation set to tune hyperparameters and perform ablation analysis. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments. |
| Software Dependencies | No | The paper mentions RMSProp as an optimizer but does not specify versions for software libraries or dependencies. |
| Experiment Setup | Yes | We train our networks using RMSProp [50] with 0.001 learning rate and 1 × 10⁻⁴ weight decay. We lower the learning rate by 3X after each epoch. We train all models for five epochs and all networks converge after about three or four epochs. |
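
The reported experiment setup maps onto a standard training loop. Below is a minimal sketch, assuming a PyTorch implementation (the paper releases no code and does not name its framework); the placeholder MLP and random tensors are assumptions standing in for the paper's graph-embedding network and the HolStep conjecture-statement pair features. Only the optimizer, learning rate, weight decay, per-epoch 3x decay, and five-epoch budget come from the paper.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder data and model (assumptions): random "pair embeddings" and a small
# MLP scorer stand in for HolStep features and the paper's graph embedding network.
features = torch.randn(1024, 256)
labels = torch.randint(0, 2, (1024,)).float()
loader = DataLoader(TensorDataset(features, labels), batch_size=64, shuffle=True)
model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 1))

# As reported: RMSProp, learning rate 0.001, weight decay 1e-4.
optimizer = torch.optim.RMSprop(model.parameters(), lr=1e-3, weight_decay=1e-4)
# "Lower the learning rate by 3X after each epoch": multiply the lr by 1/3 per epoch.
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=1.0 / 3.0)
criterion = nn.BCEWithLogitsLoss()  # premise selection posed as binary classification

for epoch in range(5):  # "train all models for five epochs"
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x).squeeze(-1), y)
        loss.backward()
        optimizer.step()
    scheduler.step()  # apply the 3x decay once per epoch
```

Stepping `ExponentialLR` with `gamma=1/3` once per epoch reproduces the described schedule (lr divided by three after every epoch) without any custom scheduling code.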