GamePad: A Learning Environment for Theorem Proving
Authors: Daniel Huang, Prafulla Dhariwal, Dawn Song, Ilya Sutskever
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We use GamePad to synthesize proofs for a simple algebraic rewrite problem and train baseline models for a formalization of the Feit-Thompson theorem. We address position evaluation (i.e., predict the number of proof steps left) and tactic prediction (i.e., predict the next proof step) tasks, which arise naturally in tactic-based theorem proving. |
| Researcher Affiliation | Collaboration | Daniel Huang (dehuang@berkeley.edu), Prafulla Dhariwal (prafulla@openai.com), Dawn Song (dawnsong@cs.berkeley.edu), Ilya Sutskever (ilyasu@openai.com) |
| Pseudocode | No | The paper describes the models and methods used, but does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code for GamePad, as well as the associated data sets, models and results, are open source on GitHub at https://github.com/ml4tp/gamepad. |
| Open Datasets | Yes | The code for GamePad, as well as the associated data sets, models and results, are open source on GitHub at https://github.com/ml4tp/gamepad. |
| Dataset Splits | Yes | For our tasks, we split the lemmas in the data set into a training, validation and test set in the ratio 8 : 1 : 1, and ensure that the number of proof states in the splits are in a similar ratio. |
| Hardware Specification | No | The paper mentions 'CPU' and 'GPU' in the context of speedups (Table 1), but does not provide specific hardware details such as exact GPU/CPU models or processor types used for experiments. |
| Software Dependencies | No | The paper mentions PyTorch as the framework used ('All neural net models were trained using PyTorch'), but does not provide specific version numbers for PyTorch or other ancillary software dependencies. |
| Experiment Setup | Yes | We train GRU (Cho et al., 2014) and TreeLSTM (Tai et al., 2015) models using mini-batches of 32 proof states, set the RNN state size to 128, and use the Adam optimizer (Kingma & Ba, 2015) with a learning rate of 0.001. We use input (Srivastava et al., 2014) and weight (Merity et al., 2017) dropout with a rate of 0.1 for the TreeLSTM models (higher rates led to much slower learning). |
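The paper does not publish its split routine, but the 8:1:1 lemma split described in the Dataset Splits row can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the function name and seeding are assumptions.

```python
import random

def split_lemmas(lemmas, seed=0):
    """Shuffle lemmas and split them 8:1:1 into train/validation/test,
    mirroring the ratio reported in the paper (illustrative sketch)."""
    rng = random.Random(seed)
    shuffled = list(lemmas)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(0.8 * n)   # 80% training
    n_valid = int(0.1 * n)   # 10% validation; remainder is test
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test
```

Note that the paper additionally checks that the *proof states* (not just the lemmas) end up in a similar 8:1:1 ratio across splits, which a purely lemma-level shuffle like this one does not guarantee.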
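The hyperparameters quoted in the Experiment Setup row (mini-batches of 32 proof states, state size 128, Adam with learning rate 0.001) can be assembled into a minimal PyTorch baseline for the tactic prediction task. This is a hedged sketch, not the authors' model: the tokenization, vocabulary size, and number of tactic classes below are placeholder assumptions, and the TreeLSTM variant with input/weight dropout is omitted.

```python
import torch
import torch.nn as nn

STATE_SIZE = 128   # RNN state size, from the paper
BATCH_SIZE = 32    # proof states per mini-batch, from the paper
LR = 1e-3          # Adam learning rate, from the paper

class ProofStateGRU(nn.Module):
    """Encode a token sequence for a proof state with a GRU and
    classify the next tactic (simplified tactic-prediction baseline)."""
    def __init__(self, vocab_size, n_tactics):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, STATE_SIZE)
        self.gru = nn.GRU(STATE_SIZE, STATE_SIZE, batch_first=True)
        self.head = nn.Linear(STATE_SIZE, n_tactics)

    def forward(self, tokens):
        _, h = self.gru(self.embed(tokens))  # h: (1, batch, STATE_SIZE)
        return self.head(h[-1])              # logits over tactics

# Placeholder sizes: vocab_size and n_tactics are illustrative only.
model = ProofStateGRU(vocab_size=1000, n_tactics=23)
opt = torch.optim.Adam(model.parameters(), lr=LR)

# One training step on random stand-in data.
tokens = torch.randint(0, 1000, (BATCH_SIZE, 40))
labels = torch.randint(0, 23, (BATCH_SIZE,))
loss = nn.functional.cross_entropy(model(tokens), labels)
opt.zero_grad()
loss.backward()
opt.step()
```

The position evaluation task from the same row would reuse this encoder with a regression or bucketed-classification head over the number of proof steps remaining.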