GamePad: A Learning Environment for Theorem Proving

Authors: Daniel Huang, Prafulla Dhariwal, Dawn Song, Ilya Sutskever

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We use GamePad to synthesize proofs for a simple algebraic rewrite problem and train baseline models for a formalization of the Feit-Thompson theorem. We address position evaluation (i.e., predict the number of proof steps left) and tactic prediction (i.e., predict the next proof step) tasks, which arise naturally in tactic-based theorem proving.
Researcher Affiliation | Collaboration | Daniel Huang (dehuang@berkeley.edu), Prafulla Dhariwal (prafulla@openai.com), Dawn Song (dawnsong@cs.berkeley.edu), Ilya Sutskever (ilyasu@openai.com)
Pseudocode | No | The paper describes the models and methods used, but does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | The code for GamePad, as well as the associated data sets, models and results, are open source on GitHub at https://github.com/ml4tp/gamepad.
Open Datasets | Yes | The code for GamePad, as well as the associated data sets, models and results, are open source on GitHub at https://github.com/ml4tp/gamepad.
Dataset Splits | Yes | For our tasks, we split the lemmas in the data set into a training, validation and test set in the ratio 8 : 1 : 1, and ensure that the number of proof states in the splits are in a similar ratio.
Hardware Specification | No | The paper mentions 'CPU' and 'GPU' in the context of speedups (Table 1), but does not provide specific hardware details such as exact GPU/CPU models or processor types used for experiments.
Software Dependencies | No | The paper mentions 'PyTorch' as the framework used ('All neural net models were trained using PyTorch'), but does not provide specific version numbers for PyTorch or other ancillary software dependencies.
Experiment Setup | Yes | We train GRU (Cho et al., 2014) and TreeLSTM (Tai et al., 2015) models using mini-batches of 32 proof states, set the RNN state size to 128, and use the Adam optimizer (Kingma & Ba, 2015) with a learning rate of 0.001. We use input (Srivastava et al., 2014) and weight (Merity et al., 2017) dropout with a rate of 0.1 for the TreeLSTM models (higher rates led to much slower learning).
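
The Research Type row quotes the paper's two learning tasks. As a concrete illustration, below is a minimal PyTorch sketch of that setup, assuming proof states arrive as flattened token sequences: a shared GRU encoder feeding a regression head for position evaluation (steps left) and a classification head for tactic prediction. The vocabulary size, tactic-class count, and tokenization are hypothetical placeholders, not GamePad's actual representation, and the paper applies its dropout to TreeLSTM models; input dropout is shown here on a GRU only for simplicity.

```python
# Hedged sketch of the two tasks: one shared encoder, one head per task.
# vocab_size and n_tactics are illustrative assumptions, not values from the paper.
import torch.nn as nn

class ProofStateModel(nn.Module):
    def __init__(self, vocab_size=1000, hidden=128, n_tactics=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.drop = nn.Dropout(0.1)  # input dropout at the quoted rate of 0.1
        self.gru = nn.GRU(hidden, hidden, batch_first=True)  # state size 128 as quoted
        self.position_head = nn.Linear(hidden, 1)        # steps left (regression)
        self.tactic_head = nn.Linear(hidden, n_tactics)  # next tactic (classification)

    def forward(self, tokens):
        # tokens: (batch, seq_len) integer ids for a flattened proof state
        _, h = self.gru(self.drop(self.embed(tokens)))
        h = h.squeeze(0)  # final GRU state serves as the proof-state embedding
        return self.position_head(h).squeeze(-1), self.tactic_head(h)
```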
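
The Dataset Splits row describes an 8 : 1 : 1 lemma-level split whose proof-state counts fall in a similar ratio. A minimal sketch of one way to implement and check such a split, assuming `lemmas` maps each lemma name to its list of proof states; the shuffling and the sanity check are illustrative, not the authors' exact procedure.

```python
import random

def split_lemmas(lemmas, seed=0):
    """Split lemma names 8:1:1 and report proof-state counts per split."""
    names = sorted(lemmas)
    random.Random(seed).shuffle(names)
    n = len(names)
    train = names[:int(0.8 * n)]
    valid = names[int(0.8 * n):int(0.9 * n)]
    test = names[int(0.9 * n):]
    # Sanity check: proof-state counts should also be roughly 8:1:1.
    counts = [sum(len(lemmas[m]) for m in part) for part in (train, valid, test)]
    print("proof states per split:", counts)
    return train, valid, test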
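```

The Experiment Setup row pins down the mini-batch size (32), RNN state size (128), optimizer (Adam with learning rate 0.001), and dropout rate (0.1). Below is a hedged training-loop sketch under those settings, reusing the hypothetical ProofStateModel above; the per-task losses and data loading are simplifying assumptions, and weight dropout (Merity et al., 2017) is omitted for brevity.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

def train(model, dataset, epochs=1):
    # Mini-batches of 32 proof states; Adam with lr 0.001, as quoted above.
    loader = DataLoader(dataset, batch_size=32, shuffle=True)
    opt = torch.optim.Adam(model.parameters(), lr=0.001)
    mse, xent = nn.MSELoss(), nn.CrossEntropyLoss()  # assumed per-task losses
    for _ in range(epochs):
        for tokens, steps_left, tactic in loader:
            steps_pred, tactic_logits = model(tokens)
            # Training both heads with a joint objective is an illustrative choice.
            loss = mse(steps_pred, steps_left.float()) + xent(tactic_logits, tactic)
            opt.zero_grad()
            loss.backward()
            opt.step()
```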