GamePad: A Learning Environment for Theorem Proving

Authors: Daniel Huang, Prafulla Dhariwal, Dawn Song, Ilya Sutskever

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We use GamePad to synthesize proofs for a simple algebraic rewrite problem and train baseline models for a formalization of the Feit-Thompson theorem. We address position evaluation (i.e., predict the number of proof steps left) and tactic prediction (i.e., predict the next proof step) tasks, which arise naturally in tactic-based theorem proving.
Researcher Affiliation | Collaboration | Daniel Huang (dehuang@berkeley.edu), Prafulla Dhariwal (prafulla@openai.com), Dawn Song (dawnsong@cs.berkeley.edu), Ilya Sutskever (ilyasu@openai.com)
Pseudocode | No | The paper describes the models and methods used, but does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | The code for GamePad, as well as the associated data sets, models and results, are open source on GitHub at https://github.com/ml4tp/gamepad.
Open Datasets | Yes | The code for GamePad, as well as the associated data sets, models and results, are open source on GitHub at https://github.com/ml4tp/gamepad.
Dataset Splits | Yes | For our tasks, we split the lemmas in the data set into a training, validation and test set in the ratio 8 : 1 : 1, and ensure that the number of proof states in the splits are in a similar ratio.
Hardware Specification | No | The paper mentions 'CPU' and 'GPU' in the context of speedups (Table 1), but does not provide specific hardware details such as exact GPU/CPU models or processor types used for experiments.
Software Dependencies | No | The paper mentions 'PyTorch' as the framework used ('All neural net models were trained using PyTorch'), but does not provide specific version numbers for PyTorch or other ancillary software dependencies.
Experiment Setup | Yes | We train GRU (Cho et al., 2014) and TreeLSTM (Tai et al., 2015) models using mini-batches of 32 proof states, set the RNN state size to 128, and use the Adam optimizer (Kingma & Ba, 2015) with a learning rate of 0.001. We use input (Srivastava et al., 2014) and weight (Merity et al., 2017) dropout with a rate of 0.1 for the TreeLSTM models (higher rates led to much slower learning).
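
The Research Type row quotes the paper's two learning tasks. As a concrete illustration, below is a minimal PyTorch sketch of that setup, assuming proof states arrive as flattened token sequences: a shared GRU encoder feeding a regression head for position evaluation (steps left) and a classification head for tactic prediction. The vocabulary size, tactic-class count, and tokenization are hypothetical placeholders, not GamePad's actual representation, and the paper applies its dropout to TreeLSTM models; input dropout is shown here on a GRU only for simplicity.

```python
# Hedged sketch of the two tasks: one shared encoder, one head per task.
# vocab_size and n_tactics are illustrative assumptions, not values from the paper.
import torch.nn as nn

class ProofStateModel(nn.Module):
    def __init__(self, vocab_size=1000, hidden=128, n_tactics=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.drop = nn.Dropout(0.1)  # input dropout at the quoted rate of 0.1
        self.gru = nn.GRU(hidden, hidden, batch_first=True)  # state size 128 as quoted
        self.position_head = nn.Linear(hidden, 1)        # steps left (regression)
        self.tactic_head = nn.Linear(hidden, n_tactics)  # next tactic (classification)

    def forward(self, tokens):
        # tokens: (batch, seq_len) integer ids for a flattened proof state
        _, h = self.gru(self.drop(self.embed(tokens)))
        h = h.squeeze(0)  # final GRU state serves as the proof-state embedding
        return self.position_head(h).squeeze(-1), self.tactic_head(h)
```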
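
The Dataset Splits row describes an 8 : 1 : 1 lemma-level split whose proof-state counts fall in a similar ratio. A minimal sketch of one way to implement and check such a split, assuming `lemmas` maps each lemma name to its list of proof states; the shuffling and the sanity check are illustrative, not the authors' exact procedure.

```python
import random

def split_lemmas(lemmas, seed=0):
    """Split lemma names 8:1:1 and report proof-state counts per split."""
    names = sorted(lemmas)
    random.Random(seed).shuffle(names)
    n = len(names)
    train = names[:int(0.8 * n)]
    valid = names[int(0.8 * n):int(0.9 * n)]
    test = names[int(0.9 * n):]
    # Sanity check: proof-state counts should also be roughly 8:1:1.
    counts = [sum(len(lemmas[m]) for m in part) for part in (train, valid, test)]
    print("proof states per split:", counts)
    return train, valid, test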
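```

The Experiment Setup row pins down the mini-batch size (32), RNN state size (128), optimizer (Adam with learning rate 0.001), and dropout rate (0.1). Below is a hedged training-loop sketch under those settings, reusing the hypothetical ProofStateModel above; the per-task losses and data loading are simplifying assumptions, and weight dropout (Merity et al., 2017) is omitted for brevity.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

def train(model, dataset, epochs=1):
    # Mini-batches of 32 proof states; Adam with lr 0.001, as quoted above.
    loader = DataLoader(dataset, batch_size=32, shuffle=True)
    opt = torch.optim.Adam(model.parameters(), lr=0.001)
    mse, xent = nn.MSELoss(), nn.CrossEntropyLoss()  # assumed per-task losses
    for _ in range(epochs):
        for tokens, steps_left, tactic in loader:
            steps_pred, tactic_logits = model(tokens)
            # Training both heads with a joint objective is an illustrative choice.
            loss = mse(steps_pred, steps_left.float()) + xent(tactic_logits, tactic)
            opt.zero_grad()
            loss.backward()
            opt.step()
```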