Reinforcement Learning of Theorem Proving

Authors: Cezary Kaliszyk, Josef Urban, Henryk Michalewski, Miroslav Olšák

NeurIPS 2018

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "The evaluation is done on two datasets of first-order problems exported from the Mizar Mathematical Library [13] by the MPTP system [45]." |
| Researcher Affiliation | Collaboration | Cezary Kaliszyk (University of Innsbruck); Josef Urban (Czech Technical University in Prague); Henryk Michalewski (University of Warsaw; Institute of Mathematics of the Polish Academy of Sciences; deepsense.ai); Miroslav Olšák (Charles University) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks; the methods are described in prose. |
| Open Source Code | No | The paper does not provide concrete access to source code for the described methodology (e.g., a repository link, an explicit code-release statement, or code in supplementary materials). It mentions the implementation size but not its availability. |
| Open Datasets | Yes | "The evaluation is done on two datasets of first-order problems exported from the Mizar Mathematical Library [13] by the MPTP system [45]. The larger Miz40 dataset consists of 32524 problems..." (https://github.com/JUrban/deepmath) |
| Dataset Splits | No | "We therefore randomly split Miz40 into a training set of 29272 problems and a testing set of 3252 problems." This describes training and testing splits but does not explicitly mention a separate validation split, nor how tuning on the M2k dataset was performed with respect to splits. |
| Hardware Specification | Yes | "All problems are run on the same hardware and with the same memory limits." (Intel(R) Xeon(R) CPU E5-2698 v3 @ 2.30GHz with 256 GB RAM) |
| Software Dependencies | No | The paper mentions software components such as XGBoost, LIBLINEAR, and Scikit-learn, but does not provide version numbers for these dependencies. |
| Experiment Setup | Yes | "The final values that we use both for policy and value learning are as follows: maximum number of iterations = 400, maximum tree depth = 9, ETA (learning rate) = 0.3, early stopping rounds = 200, lambda (weight regularization) = 1.5." |
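The reported Miz40 split (29272 training / 3252 testing problems out of 32524) corresponds to holding out roughly 10% at random. A minimal Python sketch of such a split; the function name, seed, and exact test fraction are our assumptions, since the paper does not describe its shuffling procedure:

```python
import random

def split_problems(problems, test_fraction=0.1, seed=0):
    """Randomly split a list of problems into (train, test).

    Mirrors the roughly 9:1 Miz40 split reported in the paper
    (29272 training / 3252 testing out of 32524 problems); the
    seed and exact fraction here are illustrative assumptions.
    """
    rng = random.Random(seed)
    shuffled = list(problems)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# With 32524 problems, a 10% holdout reproduces the reported sizes.
train, test = split_problems(range(32524))
```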
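The quoted XGBoost settings can be collected into a parameter dictionary as they would be passed to `xgboost.train`. This is a sketch, not the authors' code: the objective and the construction of the `DMatrix` inputs are not specified in the paper, so they are omitted or marked as assumptions below.

```python
# Hyperparameters for the policy/value learners, as quoted in the report.
PARAMS = {
    "max_depth": 9,   # maximum tree depth
    "eta": 0.3,       # learning rate
    "lambda": 1.5,    # L2 weight regularization
}
NUM_BOOST_ROUND = 400        # maximum number of iterations
EARLY_STOPPING_ROUNDS = 200

def train_booster(dtrain, dvalid):
    """Train an XGBoost booster with the settings above.

    `dtrain`/`dvalid` are assumed to be xgboost.DMatrix objects built
    elsewhere; xgboost is imported lazily so the configuration can be
    inspected without the library installed.
    """
    import xgboost as xgb
    return xgb.train(
        PARAMS,
        dtrain,
        num_boost_round=NUM_BOOST_ROUND,
        evals=[(dvalid, "valid")],
        early_stopping_rounds=EARLY_STOPPING_ROUNDS,
    )
```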