Graph2Tac: Online Representation Learning of Formal Math Concepts
Authors: Lasse Blaauwbroek, Mirek Olšák, Jason Rute, Fidel Ivan Schaposnik Massolo, Jelle Piepenbrock, Vasily Pestun
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We extensively benchmark two such online solvers implemented in the Tactician platform for the Coq proof assistant: First, Tactician's online k-nearest neighbor solver, which can learn from recent proofs, shows a 1.72× improvement in theorems proved over an offline equivalent. |
| Researcher Affiliation | Collaboration | (1) IHES, (2) University of Cambridge, (3) IBM Research, (4) Radboud University, (5) Czech Technical University in Prague |
| Pseudocode | No | The paper describes the model architecture and its operational steps (e.g., Figure 4, Section 3.2, Appendix E) using textual descriptions and mathematical formulas, but it does not include a formally labeled 'Pseudocode' block or 'Algorithm' listing. |
| Open Source Code | Yes | Our code is publicly available, and every step of our work is fully reproducible (Appendix A). ... Graph2Tac: Code to train and evaluate the GNN. GitHub: https://github.com/IBM/graph2tac |
| Open Datasets | Yes | In this paper, we use a freely available dataset of Coq Data (Blaauwbroek, 2023). Information on the construction of the dataset can be found in Blaauwbroek (2024) and Blaauwbroek et al. (2024). |
| Dataset Splits | No | The paper states 'We divide packages into training and test where no test package depends on a training package. To do so, we induced a random topological order on the Coq packages, with regard to the dependency graph. The resulting list was then split such that the average percentage of theorems and of proof states in the training split is close to 90% (in our case, it is 91.3% of all theorems and 88.8% of all proof states).' It also mentions monitoring 'validation accuracy' in Appendix F, but it does not specify a precise percentage or count for a validation split. (A code sketch of this split procedure is given after the table.) |
| Hardware Specification | Yes | Graph2Tac graph models were trained on 8 CPUs (with 128 GB RAM) and 2 NVIDIA A100 GPU cards for about 20 days. ... The GPU evaluations and the training of the transformer were performed on machines with two 20-core Intel E5-2698 v4 CPUs, a working memory of 512 GB RAM, and 8 NVIDIA Tesla V100 GPU cards. |
| Software Dependencies | No | The paper mentions using 'ragged tensors in TensorFlow (Abadi et al., 2015)' and 'graph tensors in TF-GNN (Ferludin et al., 2022)'. It also states 'The transformer was a GPT2-style model (Radford et al., 2019) from the Huggingface Transformers library.' However, it does not provide specific version numbers for TensorFlow, TF-GNN, or the Huggingface Transformers library. |
| Experiment Setup | Yes | We train on batches of 512 definitions and 512 proof states. ... Optimization used the Adam optimizer with a learning rate of 0.0003 and L2 regularization set to 1.0e-05. (A sketch of this optimizer configuration also follows the table.) |
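
The Dataset Splits row describes the package-level split only in prose. The Python sketch below shows one plausible implementation: a randomised topological sort of the Coq package dependency graph, cut so that the training prefix holds roughly 90% of theorems. The function name, argument names, and the exact direction of the dependency constraint are assumptions for illustration, not taken from the paper's released code.

```python
import random
from collections import defaultdict

def split_packages(packages, deps, theorem_counts, train_fraction=0.90, seed=0):
    """Sketch of a package-level train/test split along a random topological order.

    `packages`: list of package names; `deps[p]`: set of packages p depends on;
    `theorem_counts[p]`: number of theorems in p. All names are hypothetical.
    """
    rng = random.Random(seed)

    # Build in-degrees for a randomised Kahn's algorithm over the dependency graph.
    indegree = {p: 0 for p in packages}
    dependents = defaultdict(list)
    for p in packages:
        for d in deps.get(p, ()):
            dependents[d].append(p)
            indegree[p] += 1

    # Random topological order: dependencies always precede dependents.
    ready = [p for p in packages if indegree[p] == 0]
    order = []
    while ready:
        p = ready.pop(rng.randrange(len(ready)))
        order.append(p)
        for q in dependents[p]:
            indegree[q] -= 1
            if indegree[q] == 0:
                ready.append(q)

    # Cut the order so the prefix holds roughly `train_fraction` of all theorems.
    # The prefix is closed under dependencies, so train and test packages are
    # cleanly separated along the dependency graph (the exact direction of the
    # paper's constraint should be checked against the original text).
    total = sum(theorem_counts[p] for p in packages)
    train, running = [], 0
    for i, p in enumerate(order):
        running += theorem_counts[p]
        train.append(p)
        if running / total >= train_fraction:
            return train, order[i + 1:]
    return train, []
```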
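
The Experiment Setup row quotes the training hyperparameters. Below is a minimal TensorFlow sketch of that configuration, assuming a Keras-style Adam optimizer and an L2 kernel regularizer; the paper does not specify how the L2 regularization is applied, and the layer and dummy datasets shown are purely illustrative.

```python
import tensorflow as tf

# Hyperparameters quoted in the Experiment Setup row.
LEARNING_RATE = 3e-4
L2_REG = 1e-5
DEFINITIONS_PER_BATCH = 512
PROOF_STATES_PER_BATCH = 512

# Adam optimizer as described; applying L2 as a kernel regularizer (rather than
# decoupled weight decay) is an assumption.
optimizer = tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE)
regularizer = tf.keras.regularizers.L2(L2_REG)

# Illustrative layer showing where the regularizer would attach in a Keras model.
dense = tf.keras.layers.Dense(256, activation="relu", kernel_regularizer=regularizer)

# Dummy datasets standing in for the real definition / proof-state streams,
# batched at the sizes quoted in the paper (the real pipeline uses ragged and
# graph tensors from TF-GNN).
definition_ds = tf.data.Dataset.range(10_000).batch(DEFINITIONS_PER_BATCH)
proof_state_ds = tf.data.Dataset.range(10_000).batch(PROOF_STATES_PER_BATCH)
```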