Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..
Automated Geometry Theorem Proving for Human-Readable Proofs
Authors: Ke Wang, Zhendong Su
IJCAI 2015 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We have realized our approach as a new automated geometry theorem prover, i Geo Tutor and extensively evaluated it. Our problem corpus has 77 problems in total, including ones cited in prior work and from Math competitions. Evaluation results show that i Geo Tutor is very effective it solves all but two problems in under two minutes each. We have also performed a test pilot with 12 high school students, and found i Geo Tutor is effective in helping student to learn geometry proof. |
| Researcher Affiliation | Academia | Ke Wang Zhendong Su Department of Computer Science University of California, Davis EMAIL |
| Pseudocode | Yes | Algorithm 1: Construction search via template matching; Algorithm 2: Template matching procedure |
| Open Source Code | No | The paper provides links to a video demo, lists of predicates and axioms, problems and their descriptions, and the two tests for the pilot study, all hosted on a university domain (http://www.cs.ucdavis.edu/ su/igeotutor.html). However, it does not explicitly state that the source code for i Geo Tutor itself is open-source or available at any of these links. |
| Open Datasets | Yes | Our test corpus contains 77 problems in total (please refer to http://www.cs.ucdavis.edu/ su/igeotutor.html for the full list of problems and their descriptions). The problems have been gathered from various sources, including 22 from the work on GRAMY [Matsuda and Vanlehn, 2004]6 for comparison. The rest of the problems are from an archive of classical geometry proof problems online [Liu, 2011] and two Chinese textbooks [Shen, 2006; Zhou, 2004], both of which are popular practice materials for the Chinese Mathematics Olympiad. |
| Dataset Splits | No | The paper mentions a "test corpus" of 77 problems and a pilot study with pre-tests and post-tests. However, it does not describe specific training, validation, or testing splits for the problems used to evaluate the i Geo Tutor system in terms of percentages or sample counts, which is typical for machine learning models. The system is a theorem prover, not a model trained on data splits. |
| Hardware Specification | Yes | i Geo Tutor runs on a workstation with a third generation Intel Core i7-3770 processor and 16GB RAM. |
| Software Dependencies | No | In particular, we employ an off-the-shelf state-of-the-art SMT solver Z3 [De Moura and Bjรธrner, 2011; Microsoft Research, ] to resolve the arithmetic constraints when needed. The paper mentions the software Z3 but does not provide a specific version number. |
| Experiment Setup | No | The paper describes the system's architecture and logic, including its construction search procedure (Algorithm 1) with a 'depth' parameter, and a 'gain' heuristic for ranking non-minimum sufficient sets. However, it does not specify concrete numerical hyperparameters typically found in machine learning experimental setups, such as learning rates, batch sizes, or optimizer settings, as it is a symbolic reasoning system. |