Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..

Automated Geometry Theorem Proving for Human-Readable Proofs

Authors: Ke Wang, Zhendong Su

IJCAI 2015 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We have realized our approach as a new automated geometry theorem prover, i Geo Tutor and extensively evaluated it. Our problem corpus has 77 problems in total, including ones cited in prior work and from Math competitions. Evaluation results show that i Geo Tutor is very effective it solves all but two problems in under two minutes each. We have also performed a test pilot with 12 high school students, and found i Geo Tutor is effective in helping student to learn geometry proof.
Researcher Affiliation Academia Ke Wang Zhendong Su Department of Computer Science University of California, Davis EMAIL
Pseudocode Yes Algorithm 1: Construction search via template matching; Algorithm 2: Template matching procedure
Open Source Code No The paper provides links to a video demo, lists of predicates and axioms, problems and their descriptions, and the two tests for the pilot study, all hosted on a university domain (http://www.cs.ucdavis.edu/ su/igeotutor.html). However, it does not explicitly state that the source code for i Geo Tutor itself is open-source or available at any of these links.
Open Datasets Yes Our test corpus contains 77 problems in total (please refer to http://www.cs.ucdavis.edu/ su/igeotutor.html for the full list of problems and their descriptions). The problems have been gathered from various sources, including 22 from the work on GRAMY [Matsuda and Vanlehn, 2004]6 for comparison. The rest of the problems are from an archive of classical geometry proof problems online [Liu, 2011] and two Chinese textbooks [Shen, 2006; Zhou, 2004], both of which are popular practice materials for the Chinese Mathematics Olympiad.
Dataset Splits No The paper mentions a "test corpus" of 77 problems and a pilot study with pre-tests and post-tests. However, it does not describe specific training, validation, or testing splits for the problems used to evaluate the i Geo Tutor system in terms of percentages or sample counts, which is typical for machine learning models. The system is a theorem prover, not a model trained on data splits.
Hardware Specification Yes i Geo Tutor runs on a workstation with a third generation Intel Core i7-3770 processor and 16GB RAM.
Software Dependencies No In particular, we employ an off-the-shelf state-of-the-art SMT solver Z3 [De Moura and Bjรธrner, 2011; Microsoft Research, ] to resolve the arithmetic constraints when needed. The paper mentions the software Z3 but does not provide a specific version number.
Experiment Setup No The paper describes the system's architecture and logic, including its construction search procedure (Algorithm 1) with a 'depth' parameter, and a 'gain' heuristic for ranking non-minimum sufficient sets. However, it does not specify concrete numerical hyperparameters typically found in machine learning experimental setups, such as learning rates, batch sizes, or optimizer settings, as it is a symbolic reasoning system.