reproducibilityindex.ai

Faster Smarter Proof by Induction in Isabelle/HOL

Authors: Yutaka Nagashima

IJCAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our evaluation based on 1,095 inductive problems from 22 source ﬁles shows that sem ind improves the accuracy of recommendation from 20.1% to 38.2% for the most promising candidates within 5.0 seconds of timeout compared to its predecessor while decreasing the median value of execution time from 2.79 seconds to 1.06 seconds.
Researcher Affiliation	Academia	Yutaka Nagashima Yale-NUS College, National University of Singapore University of Innsbruck Czech Institute of Informatics, Robotics, and Cybernetics, Czech Technical University in Prague yutaka@yale-nus.edu.sg
Pseudocode	No	The paper includes code examples (Program 1, 2, 3) and a flowchart, but no structured pseudocode or algorithm blocks.
Open Source Code	Yes	users only have to download the relevant Isabelle ﬁles from our public Git Hub repository1 and install sem ind using the standard Isabelle command. 1https://github.com/data61/PSL
Open Datasets	Yes	As our evaluation target, we use 22 Isabelle theory ﬁles with 1,095 applications of the induct tactic from the Archive of Formal Proofs (AFP) [Klein et al., 2004].
Dataset Splits	No	The paper states the total number of problems used for evaluation (1,095 inductive problems from 22 theory ﬁles) but does not specify any train/validation/test splits for these problems.
Hardware Specification	Yes	All evaluations are conducted on a Mac Book Pro (15-inch, 2019) with 2.6 GHz Intel Core i7 6-core memory 32 GB 2400 MHz DDR4.
Software Dependencies	No	The paper mentions Isabelle/HOL, Isabelle/jEdit, and Se LFi E, but does not provide specific version numbers for these or any other software dependencies used in the experiments.
Experiment Setup	Yes	sem ind applies 36 pre-deﬁned heuristics written in Se LFi E... sem ind applies 8 pre-deﬁned Se LFi E heuristics to judge the validity of generalisation... Each heuristic is implemented as an assertion on inductive problems and arguments of the induct tactic, and each assertion is tagged with a value... sem ind sums up such points from the 36 heuristics to compute the score for each sequence... selects the ﬁve most promising sequences for further processing... presents the 10 most promising sequences... within 5.0 seconds of timeout.