Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Learning MDL Logic Programs from Noisy Data

Authors: Céline Hocquette, Andreas Niskanen, Matti Järvisalo, Andrew Cropper

AAAI 2024 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experiments on several domains, including drug design, game playing, and program synthesis, show that our approach can outperform existing approaches in terms of predictive accuracies and scale to moderate amounts of noise.
Researcher Affiliation	Academia	Céline Hocquette (1), Andreas Niskanen (2), Matti Järvisalo (2), Andrew Cropper (1); (1) University of Oxford, (2) University of Helsinki; EMAIL, EMAIL, EMAIL, EMAIL
Pseudocode	Yes	Algorithm 1: MAXSYNTH
    def maxsynth(bk, pos, neg):
        cons, promising, best_solution = {}, {}, {}
        size, max_mdl = 1, len(pos)
        while size ≤ max_mdl:
            h = generate(cons, size)
            if h == UNSAT:
                size += 1
                continue
            tp, fn, fp = test(pos, neg, bk, h)
            h_mdl = fn+fp+size(h)
            if h_mdl < max_mdl:
                best_solution = h
                max_mdl = h_mdl-1
            if tp>0 and not_rec(h) and not_pi(h):
                promising += h
                combi = combine(promising, max_mdl)
                if combi != UNSAT:
                    best_solution = combi
                    tp, fn, fp = test(pos, neg, bk, combi)
                    max_mdl = fn+fp+size(combi)-1
            cons += constrain(h, fn, fp)
        return best_solution
Open Source Code	Yes	The experimental code and data are available at https://github.com/celinehocquette/aaai24-maxsynth.
Open Datasets	Yes	IGGP. The goal of inductive general game playing (Cropper, Evans, and Law 2020) (IGGP) is to induce rules to explain game traces from the general game playing competition (Genesereth and Björnsson 2013). Program synthesis. We use a program synthesis dataset (Cropper and Morel 2021). Zendo. Zendo is an inductive game where the goal is to find a rule by building structures of pieces. The game interests cognitive scientists (Bramley et al. 2018). Alzheimer. These real-world tasks (King, Sternberg, and Srinivasan 1995) involve learning rules describing four properties desirable for drug design against Alzheimer's disease. Wn18RR. Wn18rr (Bordes et al. 2013) is a real-world knowledge base with 11 relations from WordNet.
Dataset Splits	No	The paper mentions evaluating on "training examples" and "unseen test data" and adding noise to "training examples," but it does not specify explicit dataset split ratios or methodologies (e.g., 80/20 split, random seed, k-fold cross-validation).
Hardware Specification	Yes	We use an 8-Core 3.2 GHz Apple M1 and a single CPU.
Software Dependencies	No	The paper states "MAXSYNTH uses the UWrMaxSat solver (Piotrów 2020) in the combine stage" and "POPPER uses Clingo (Gebser et al. 2014)," but it does not provide explicit version numbers for these software dependencies or any other ancillary software.
Experiment Setup	No	The paper states "We measure predictive accuracy... and learning time given a maximum learning time of 20 minutes," but it does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed model training configurations.
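
The Pseudocode row scores each hypothesis as fn + fp + size(h), its false negatives plus false positives plus program size. The following is a minimal Python sketch of that cost only, not the authors' implementation. The Hypothesis class, the example ids, and the three candidate hypotheses are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    # Hypothetical stand-in for a logic program: a size and the set of
    # example ids it entails. A real system would derive both from the program.
    name: str
    size: int
    covered: frozenset


def mdl_cost(h, pos, neg):
    """Cost as in the pseudocode row: fn + fp + size(h)."""
    tp = len(h.covered & pos)   # positive examples covered
    fn = len(pos) - tp          # positive examples missed
    fp = len(h.covered & neg)   # negative examples wrongly covered
    return fn + fp + h.size


# Illustrative data (assumed): four positive and two negative example ids.
pos = frozenset({1, 2, 3, 4})
neg = frozenset({5, 6})

candidates = [
    Hypothesis("general", 2, frozenset({1, 2, 3, 4, 5})),
    Hypothesis("specific", 6, frozenset({1, 2, 3, 4})),
    Hypothesis("empty", 0, frozenset()),
]

# Pick the candidate with the lowest cost.
best = min(candidates, key=lambda h: mdl_cost(h, pos, neg))
print(best.name, mdl_cost(best, pos, neg))
```

In this toy setting the small hypothesis that misclassifies one negative example costs less than the larger one that fits every example, which is the trade-off the cost is meant to capture on noisy data.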