Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Xeggora: Exploiting Immune-to-Evidence Symmetries with Full Aggregation in Statistical Relational Models

Authors: Mohammad Mahdi Amirian, Saeed Shiry Ghidary

JAIR 2019 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	This section is composed of two main evaluation reports. In order to design efficient heuristics for FCA as described earlier, the ILP resolution phase was empirically analyzed and evaluated on the artificial data. Furthermore, the whole algorithm was experimented on real benchmark MLNs.
Researcher Affiliation	Academia	Mohammad Mahdi Amirian EMAIL Computer Engineering & Information Technology Department Amirkabir University of Technology, Tehran, Iran Saeed Shiry Ghidary EMAIL Math & Computer Science Department Amirkabir University of Technology, Tehran, Iran
Pseudocode	Yes	Algorithm CHOOSEFORAGGREGATION Input ℱ: a first-order disjunctive clause Input 𝒢: a SQL table, containing all essential ground clauses of ℱ Output: appropriate clustering scheme as the target for aggregation 1: L ← 𝐿𝑖𝑡𝑒𝑟𝑎𝑙𝑠_𝑜𝑓(ℱ) 2: 𝒱 ← 𝑉𝑎𝑟𝑖𝑎𝑏𝑙𝑒𝑠_𝑜𝑓(L) 3: if |L| = 1 4: candidate_sets ← { } 5: else 6: candidate_sets ← {} 7: foreach non-empty 𝑉 ⊆ 𝒱 8: identical_literals ← {ℓ𝑖 ∈ 𝐿 | 𝑉𝑎𝑟𝑖𝑎𝑏𝑙𝑒𝑠_𝑜𝑓(ℓ𝑖) 𝑉} 9: if identical_literals ∉ candidate_sets and identical_literals ≠ L 10: add identical_literals to candidate_sets 11: if candidate_sets contains any sets of literals with the size of |L| − 1 12: return best of them greedily to be further first-order aggregated 13: else 14: query ← "SELECT " 15: foreach identical_part ∈ candidate_sets 16: query ← query + "COUNT (DISTINCT " + 𝑆𝑒𝑟𝑖𝑎𝑙𝑖𝑧𝑒(identical_part) + ") as " + 𝐶𝑎𝑝𝑡𝑖𝑜𝑛(identical_part) [+ ", "] // except for the last loop 17: query ← query + " FROM " + 𝒢 18: execute query into #clusters 19: best_candidate_sets ← {argmin_#clusters candidate_sets} 20: return argmax_cardinality best_candidate_sets
Open Source Code	Yes	We release the code as an open source project for further investigation. The source code is available at https://github.com/amirian/xeggora.
Open Datasets	Yes	RC was built for the classification problem on the CORA (McCallum, Nigam, Rennie, & Seymore, 2000) dataset. LP performs prediction of the relations holding between UW-CSE students, faculty, and staff (Richardson & Domingos, 2006). IE (Poon & Domingos, 2007) extracts database records from parsed sources. PR contains information on the yeast protein location, function, class, phenotype, and enzymes, from the MIPS (Munich Information center for Protein Sequence) Comprehensive Yeast Genome Database, as of February 2005 (Mewes et al., 2000). ER is used to find records corresponding to the same real-world entity (Singla & Domingos, 2006).
Dataset Splits	No	The paper mentions several benchmark datasets (CORA, UW-CSE, MIPS, EKAW, etc.) and their characteristics in Table 1, such as '# evidence atoms' and '# clauses'. However, it does not explicitly provide details about how these datasets were partitioned into training, validation, or test sets (e.g., specific percentages, absolute counts, or citations to predefined splits for reproducibility).
Hardware Specification	Yes	All experiments were performed on a PC with 8 GB RAM and 4 cores with 2.1 GHz.
Software Dependencies	Yes	In both evaluations, Gurobi version 8 was employed as the ILP solver to find an exact or approximate solution based on a gap parameter (bound of relative error).
Experiment Setup	Yes	The gap was set to 10^-6 to reach the exact solution in the experiments where it is reachable. For each benchmark with intractable exact inference, we tried various gap ranges to find the best approximation in admissible time. All experiments were performed on a PC with 8 GB RAM and 4 cores with 2.1 GHz. In addition to the solver's gap bound parameter, we set its time limit to stop optimization if the gap is not reached in 10 minutes.
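
Steps 14 to 20 of the quoted CHOOSEFORAGGREGATION pseudocode build one SQL query with a COUNT(DISTINCT ...) per candidate set, then keep the candidate with the fewest clusters, breaking ties by cardinality. The following is a minimal sketch of that selection step only, not the authors' implementation. It assumes SQLite as the database, a hypothetical table `ground_clauses` with one text column per literal, string concatenation as the serialization, and "cardinality" read as the number of literals in a candidate set.

```python
import sqlite3


def choose_by_cluster_count(conn, table, candidate_sets):
    """Pick the candidate with the fewest distinct clusters (ties: most literals).

    candidate_sets is a list of tuples of column names (hypothetical schema:
    one column per literal of the clause).
    """
    parts = []
    for cand in candidate_sets:
        # Assumed serialization: join the columns with a separator so that
        # each combination of values is counted as a single distinct value.
        serialized = " || '|' || ".join(f'"{col}"' for col in cand)
        parts.append(f"COUNT(DISTINCT {serialized})")
    query = f'SELECT {", ".join(parts)} FROM "{table}"'

    # One row comes back, holding one cluster count per candidate set.
    cluster_counts = conn.execute(query).fetchone()
    counts = dict(zip(candidate_sets, cluster_counts))

    fewest = min(cluster_counts)
    best = [cand for cand in candidate_sets if counts[cand] == fewest]
    # Tie-break: prefer the candidate covering the most literals.
    return max(best, key=len), counts


# Toy data, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ground_clauses (lit_a TEXT, lit_b TEXT, lit_c TEXT)")
conn.executemany(
    "INSERT INTO ground_clauses VALUES (?, ?, ?)",
    [
        ("x1", "y1", "z1"),
        ("x1", "y1", "z2"),
        ("x2", "y2", "z3"),
        ("x2", "y2", "z4"),
    ],
)

candidates = [("lit_a",), ("lit_c",), ("lit_a", "lit_b")]
chosen, counts = choose_by_cluster_count(conn, "ground_clauses", candidates)
print(chosen, counts)
```

In this toy table, `("lit_a",)` and `("lit_a", "lit_b")` both yield two clusters, so the tie-break selects the larger set. Column and table names are interpolated directly into the query, so this sketch is only suitable for trusted identifiers.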