Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

On Learning Causal Models from Relational Data

Authors: Sanghack Lee, Vasant Honavar

AAAI 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experiments confirm that RCDℓ is substantially more efficient than RCD with respect to its space and time requirements (see Figure 4). RCDℓ takes 70 seconds on average learning an RCM given h = 4 while RCD takes 50 minutes.
Researcher Affiliation	Academia	Sanghack Lee and Vasant Honavar, Artificial Intelligence Research Laboratory, College of Information Sciences and Technology, The Pennsylvania State University, University Park, PA 16802 EMAIL
Pseudocode	Yes	Algorithm 1 RCDℓ: RCD-Light
Open Source Code	No	The paper does not provide an explicit statement about releasing source code or a link to a code repository for the methodology described.
Open Datasets	No	The paper states, 'We generated schemas with 3 entity and 3 binary relationship classes with 2 and 1 attribute classes per entity and relationship class, respectively, with random cardinality. Given the schema, we generated an RCM with 10 dependencies of length up to h and maximum degree of 3.' This indicates synthetic data generation, but no public access information or citation for a dataset is provided.
Dataset Splits	No	The paper describes the generation of synthetic models but does not specify any training, validation, or test dataset splits.
Hardware Specification	No	The paper does not provide any specific details about the hardware used to run the experiments.
Software Dependencies	No	The paper mentions 'RCDℓ (built on RCD codebase)' but does not specify any software names with version numbers.
Experiment Setup	Yes	We generated schemas with 3 entity and 3 binary relationship classes with 2 and 1 attribute classes per entity and relationship class, respectively, with random cardinality. Given the schema, we generated an RCM with 10 dependencies of length up to h and maximum degree of 3. We followed settings in Maier et al. (2013): (i) RCD uses AGGs whose hop length is limited to 2h for practical reasons; and (ii) AGGs with 2h is adopted as a CI oracle.
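
The schema-generation step quoted in the Experiment Setup row can be illustrated with a minimal sketch. This is not the authors' code: the names `Schema` and `generate_schema`, the naming scheme for classes and attributes, the choice of two distinct entity classes per relationship, and the "one"/"many" encoding of cardinality are all assumptions made for illustration. The sketch covers only the schema; it does not generate the RCM dependencies of length up to h.

```python
import random
from dataclasses import dataclass


@dataclass
class Schema:
    """Hypothetical container for a relational schema (illustration only)."""
    entities: list        # entity class names
    relationships: dict   # relationship name -> (entity, entity)
    attributes: dict      # item class name -> list of attribute class names
    cardinality: dict     # (relationship, entity) -> "one" or "many"


def generate_schema(n_entities=3, n_relationships=3,
                    attrs_per_entity=2, attrs_per_relationship=1,
                    seed=None):
    """Draw a random schema with the class counts quoted in the table.

    Assumption: each binary relationship links two distinct entity
    classes chosen uniformly at random, and each participation is
    independently "one" or "many".
    """
    rng = random.Random(seed)
    entities = [f"E{i + 1}" for i in range(n_entities)]
    relationships, attributes, cardinality = {}, {}, {}

    # Attribute classes attached to each entity class.
    for e in entities:
        attributes[e] = [f"{e}_a{j + 1}" for j in range(attrs_per_entity)]

    for i in range(n_relationships):
        r = f"R{i + 1}"
        pair = tuple(rng.sample(entities, 2))  # binary: two entity classes
        relationships[r] = pair
        attributes[r] = [f"{r}_a{j + 1}" for j in range(attrs_per_relationship)]
        for e in pair:
            cardinality[(r, e)] = rng.choice(["one", "many"])

    return Schema(entities, relationships, attributes, cardinality)


schema = generate_schema(seed=0)
n_attribute_classes = sum(len(a) for a in schema.attributes.values())
```

With the quoted settings, the schema has 3 × 2 + 3 × 1 = 9 attribute classes in total. Passing a fixed `seed` makes the draw repeatable, which matters because the table notes that no dataset or code was released.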