Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
On Learning Causal Models from Relational Data
Authors: Sanghack Lee, Vasant Honavar
AAAI 2016 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments confirm that RCDℓ is substantially more efficient than RCD with respect to its space and time requirements (see Figure 4). RCDℓ takes 70 seconds on average learning an RCM given h = 4 while RCD takes 50 minutes. |
| Researcher Affiliation | Academia | Sanghack Lee and Vasant Honavar Artificial Intelligence Research Laboratory College of Information Sciences and Technology The Pennsylvania State University University Park, PA 16802 EMAIL |
| Pseudocode | Yes | Algorithm 1 RCDℓ: RCD-Light |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository for the methodology described. |
| Open Datasets | No | The paper states, 'We generated schemas with 3 entity and 3 binary relationship classes with 2 and 1 attribute classes per entity and relationship class, respectively, with random cardinality. Given the schema, we generated an RCM with 10 dependencies of length up to h and maximum degree of 3.' This indicates synthetic data generation, but no public access information or citation for a dataset is provided. |
| Dataset Splits | No | The paper describes the generation of synthetic models but does not specify any training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments. |
| Software Dependencies | No | The paper mentions 'RCDℓ (built on RCD codebase)' but does not specify any software names with version numbers. |
| Experiment Setup | Yes | We generated schemas with 3 entity and 3 binary relationship classes with 2 and 1 attribute classes per entity and relationship class, respectively, with random cardinality. Given the schema, we generated an RCM with 10 dependencies of length up to h and maximum degree of 3. We followed settings in Maier et al. (2013): (i) RCD uses AGGs whose hop length is limited to 2h for practical reasons; and (ii) AGGs with 2h is adopted as a CI oracle. |
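
The schema-generation step quoted under Experiment Setup can be illustrated with a minimal sketch. This is not the authors' generator: the function name `generate_schema`, the dictionary layout, the class names (`E1`, `R1`, `X1`), and the reading of "random cardinality" as an independent one/many choice per participating entity are all assumptions made for illustration. The sketch covers only the schema (3 entity classes, 3 binary relationship classes, 2 and 1 attribute classes respectively); it does not generate the RCM dependencies.

```python
import random
from itertools import combinations


def generate_schema(n_entities=3, n_relationships=3,
                    attrs_per_entity=2, attrs_per_relationship=1, seed=None):
    """Build a random relational schema as a plain dict (hypothetical layout)."""
    rng = random.Random(seed)
    entities = [f"E{i}" for i in range(1, n_entities + 1)]
    # All unordered pairs of distinct entity classes a binary relationship may connect.
    pairs = list(combinations(entities, 2))

    attr_id = 0

    def new_attrs(count):
        # Hand out globally unique attribute class names X1, X2, ...
        nonlocal attr_id
        names = [f"X{attr_id + k + 1}" for k in range(count)]
        attr_id += count
        return names

    schema = {"entities": {}, "relationships": {}}
    for e in entities:
        schema["entities"][e] = new_attrs(attrs_per_entity)

    for j in range(1, n_relationships + 1):
        left, right = rng.choice(pairs)
        schema["relationships"][f"R{j}"] = {
            "participants": (left, right),
            # Assumption: "random cardinality" means one/many drawn per participant.
            "cardinality": {left: rng.choice(["one", "many"]),
                            right: rng.choice(["one", "many"])},
            "attributes": new_attrs(attrs_per_relationship),
        }
    return schema


if __name__ == "__main__":
    schema = generate_schema(seed=0)
    for name, rel in schema["relationships"].items():
        print(name, rel["participants"], rel["cardinality"], rel["attributes"])
```

Passing a fixed `seed` makes a generated schema repeatable, which is one way a synthetic setup like this could be shared without publishing the data itself.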