Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Learning Hierarchical Relational Representations through Relational Convolutions

Authors: Awni Altabaa, John Lafferty

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we empirically evaluate the proposed relational convolutional network architecture (abbreviated RelConvNet) to assess its effectiveness at learning relational tasks. We compare this architecture to several existing relational architectures as well as general-purpose sequence models. [...] Figure 5 reports model performance on the two hold-out object sets after training. [...] Figure 8b shows the hold-out test accuracy for each model."
Researcher Affiliation | Academia | Awni Altabaa (EMAIL), Department of Statistics & Data Science, Yale University; John Lafferty (EMAIL), Department of Statistics & Data Science, Wu Tsai Institute, Yale University
Pseudocode | No | The paper describes the architecture and operations using mathematical formulas and descriptive text (e.g., "In this section, we formalize a relational convolution operation...", "Overall, relational convolution with group attention can be summarized as follows: 1) learn ng groupings of objects..."), but it does not present structured pseudocode or algorithm blocks.
Open Source Code | Yes | "The project repository can be found here: https://github.com/Awni00/Relational-Convolutions. It includes an implementation of the relational convolutional networks architecture, code and instructions for reproducing our experimental results, and links to experimental logs."
Open Datasets | Yes | The relational games dataset was contributed as a benchmark for relational reasoning by Shanahan et al. (2020); it consists of a family of binary classification tasks for identifying abstract relational rules between a set of objects represented as images.
Dataset Splits | Yes | For the relational games dataset (Section 4.1): "We hold out 1000 samples for validation (during training) and 5000 samples for testing (after training), and use the rest as the training set." For the Set card game task (Section 4.2): "We partition the sets into training (70%), validation (15%), and test (15%) sets."
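The quoted 70/15/15 partition for the Set task can be sketched in plain Python. This is an illustrative helper, not the authors' code; the function name, seed, and shuffling strategy are assumptions.

```python
import random

def split_dataset(samples, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle and partition samples into train/validation/test subsets.

    The 70/15/15 fractions follow the split quoted from the paper;
    the seed and helper name are illustrative assumptions.
    """
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remaining ~15%
    return train, val, test

train, val, test = split_dataset(list(range(1000)))
print(len(train), len(val), len(test))  # 700 150 150
```

Splitting once with a fixed seed keeps the validation and test sets disjoint from the training set across reruns.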
Hardware Specification | No | The paper does not state the hardware used for its experiments (e.g., GPU/CPU models or memory capacity).
Software Dependencies | No | The paper mentions the Adam optimizer and refers to "default Tensorflow hyperparameters" in Appendix B, but it does not specify version numbers for TensorFlow or any other software libraries used.
Experiment Setup | Yes | For Relational Games (Section 4.1): "We train for 50 epochs using the categorical cross-entropy loss and the Adam optimizer with learning rate 0.001, β1 = 0.9, β2 = 0.999, ϵ = 10⁻⁷, and batch size 512." For Set (Section 4.2): "We train for 100 epochs with the same loss, optimizer, and batch size as the experiments in the previous section. For each model, we run 10 trials with different random seeds." and "RelConvNet uses the Adam optimizer with the default Tensorflow hyperparameters (constant learning rate of 0.001, β1 = 0.9, β2 = 0.999)".
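The Adam hyperparameters quoted above (learning rate 0.001, β1 = 0.9, β2 = 0.999, ϵ = 10⁻⁷, which are TensorFlow's defaults) can be illustrated with a minimal single-parameter Adam update in plain Python. This is a sketch of the optimizer's update rule for context, not the authors' training code.

```python
import math

def adam_step(theta, grad, m, v, t,
              lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    """One Adam update for a single scalar parameter, using the
    hyperparameters quoted in the paper (TensorFlow's defaults)."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage: minimize f(theta) = theta^2 (gradient 2*theta).
theta, m, v = 5.0, 0.0, 0.0
for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)
# theta has moved from 5.0 toward the minimum at 0.
```

Note that Adam's effective step size is roughly the learning rate (0.001) per update once the moment estimates stabilize, which is why the constant learning rate quoted in the paper suffices without a schedule.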