Learning Aggregation Functions
Authors: Giovanni Pellegrini, Alessandro Tibo, Paolo Frasconi, Andrea Passerini, Manfred Jaeger
IJCAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present and discuss experimental results showing the potential of the LAF framework on both synthetic and real-world tasks. Synthetic experiments are aimed at showing the ability of LAF to learn a wide range of aggregators and its ability to generalize over set sizes (i.e., test sets whose cardinality exceeds that of the training sets), something that alternative architectures based on predefined aggregators fail to achieve. We use Deep Sets, PNA, and LSTM as representatives of these architectures. The LSTM architecture corresponds to a version of Deep Sets where the aggregation function is replaced by an LSTM layer. Experiments on diverse tasks including point cloud classification, text concept set retrieval and graph property prediction are aimed at showing the potential of the framework on real-world applications. (A sketch of a Deep Sets style model with a pluggable LAF aggregator follows the table.) |
| Researcher Affiliation | Academia | ¹DISI, University of Trento; ²Computer Science Department, Aalborg University; ³DINFO, Università di Firenze. {giovanni.pellegrini, andrea.passerini}@unitn.it, {alessandro, jaeger}@cs.aau.dk, paolo.frasconi@pm.me |
| Pseudocode | No | No pseudocode or algorithm blocks are present in the paper. |
| Open Source Code | Yes | See https://github.com/alessandro-t/laf for supplementary material and code. |
| Open Datasets | Yes | We performed an additional set of experiments aiming to demonstrate the ability of LAF to learn from more complex representations of the data by plugging it into end-to-end differentiable architectures. In these experiments, we thus replaced numbers by visual representations obtained from MNIST digits. [...] In order to evaluate LAF on real-world datasets, we consider point cloud classification, a prototype task for set-wise prediction. Therefore, we run experimental comparisons on the ModelNet40 [Wu et al., 2015] dataset, which consists of 9,843 training and 2,468 test point clouds of objects distributed over 40 classes. |
| Dataset Splits | Yes | Each synthetic dataset is composed of 100,000 sets for training, 20,000 sets for validation and 100,000 for testing. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments are mentioned in the paper. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions) are mentioned in the paper. |
| Experiment Setup | Yes | The number of aggregation units is set as follows. The model contains nine LAF (Equation 2) units, whose parameters {a_k, ..., h_k}, k = 1, ..., 9 are initialized by uniform sampling in [0, 1], as those parameters must be positive, whereas the coefficients {α, ..., δ} are initialized from a Gaussian distribution with zero mean and standard deviation 0.01 so as to also cover negative values. The positivity constraint on the parameters {a, b, ..., h} is enforced by projection during the optimization process. [...] We use the Mean Absolute Error (MAE) as the loss function to calculate the prediction error. [...] For all the settings, we consider the same architecture and hyper-parameters as the Deep Sets permutation-invariant model described by [Zaheer et al., 2017]. (A hedged code sketch of this setup appears below the table.) |
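
The Experiment Setup row references Equation 2 of the paper, which defines each LAF unit as a learnable ratio of generalized power sums. The PyTorch sketch below shows a bank of such units with the initialization and projection described in that row. It assumes the ratio form LAF(x) = (α·L_{a,b}(x) + β·L_{c,d}(1−x)) / (γ·L_{e,f}(x) + δ·L_{g,h}(1−x)) with L_{a,b}(x) = (Σ_i x_i^b)^a and inputs in (0, 1); the class name, the `project` helper, and the stability constants are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class LAF(nn.Module):
    """Bank of K LAF units (sketch of Equation 2).

    Each unit computes
        (alpha * L_{a,b}(x) + beta * L_{c,d}(1 - x)) /
        (gamma * L_{e,f}(x) + delta * L_{g,h}(1 - x)),
    where L_{a,b}(x) = (sum_i x_i^b)^a and each x_i lies in (0, 1).
    """

    def __init__(self, num_units=9):
        super().__init__()
        # Exponents {a_k, ..., h_k}: uniform init in [0, 1]; must stay positive.
        self.exponents = nn.Parameter(torch.rand(num_units, 8))
        # Coefficients {alpha_k, ..., delta_k}: zero-mean Gaussian, std 0.01,
        # so they can also take negative values.
        self.coeffs = nn.Parameter(0.01 * torch.randn(num_units, 4))

    def forward(self, x):
        # x: (batch, set_size) with entries in (0, 1) -> (batch, num_units)
        x = x.clamp(1e-6, 1 - 1e-6)          # keep fractional powers defined
        a, b, c, d, e, f, g, h = self.exponents.unbind(dim=1)
        alpha, beta, gamma, delta = self.coeffs.unbind(dim=1)

        def L(z, outer, inner):
            # L_{outer,inner}(z) = (sum_i z_i^inner)^outer, one value per unit
            return z.unsqueeze(-1).pow(inner).sum(dim=1).pow(outer)

        num = alpha * L(x, a, b) + beta * L(1 - x, c, d)
        den = gamma * L(x, e, f) + delta * L(1 - x, g, h)
        return num / (den + 1e-6)            # small epsilon for stability

    @torch.no_grad()
    def project(self):
        # Enforce positivity of {a, ..., h} by projection, as described
        # in the Experiment Setup row.
        self.exponents.clamp_(min=1e-3)
```

A training loop would call `laf.project()` right after each `optimizer.step()` to realize the projected-gradient scheme the paper mentions.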
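
To connect this with the Deep Sets comparison in the Research Type row, the sketch below drops the LAF bank into a Deep Sets style φ/ρ architecture in place of the fixed sum aggregator. The layer sizes and the sigmoid used to squash element embeddings into (0, 1) are assumptions for illustration; the paper states that it reuses the architecture and hyper-parameters of the Deep Sets permutation-invariant model of [Zaheer et al., 2017].

```python
import torch
import torch.nn as nn
# Assumes the LAF class from the sketch above.

class DeepSetsLAF(nn.Module):
    """Deep Sets style model with a learnable LAF aggregator (sketch)."""

    def __init__(self, num_units=9, hidden=100):
        super().__init__()
        # phi: per-element encoder; the final sigmoid maps embeddings into
        # (0, 1), the domain LAF expects (hidden size is an assumption).
        self.phi = nn.Sequential(
            nn.Linear(1, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )
        self.agg = LAF(num_units)            # learnable aggregation
        # rho: set-level predictor on the aggregated features.
        self.rho = nn.Sequential(
            nn.Linear(num_units, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):
        # x: (batch, set_size) sets of scalars
        z = self.phi(x.unsqueeze(-1)).squeeze(-1)   # (batch, set_size)
        return self.rho(self.agg(z))                # (batch, 1)

# Example: a batch of 32 sets of 10 scalars, trained with MAE as in the paper.
model = DeepSetsLAF()
x, y = torch.rand(32, 10), torch.rand(32, 1)
loss = nn.L1Loss()(model(x), y)                     # MAE loss
loss.backward()
model.agg.project()                                 # keep exponents positive
```

Because the aggregator is itself learned, the same architecture can recover sum-, mean-, or max-like behavior, which is what the synthetic experiments on set-size generalization probe.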