Learning Aggregation Functions

Authors: Giovanni Pellegrini, Alessandro Tibo, Paolo Frasconi, Andrea Passerini, Manfred Jaeger

IJCAI 2021

Reproducibility Variable | Result | LLM Response
Research Type: Experimental. "In this section, we present and discuss experimental results showing the potential of the LAF framework on both synthetic and real-world tasks. Synthetic experiments are aimed at showing the ability of LAF to learn a wide range of aggregators and to generalize over set sizes (i.e., having test-set sets whose cardinality exceeds that of the training-set sets), something that alternative architectures based on predefined aggregators fail to achieve. We use Deep Sets, PNA, and LSTM as representatives of these architectures. The LSTM architecture corresponds to a version of Deep Sets where the aggregation function is replaced by an LSTM layer. Experiments on diverse tasks, including point cloud classification, text concept set retrieval, and graph property prediction, are aimed at showing the potential of the framework on real-world applications."
Researcher Affiliation: Academia. "1DISI, University of Trento; 2Computer Science Department, Aalborg University; 3DINFO, Università di Firenze. {giovanni.pellegrini, andrea.passerini}@unitn.it, {alessandro, jaeger}@cs.aau.dk, paolo.frasconi@pm.me"
Pseudocode: No. "No pseudocode or algorithm blocks are present in the paper."
Open Source Code: Yes. "See https://github.com/alessandro-t/laf for supplementary material and code."
Open Datasets: Yes. "We performed an additional set of experiments aiming to demonstrate the ability of LAF to learn from more complex representations of the data by plugging it into end-to-end differentiable architectures. In these experiments, we thus replaced numbers by visual representations obtained from MNIST digits. [...] In order to evaluate LAF on a real-world dataset, we consider point cloud classification, a prototype task for set-wise prediction. Therefore, we run experimental comparisons on the ModelNet40 [Wu et al., 2015] dataset, which consists of 9,843 training and 2,468 test point clouds of objects distributed over 40 classes."
Dataset Splits: Yes. "Each synthetic dataset is composed of 100,000 sets for training, 20,000 sets for validation, and 100,000 for testing."
Hardware Specification: No. "No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments are mentioned in the paper."
Software Dependencies: No. "No specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions) are mentioned in the paper."
Experiment Setup: Yes. "The number of aggregation units is set as follows. The model contains nine LAF units (Equation 2), whose parameters {a_k, ..., h_k}, k = 1, ..., 9, are initialized by uniform sampling in [0, 1], as those parameters must be positive, whereas the coefficients {α, ..., δ} are initialized from a Gaussian distribution with zero mean and standard deviation 0.01 so as to also cover negative values. The positivity constraint on the parameters {a, b, ..., h} is enforced by projection during the optimization process. [...] We use the Mean Absolute Error (MAE) as the loss function to compute the prediction error. [...] For all settings, we use the same architecture and hyperparameters as the Deep Sets permutation-invariant model described by [Zaheer et al., 2017]."
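The initialization and projection scheme quoted above can be sketched in NumPy. This is a minimal illustration, not the authors' implementation: it assumes the LAF unit of the paper's Equation 2 takes the form of a ratio of two linear combinations of power sums, LAF(x) = (α·L(a,b) + β·L(c,d)) / (γ·L(e,f) + δ·L(g,h)) with L(p,q)(x) = (Σ_i x_i^q)^p, and all class and variable names are illustrative.

```python
import numpy as np

class LAFUnit:
    """Sketch of one LAF aggregation unit (assumed form of the paper's
    Equation 2): a ratio of two linear combinations of power sums."""

    def __init__(self, rng):
        # Exponents a..h must be positive: uniform init in [0, 1],
        # as described in the experiment setup.
        self.exps = rng.uniform(0.0, 1.0, size=8)    # a, b, c, d, e, f, g, h
        # Coefficients alpha..delta may be negative:
        # Gaussian init with zero mean and std 0.01.
        self.coeffs = rng.normal(0.0, 0.01, size=4)  # alpha, beta, gamma, delta

    def project(self, eps=1e-3):
        # Positivity constraint enforced by projection: after each
        # optimizer step, clip the exponents back to the positive orthant.
        self.exps = np.clip(self.exps, eps, None)

    def __call__(self, x, eps=1e-8):
        # Aggregate a set of positive scalars x into a single value.
        a, b, c, d, e, f, g, h = self.exps
        alpha, beta, gamma, delta = self.coeffs
        L = lambda p, q: np.sum(x ** q) ** p  # power-sum building block
        num = alpha * L(a, b) + beta * L(c, d)
        den = gamma * L(e, f) + delta * L(g, h)
        return num / (den + eps)  # eps guards against a zero denominator
```

In a training loop, `project()` would be called after every gradient update, mirroring the projection step described in the setup; the paper's model stacks nine such units side by side.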