Learning Aggregation Functions
Authors: Giovanni Pellegrini, Alessandro Tibo, Paolo Frasconi, Andrea Passerini, Manfred Jaeger
IJCAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present and discuss experimental results showing the potential of the LAF framework on both synthetic and real-world tasks. Synthetic experiments are aimed at showing the ability of LAF to learn a wide range of aggregators and its ability to generalize over set sizes (i.e., test sets whose cardinality exceeds that of the training sets), something that alternative architectures based on predefined aggregators fail to achieve. We use Deep Sets, PNA, and LSTM as representatives of these architectures. The LSTM architecture corresponds to a version of Deep Sets where the aggregation function is replaced by an LSTM layer. Experiments on diverse tasks including point cloud classification, text concept set retrieval and graph property prediction are aimed at showing the potential of the framework on real-world applications. (A sketch of a Deep Sets style model with a pluggable LAF aggregator follows the table.) |
| Researcher Affiliation | Academia | ¹DISI, University of Trento; ²Computer Science Department, Aalborg University; ³DINFO, Università di Firenze. {giovanni.pellegrini, andrea.passerini}@unitn.it, {alessandro, jaeger}@cs.aau.dk, paolo.frasconi@pm.me |
| Pseudocode | No | No pseudocode or algorithm blocks are present in the paper. |
| Open Source Code | Yes | See https://github.com/alessandro-t/laf for supplementary material and code. |
| Open Datasets | Yes | We performed an additional set of experiments aiming to demonstrate the ability of LAF to learn from more complex representations of the data by plugging it into end-to-end differentiable architectures. In these experiments, we thus replaced numbers by visual representations obtained from MNIST digits. [...] In order to evaluate LAF on real-world datasets, we consider point cloud classification, a prototype task for set-wise prediction. Therefore, we run experimental comparisons on the ModelNet40 [Wu et al., 2015] dataset, which consists of 9,843 training and 2,468 test point clouds of objects distributed over 40 classes. |
| Dataset Splits | Yes | Each synthetic dataset is composed of 100,000 sets for training, 20,000 sets for validation and 100,000 for testing. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments are mentioned in the paper. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions) are mentioned in the paper. |
| Experiment Setup | Yes | The number of aggregation units is set as follows. The model contains nine LAF (Equation 2) units, whose parameters {a_k, ..., h_k}, k = 1, ..., 9 are initialized by uniform sampling in [0, 1], as those parameters must be positive, whereas the coefficients {α, ..., δ} are initialized from a Gaussian distribution with zero mean and standard deviation 0.01 so as to also cover negative values. The positivity constraint on the parameters {a, b, ..., h} is enforced by projection during the optimization process. [...] We use the Mean Absolute Error (MAE) as the loss function to calculate the prediction error. [...] For all the settings, we consider the same architecture and hyper-parameters as the Deep Sets permutation-invariant model described by [Zaheer et al., 2017]. (A hedged code sketch of this setup appears below the table.) |
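
The Experiment Setup row references Equation 2 of the paper, which defines each LAF unit as a learnable ratio of generalized power sums. The PyTorch sketch below shows a bank of such units with the initialization and projection described in that row. It assumes the ratio form LAF(x) = (α·L_{a,b}(x) + β·L_{c,d}(1−x)) / (γ·L_{e,f}(x) + δ·L_{g,h}(1−x)) with L_{a,b}(x) = (Σ_i x_i^b)^a and inputs in (0, 1); the class name, the `project` helper, and the stability constants are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class LAF(nn.Module):
    """Bank of K LAF units (sketch of Equation 2).

    Each unit computes
        (alpha * L_{a,b}(x) + beta * L_{c,d}(1 - x)) /
        (gamma * L_{e,f}(x) + delta * L_{g,h}(1 - x)),
    where L_{a,b}(x) = (sum_i x_i^b)^a and each x_i lies in (0, 1).
    """

    def __init__(self, num_units=9):
        super().__init__()
        # Exponents {a_k, ..., h_k}: uniform init in [0, 1]; must stay positive.
        self.exponents = nn.Parameter(torch.rand(num_units, 8))
        # Coefficients {alpha_k, ..., delta_k}: zero-mean Gaussian, std 0.01,
        # so they can also take negative values.
        self.coeffs = nn.Parameter(0.01 * torch.randn(num_units, 4))

    def forward(self, x):
        # x: (batch, set_size) with entries in (0, 1) -> (batch, num_units)
        x = x.clamp(1e-6, 1 - 1e-6)          # keep fractional powers defined
        a, b, c, d, e, f, g, h = self.exponents.unbind(dim=1)
        alpha, beta, gamma, delta = self.coeffs.unbind(dim=1)

        def L(z, outer, inner):
            # L_{outer,inner}(z) = (sum_i z_i^inner)^outer, one value per unit
            return z.unsqueeze(-1).pow(inner).sum(dim=1).pow(outer)

        num = alpha * L(x, a, b) + beta * L(1 - x, c, d)
        den = gamma * L(x, e, f) + delta * L(1 - x, g, h)
        return num / (den + 1e-6)            # small epsilon for stability

    @torch.no_grad()
    def project(self):
        # Enforce positivity of {a, ..., h} by projection, as described
        # in the Experiment Setup row.
        self.exponents.clamp_(min=1e-3)
```

A training loop would call `laf.project()` right after each `optimizer.step()` to realize the projected-gradient scheme the paper mentions.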
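
To connect this with the Deep Sets comparison in the Research Type row, the sketch below drops the LAF bank into a Deep Sets style φ/ρ architecture in place of the fixed sum aggregator. The layer sizes and the sigmoid used to squash element embeddings into (0, 1) are assumptions for illustration; the paper states that it reuses the architecture and hyper-parameters of the Deep Sets permutation-invariant model of [Zaheer et al., 2017].

```python
import torch
import torch.nn as nn
# Assumes the LAF class from the sketch above.

class DeepSetsLAF(nn.Module):
    """Deep Sets style model with a learnable LAF aggregator (sketch)."""

    def __init__(self, num_units=9, hidden=100):
        super().__init__()
        # phi: per-element encoder; the final sigmoid maps embeddings into
        # (0, 1), the domain LAF expects (hidden size is an assumption).
        self.phi = nn.Sequential(
            nn.Linear(1, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )
        self.agg = LAF(num_units)            # learnable aggregation
        # rho: set-level predictor on the aggregated features.
        self.rho = nn.Sequential(
            nn.Linear(num_units, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):
        # x: (batch, set_size) sets of scalars
        z = self.phi(x.unsqueeze(-1)).squeeze(-1)   # (batch, set_size)
        return self.rho(self.agg(z))                # (batch, 1)

# Example: a batch of 32 sets of 10 scalars, trained with MAE as in the paper.
model = DeepSetsLAF()
x, y = torch.rand(32, 10), torch.rand(32, 1)
loss = nn.L1Loss()(model(x), y)                     # MAE loss
loss.backward()
model.agg.project()                                 # keep exponents positive
```

Because the aggregator is itself learned, the same architecture can recover sum-, mean-, or max-like behavior, which is what the synthetic experiments on set-size generalization probe.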