On the Limitations of Representing Functions on Sets
Authors: Edward Wagstaff, Fabian Fuchs, Martin Engelcke, Ingmar Posner, Michael A. Osborne
ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our main contributions can be summarised as follows. 1. Recent proofs, e.g. in Zaheer et al. (2017), consider functions on countable domains. We explain why considering countable domains can lead to results of limited practical value (i.e. results that cannot be implemented with a neural network), and why considering continuity on uncountable domains such as ℝ is necessary. With reference to neural networks, we ground this discussion in the universal approximation theorem, which relies on continuity on the uncountable domain [0, 1]^M. 2. In contrast to previous work (Zaheer et al., 2017; Qi et al., 2017a), which considers sufficient conditions for universal function representation, we establish a necessary condition for a sum-decomposition-based model to be capable of universal function representation. Additionally, we provide weaker sufficient conditions which imply a stronger version of universality. Specifically, we show that the dimension of the latent space being at least as large as the maximum number of input elements is both necessary and sufficient for universal function representation. While primarily targeted at neural networks, these results hold for any implementation of sum-decomposition, e.g. using Gaussian processes, as long as it provides universal function approximation for continuous functions. Proofs of all novel results are available in Appendix B. An illustrative toy example is also included with test performance analysis. |
| Researcher Affiliation | Academia | Department of Engineering Science, University of Oxford, Oxford, United Kingdom. |
| Pseudocode | No | The paper focuses on theoretical proofs and an illustrative example, but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link regarding the availability of open-source code for the methodology described. |
| Open Datasets | No | The paper states: 'The input sets are randomly drawn from either a uniform, a Gaussian, or a Gamma distribution.' These are synthetic distributions and not a publicly available dataset with concrete access information. |
| Dataset Splits | No | The paper mentions 'Test performance' in Figure 3(a) but does not provide specific details on training, validation, and test splits (e.g., percentages or sample counts) needed for reproduction. |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper states that MLPs are used for parameterization but does not provide specific software dependencies or version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We vary the latent dimension N and the input set size M to investigate the link between these two variables and the predictive performance. The MLPs parameterising φ and ρ are given comparatively many layers and hidden units, relative to the simplicity of the task, to ensure that the latent dimension is the bottleneck. Further details are described in Appendix D. The input sets are randomly drawn from either a uniform, a Gaussian, or a Gamma distribution. Illustrative sketches of this model and setup are given below the table. |
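
For concreteness, the following is a minimal PyTorch sketch of the sum-decomposition architecture the paper analyses, ρ(Σᵢ φ(xᵢ)), with the latent dimension N acting as the bottleneck between the two MLPs. The class name, layer sizes, and activations are illustrative assumptions, not the authors' configuration (the paper defers those details to Appendix D).

```python
import torch
import torch.nn as nn


class SumDecomposition(nn.Module):
    """Sketch of a sum-decomposed set function: rho(sum_i phi(x_i)).

    `latent_dim` plays the role of N in the paper; inputs are sets of M
    scalars. All layer sizes here are placeholder choices.
    """

    def __init__(self, latent_dim: int, hidden: int = 128):
        super().__init__()
        # phi: R -> R^N, applied independently to every set element.
        self.phi = nn.Sequential(
            nn.Linear(1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, latent_dim),
        )
        # rho: R^N -> R, applied to the summed latent representation.
        self.rho = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, M, 1) -- a batch of sets with M scalar elements each.
        latent = self.phi(x).sum(dim=1)  # (batch, N), permutation invariant
        return self.rho(latent)          # (batch, 1)
```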
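
Similarly, a hedged sketch of the data generation and the N/M sweep described in the setup row, reusing the `SumDecomposition` class above. The distribution parameters, the median target (used here only as a stand-in permutation-invariant set function), the grids of N and M values, the optimiser, and the step count are all placeholder assumptions rather than the paper's actual choices.

```python
import torch
import torch.nn.functional as F


def sample_sets(batch: int, set_size: int, dist: str = "uniform") -> torch.Tensor:
    """Draw a batch of input sets of M scalars from one of the three
    distributions named in the setup; returns shape (batch, M, 1)."""
    if dist == "uniform":
        return torch.rand(batch, set_size, 1)
    if dist == "gaussian":
        return torch.randn(batch, set_size, 1)
    if dist == "gamma":
        # Shape/rate parameters are arbitrary placeholders.
        return torch.distributions.Gamma(2.0, 2.0).sample((batch, set_size, 1))
    raise ValueError(f"unknown distribution: {dist}")


# Sweep latent dimension N and set size M; the grids, learning rate, and
# training length below are illustrative, not the paper's values.
for M in (4, 8, 16):
    for N in (2, 4, 8, 16):
        model = SumDecomposition(latent_dim=N)
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        for step in range(1000):
            x = sample_sets(batch=64, set_size=M, dist="uniform")
            # Placeholder permutation-invariant target: the set median.
            target = x.squeeze(-1).median(dim=1).values.unsqueeze(-1)
            loss = F.mse_loss(model(x), target)
            opt.zero_grad()
            loss.backward()
            opt.step()
```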