FSPool: Learning Set Representations with Featurewise Sort Pooling
Authors: Yan Zhang, Jonathon Hare, Adam Prügel-Bennett
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On a toy dataset of polygons and a set version of MNIST, we show that such an auto-encoder produces considerably better reconstructions and representations. Replacing the pooling function in existing set encoders with FSPool improves accuracy and convergence speed on a variety of datasets. (A sketch of the FSPool operation appears after the table.) |
| Researcher Affiliation | Academia | Yan Zhang, University of Southampton, Southampton, UK (yz5n12@ecs.soton.ac.uk); Jonathon Hare, University of Southampton, Southampton, UK (jsh2@ecs.soton.ac.uk); Adam Prügel-Bennett, University of Southampton, Southampton, UK (apb@ecs.soton.ac.uk) |
| Pseudocode | No | The paper describes algorithms using mathematical equations and prose but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Full results can be found in the appendices, experimental details can be found in Appendix H, and we provide our code for reproducibility at [redacted]. |
| Open Datasets | Yes | Next, we turn to the harder task of auto-encoding MNIST images turned into sets of points... CLEVR (Johnson et al., 2017) is a visual question answering dataset... We perform a large number of experiments on various graph classification datasets from the TU repository (Kersting et al., 2016)... |
| Dataset Splits | Yes | We repeat 10-fold cross-validation on each dataset 10 times... The best hyperparameters are selected based on best average validation accuracy across the 10-fold cross-validation, where one of the 9 training folds is used as validation set each time. (A split-construction sketch appears after the table.) |
| Hardware Specification | Yes | Training the FSPool model takes 45 seconds per epoch on a GTX 1080 GPU, only slightly more than the baselines with 37 seconds per epoch. |
| Software Dependencies | No | The paper mentions 'PyTorch' and the 'torch-geometric library (Fey et al., 2018)' but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | We use a batch size of 16 for all three models and train them for 10240 steps. We use the Adam optimiser (Kingma & Ba, 2015) with 0.001 learning rate and their suggested values for the other optimiser parameters (PyTorch defaults). Weights of linear and convolutional layers are initialised as suggested in Glorot & Bengio (2010). The size of every hidden layer is set to 16 and the latent space is set to 1 (it should only need to store the rotation as latent variable). (A training-setup sketch appears after the table.) |
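
For reference, the core featurewise sort pooling operation can be sketched in a few lines of PyTorch. This is a minimal illustration, not the authors' released implementation: it assumes a fixed set size and learns one weight per (feature, rank) pair, whereas the paper parameterises the weights with a continuous piecewise-linear function so that variable set sizes share one calibrator.

```python
import torch
import torch.nn as nn

class NaiveFSPool(nn.Module):
    """Minimal sketch of featurewise sort pooling for fixed-size sets.

    Each feature is sorted independently across the set elements, then
    reduced with a learned per-rank weight. Sorting makes the output
    invariant to the ordering of the input set.
    """

    def __init__(self, n_features: int, set_size: int):
        super().__init__()
        # One learned weight per (feature, rank) pair; a hypothetical
        # simplification of the paper's continuous weight function.
        self.weight = nn.Parameter(torch.randn(n_features, set_size))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, set_size, n_features)
        # Sort every feature independently across the set dimension.
        sorted_x, _ = x.sort(dim=1, descending=True)
        # Weighted sum over ranks yields one vector per feature.
        return (sorted_x * self.weight.t().unsqueeze(0)).sum(dim=1)

pool = NaiveFSPool(n_features=16, set_size=5)
out = pool(torch.randn(8, 5, 16))  # -> (8, 16)
```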
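The split protocol quoted under Dataset Splits maps onto standard tooling. The sketch below assumes scikit-learn's `KFold` is acceptable (the paper does not name a splitting library) and that the validation fold is carved out of the training portion, as the quote describes.

```python
import numpy as np
from sklearn.model_selection import KFold

def repeated_cv_splits(n_samples, n_repeats=10, n_folds=10, seed=0):
    """Yield (train, val, test) index triples for repeated k-fold CV.

    Mirrors the quoted protocol: 10x repeated 10-fold cross-validation,
    with one of the 9 training folds used as the validation set.
    """
    for repeat in range(n_repeats):
        kf = KFold(n_splits=n_folds, shuffle=True, random_state=seed + repeat)
        for train_idx, test_idx in kf.split(np.arange(n_samples)):
            # Hold out one training fold's worth of samples for validation.
            n_val = len(train_idx) // (n_folds - 1)
            val_idx, train_idx = train_idx[:n_val], train_idx[n_val:]
            yield train_idx, val_idx, test_idx

splits = list(repeated_cv_splits(n_samples=1000))
assert len(splits) == 100  # 10 repeats x 10 folds
```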
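Finally, the quoted Experiment Setup row translates directly into a PyTorch configuration. This is a sketch under stated assumptions: the `nn.Sequential` model below is a stand-in for the paper's auto-encoder, not its actual architecture, while the batch size, step count, learning rate, layer sizes, and initialisation follow the quote.

```python
import torch
import torch.nn as nn

HIDDEN, LATENT, BATCH, STEPS, LR = 16, 1, 16, 10240, 1e-3

def init_weights(m: nn.Module) -> None:
    # Glorot & Bengio (2010) initialisation for linear and
    # convolutional layers, as stated in the quoted setup.
    if isinstance(m, (nn.Linear, nn.Conv1d)):
        nn.init.xavier_uniform_(m.weight)
        if m.bias is not None:
            nn.init.zeros_(m.bias)

# Stand-in model: every hidden layer of size 16, latent space of size 1.
model = nn.Sequential(
    nn.Linear(2, HIDDEN), nn.ReLU(),
    nn.Linear(HIDDEN, LATENT),
    nn.Linear(LATENT, HIDDEN), nn.ReLU(),
    nn.Linear(HIDDEN, 2),
)
model.apply(init_weights)

# Adam with lr 0.001 and PyTorch-default betas/eps, per the paper.
optimiser = torch.optim.Adam(model.parameters(), lr=LR)
```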