Faster Attend-Infer-Repeat with Tractable Probabilistic Models
Authors: Karl Stelzner, Robert Peharz, Kristian Kersting
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4. Experiments: In this section we compare SuPAIR to the original AIR system (Eslami et al., 2016), and investigate the following two questions: (Q1) Do tractable appearance models lead to faster and more stable learning, i.e., with smaller variance? (Q2) Does an explicit background model make SuPAIR more robust to noise than AIR? To this end, we implemented SuPAIR in TensorFlow, making use of the RAT-SPN implementation by Peharz et al. (2018). We have also experimented with the SPN structure formulated by Poon & Domingos (2011) for the image domain, but have not found it to deliver significant improvements in learning speed or accuracy. We therefore report the results obtained using the more generally applicable random structures. All experiments were conducted using a single NVIDIA GeForce GTX 1080 Ti and an AMD Ryzen Threadripper 1950X CPU. (A minimal sketch of exact SPN likelihood evaluation follows the table.) |
| Researcher Affiliation | Academia | (1) CS Dept., TU Darmstadt, Darmstadt, Germany; (2) Eng. Dept. (CBL), University of Cambridge, UK; (3) Centre for Cognitive Science, TU Darmstadt, Darmstadt, Germany |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available online. |
| Open Datasets | Yes | We conducted experiments on two standard benchmarks for AIR, each with a different set of objects: Multi-MNIST, using MNIST-digits as objects, and Sprites, a dataset using artificially generated geometric shapes. |
| Dataset Splits | No | The paper only states "20% of each dataset was retained as a test set" but does not explicitly provide information about a separate validation split or the full train/validation/test percentages. |
| Hardware Specification | Yes | All experiments were conducted using a single NVIDIA GeForce GTX 1080 Ti and an AMD Ryzen Threadripper 1950X CPU. |
| Software Dependencies | No | The paper mentions implementing in TensorFlow and using Pyro, but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | No | The paper describes some architectural choices and inductive biases, such as making the background SPN shallower and narrower with a lower limit on the variance of its Gaussian leaf nodes (a hedged sketch of such a variance floor follows the table), and the structure of the RNN. However, it does not provide specific hyperparameter values such as the learning rate, batch size, number of epochs, or optimizer settings. |
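
The "tractable appearance models" in Q1 are sum-product networks (SPNs), whose defining property is that the exact log-likelihood of an image patch is computed in a single bottom-up pass, with no sampling or variational bound. The following minimal sketch is not the authors' code: the toy structure (one sum node over two product nodes over per-pixel Gaussian leaves) and all parameter values are invented for illustration of that single pass.

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import norm

# Toy structure over a 2-pixel patch x = (x0, x1): a root sum node with two
# children, each a product node over per-pixel Gaussian leaves.
leaf_params = {
    0: [(0.1, 0.05), (0.2, 0.05)],  # component 0 (mean, std) per pixel: dark pixels
    1: [(0.8, 0.10), (0.9, 0.10)],  # component 1: bright pixels
}
log_weights = np.log([0.5, 0.5])    # root sum-node mixture weights

def log_likelihood(x):
    """Exact log p(x) via one bottom-up pass, entirely in log-space."""
    comp_ll = []
    for params in leaf_params.values():
        # product node: log-densities of independent leaves add up
        comp_ll.append(sum(norm.logpdf(x[i], m, s)
                           for i, (m, s) in enumerate(params)))
    # sum node: log-sum-exp of the weighted children
    return logsumexp(np.asarray(comp_ll) + log_weights)

print(log_likelihood(np.array([0.12, 0.18])))  # high: fits component 0
print(log_likelihood(np.array([0.85, 0.92])))  # high: fits component 1
```

Because this evaluation is exact and differentiable, the object and background models can be trained by directly maximizing likelihood, which is the mechanism behind the faster, lower-variance learning investigated in Q1.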
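
On the experiment-setup row: the paper states that the background SPN is kept shallower and narrower and that the variance of its Gaussian leaves is bounded below, but it does not report the bound's value or parameterization. One common way to realize such a floor (an assumption here, not the paper's stated method; `SIGMA_MIN` and `leaf_sigma` are hypothetical names) is to learn an unconstrained parameter and map it through a softplus plus a constant:

```python
import numpy as np

SIGMA_MIN = 0.1  # assumed floor; the paper reports a lower limit but not its value

def leaf_sigma(raw_param):
    """Map an unconstrained parameter to a standard deviation >= SIGMA_MIN.

    softplus keeps the value positive and smooth; adding SIGMA_MIN enforces
    the lower bound, so no Gaussian leaf can collapse to near-zero variance.
    """
    return SIGMA_MIN + np.log1p(np.exp(raw_param))  # softplus(raw) + floor

for raw in (-5.0, 0.0, 3.0):
    print(f"raw={raw:+.1f} -> sigma={leaf_sigma(raw):.4f}")
```

In a TensorFlow implementation like the paper's, the same mapping would be applied to each leaf's variance variable before it enters the Gaussian density; the floor acts as an inductive bias that keeps the background model from explaining fine object detail.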