Neural Expectation Maximization
Authors: Klaus Greff, Sjoerd van Steenkiste, Jürgen Schmidhuber
NeurIPS 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach on a perceptual grouping task for generated static images and video. By composing images out of simple shapes we have control over the statistical structure of the data, as well as access to the ground-truth clustering. This allows us to verify that the proposed method indeed recovers the intended grouping and learns representations corresponding to these objects. In particular we are interested in studying the role of next-step prediction as an unsupervised objective for perceptual grouping, the effect of the hyperparameter K, and the usefulness of the learned representations. In all experiments we train the networks using ADAM [19] with default parameters, a batch size of 64 and 50 000 train + 10 000 validation + 10 000 test inputs. Consistent with earlier work [8, 7], we evaluate the quality of the learned groupings with respect to the ground truth while ignoring the background and overlap regions. This comparison is done using the Adjusted Mutual Information (AMI; [35]) score, which provides a measure of clustering similarity between 0 (random) and 1 (perfect match). |
| Researcher Affiliation | Academia | Klaus Greff IDSIA klaus@idsia.ch Sjoerd van Steenkiste IDSIA sjoerd@idsia.ch Jürgen Schmidhuber IDSIA juergen@idsia.ch |
| Pseudocode | No | The paper describes the algorithms (N-EM, RNN-EM) in text and using figures (Figure 1 and Figure 2 illustrate the computational graphs), but it does not include a structured pseudocode block or an algorithm block labeled 'Algorithm' or 'Pseudocode'. |
| Open Source Code | Yes | Code to reproduce all experiments is available at https://github.com/sjoerdvansteenkiste/Neural-EM |
| Open Datasets | Yes | We evaluate our approach on a perceptual grouping task for generated static images and video. By composing images out of simple shapes... We consider a sequential extension of MNIST. Here each sequence consists of gray-scale 24 × 24 images containing two down-sampled MNIST digits... |
| Dataset Splits | Yes | In all experiments we train the networks using ADAM [19] with default parameters, a batch size of 64 and 50 000 train + 10 000 validation + 10 000 test inputs. |
| Hardware Specification | Yes | We are grateful to NVIDIA Corporation for donating us a DGX-1 as part of the Pioneers of AI Research award, and to IBM for donating a Minsky machine. |
| Software Dependencies | No | The paper mentions using ADAM [19] for training but does not provide specific version numbers for any software dependencies, such as programming languages, deep learning frameworks, or other libraries. |
| Experiment Setup | Yes | In all experiments we train the networks using ADAM [19] with default parameters, a batch size of 64 and 50 000 train + 10 000 validation + 10 000 test inputs. Both networks are trained with K = 3 and unrolled for 15 EM steps. ... a recurrent neural network of 100 sigmoidal units ... x(t) is the current frame corrupted with additional bitflip noise (p = 0.2). ... Gaussian distribution for each pixel with fixed σ² = 0.25 and µ = ψ_{i,k} ... masked uniform noise: we first sample a binary mask from a multi-variate Bernoulli distribution with p = 0.2 and then use this mask to interpolate between the original image and samples from a Uniform distribution between the minimum and maximum values of the data (0, 1). We train with K = 2 and T = 20 on flying MNIST. |
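
The evaluation protocol quoted in the Research Type row scores the learned groupings with the Adjusted Mutual Information (AMI) against the ground-truth clustering while ignoring background and overlap regions. A minimal sketch of that scoring step is given below, using scikit-learn's `adjusted_mutual_info_score`; the paper does not name an implementation, and the labelling convention used here (0 for background, -1 for overlaps) is an assumption:

```python
import numpy as np
from sklearn.metrics import adjusted_mutual_info_score

def grouping_ami(true_groups, pred_gamma):
    """AMI between a ground-truth pixel grouping and a predicted one.

    true_groups: int array with per-pixel ground-truth cluster ids, where
                 (by assumption) 0 marks background and -1 marks overlaps.
    pred_gamma:  array of shape (K, num_pixels) with soft group assignments;
                 the predicted grouping is their per-pixel argmax.
    Returns the AMI score, roughly 0 for a random grouping and 1 for a
    perfect match with the ground truth.
    """
    pred_groups = np.argmax(pred_gamma, axis=0)
    true_flat = np.asarray(true_groups).ravel()
    # Ignore background and overlap pixels, as in the paper's protocol.
    keep = true_flat > 0
    return adjusted_mutual_info_score(true_flat[keep], pred_groups[keep])
```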
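
As the Pseudocode row notes, the paper describes N-EM and RNN-EM in text and figures rather than in an algorithm block. Under the pixel model quoted in the Experiment Setup row, each pixel is modelled with a Gaussian of fixed variance σ² = 0.25 whose mean ψ_{i,k} is decoded from component k, so the soft pixel-to-component assignments follow the standard mixture posterior. The sketch below shows only that generic E-step under a uniform mixing prior; it is not the authors' full algorithm (the M-step, in which an RNN replaces the analytic update, is omitted):

```python
import numpy as np

def e_step(x, psi, sigma2=0.25):
    """Soft pixel-to-component assignments for a spatial mixture.

    x:   array of shape (num_pixels,) with the (corrupted) input image.
    psi: array of shape (K, num_pixels) with each component's predicted
         per-pixel means, decoded from the K latent representations.
    Returns gamma of shape (K, num_pixels): the posterior responsibility of
    each component for each pixel, assuming a uniform mixing prior and a
    Gaussian pixel likelihood with fixed variance sigma2.
    """
    # Per-pixel Gaussian log-likelihood under each of the K components.
    log_lik = -0.5 * (np.log(2 * np.pi * sigma2) + (x - psi) ** 2 / sigma2)
    # Normalize over components (log-sum-exp for numerical stability).
    log_gamma = log_lik - np.logaddexp.reduce(log_lik, axis=0, keepdims=True)
    return np.exp(log_gamma)
```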
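
The Experiment Setup row also quotes two input corruptions: bitflip noise (p = 0.2) for the binary shape data and masked uniform noise for flying MNIST. The sketch below follows the quoted descriptions; the function names and the NumPy-based implementation are assumptions, and the authors' released code may differ in detail:

```python
import numpy as np

rng = np.random.default_rng(0)

def bitflip_noise(x, p=0.2):
    """Flip each pixel of a binary image with probability p
    (the corruption quoted for the binary shape data)."""
    flips = rng.random(x.shape) < p
    return np.where(flips, 1.0 - x, x)

def masked_uniform_noise(x, p=0.2, lo=0.0, hi=1.0):
    """Masked uniform noise as quoted for flying MNIST: sample a binary
    Bernoulli(p) mask, then use it to interpolate between the original
    image and samples from a Uniform(lo, hi), with (lo, hi) the minimum
    and maximum values of the data (0, 1)."""
    mask = rng.random(x.shape) < p
    noise = rng.uniform(lo, hi, size=x.shape)
    return np.where(mask, noise, x)
```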