Variational Neural Cellular Automata
Authors: Rasmus Berg Palm, Miguel González Duque, Shyam Sudhakaran, Sebastian Risi
ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We find that the VNCA learns to reconstruct samples well and that despite its relatively few parameters and simple local-only communication, the VNCA can learn to generate a large variety of output from information encoded in a common vector format. While there is a significant gap to the current state-of-the-art in terms of generative modeling performance, we show that the VNCA can learn a purely self-organizing generative process of data. Additionally, we show that the VNCA can learn a distribution of stable attractors that can recover from significant damage. |
| Researcher Affiliation | Academia | Creative AI Lab, IT University of Copenhagen, Copenhagen, Denmark. {rasmb, migd, sebr}@itu.dk, shyamsnair@protonmail.com |
| Pseudocode | Yes | Appendix A.3 (POOL AND DAMAGE TRAINING) gives Python-style pseudocode for the pool-and-damage training loop; a cleaned-up listing with the extracted line numbers stripped is reproduced below the table. |
| Open Source Code | Yes | The code to reproduce every experiment is available at github.com/rasmusbergpalm/vnca. |
| Open Datasets | Yes | For our first experiment we chose MNIST, since it is widely used in the generative modeling literature and a relatively easy dataset. We use the statically binarized MNIST from Larochelle & Murray (2011). and The CelebA dataset contains 202,599 images of celebrity faces (Liu et al., 2015) and has been extensively used in the generative modeling literature. and The Noto Emoji font contains 2656 vector graphic emojis (https://github.com/googlefonts/noto-emoji). This dataset allows for comparison with previous NCA auto-encoder work, which uses it (Frans, 2021; Chen & Wang, 2020; Ruiz et al., 2020; Mordvintsev et al., 2020). |
| Dataset Splits | Yes | Architectures and hyper-parameters were heuristically explored and chosen based on their performance on the validation sets and the memory limits of our GPUs. and We use a single sample to compute the ELBO when training and measure final log-likelihoods on the test set using 128 importance weighted samples and We train a VNCA on 64×64 emoji examples using K = 5 doublings and using 20% (531) of the images as a test set. |
| Hardware Specification | No | Architectures and hyper-parameters were heuristically explored and chosen based on their performance on the validation sets and the memory limits of our GPUs. This only mentions "GPUs" generally, without specific models or configurations. |
| Software Dependencies | No | The paper describes model architectures using PyTorch's module format in the appendix (e.g., A.1, A.5, A.6), but does not specify version numbers for PyTorch, Python, or other software dependencies. |
| Experiment Setup | Yes | Except where otherwise noted, we use a batch size of 32, Adam optimizer (Kingma & Ba, 2014), 10⁻⁴ learning rate, L = 1 logistic mixture component, clip the gradient norm to 10 (Pascanu et al., 2013) and train the VNCA for 100,000 gradient updates. |
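For readability, the Appendix A.3 pool-and-damage training pseudocode cited in the table is reproduced here with the extracted line numbers stripped and indentation restored. The helpers `converged`, `damage_half`, `encode`, `NCA`, and `ELBO` are assumed to be defined as in the paper; only minor Python fixes were applied (`backward()` for `backwards()`, an explicit `None` check).

```python
import random


def pool_dmg_train(p_z_0, data, batch_size, pool_size, optim):
    pool = []
    while not converged():
        # Sample a fresh mini-batch of observations
        x = random.sample(data, batch_size)
        n_pool_samples = batch_size // 2
        z_pool = None
        if len(pool) > n_pool_samples:
            # Replace half the batch with previously generated states from the pool
            x_pool, z_pool = pool[:n_pool_samples]
            x[n_pool_samples:] = x_pool
            z_pool = damage_half(z_pool)
        q_z_0, z_0 = encode(x)
        if z_pool is not None:
            z_0[n_pool_samples:] = z_pool
        p_x_given_z_T, z_T = NCA(z_0)  # Run the NCA to its final state
        L = ELBO(p_x_given_z_T, x, q_z_0, p_z_0)
        L.backward()
        optim.step()
        # Add the new states to the pool and keep it bounded
        pool.append((x, z_T))
        random.shuffle(pool)
        pool = pool[:pool_size]
```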
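The "128 importance weighted samples" quoted under Dataset Splits refers to the standard importance-weighted estimate of the test log-likelihood. As a sketch in our own notation (not quoted from the paper), with K = 128 samples drawn from the encoder:

$$\log p(x) \;\approx\; \log \frac{1}{K} \sum_{k=1}^{K} \frac{p(x \mid z_k)\, p(z_k)}{q(z_k \mid x)}, \qquad z_k \sim q(z \mid x),\ K = 128.$$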
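Below is a minimal PyTorch sketch of how the quoted Experiment Setup defaults could be wired up (batch size 32, Adam with a 10⁻⁴ learning rate, gradient norm clipped to 10, 100,000 gradient updates). The model and loss are placeholders, not the authors' VNCA or ELBO.

```python
import torch
from torch import nn

model = nn.Linear(256, 256)  # placeholder for the VNCA
optim = torch.optim.Adam(model.parameters(), lr=1e-4)

for step in range(100_000):                        # 100,000 gradient updates
    x = torch.rand(32, 256)                        # placeholder batch of size 32
    loss = ((model(x) - x) ** 2).mean()            # placeholder for the negative ELBO
    optim.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10.0)
    optim.step()
```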