Self-Similarity Priors: Neural Collages as Differentiable Fractal Representations

Authors: Michael Poli, Winnie Xu, Stefano Massaroli, Chenlin Meng, Kuno Kim, Stefano Ermon

NeurIPS 2022

Reproducibility Checklist: Variable | Result | LLM Response
Research Type | Experimental | In data compression, Neural Collages preserve advantages of fractal compression methods, with up to 10× (accounting for training time) and 100× (at test time) speedups during encoding. Further, we investigate deep generative models based on Neural Collages, where Collage parameters assume the role of latent variables of a hierarchical variational autoencoder (VAE) (Kingma and Welling, 2013). Collage VAEs can sample at resolutions unseen during training by decoding at higher resolutions through their Collage operator, revealing additional detail over upsampling via interpolation. Finally, we showcase applications for fractal art, where an image can be "fractalized", i.e., reconstructed as a collage of smaller copies of itself at different scales (see Figure 1, top).
Researcher Affiliation | Academia | Michael Poli (Stanford University, DiffEqML); Winnie Xu (University of Toronto); Stefano Massaroli (Mila, DiffEqML); Chenlin Meng (Stanford University); Kuno Kim (Stanford University); Stefano Ermon (Stanford University, CZ Biohub)
Pseudocode | Yes | Pseudocode for a single step of a Collage is shown below. Figure 2 provides a visualization of the convergence of a Collage to its fixed point (after repeated application of the operator).

```python
def collage_operator(self, z, collage_weight, collage_bias):
    """Collage Operator (decoding). Performs the steps described in Def. 3.1, Figure 2."""
    # Split the current iterate z into source patches according to the partitioning scheme.
    domains = img_to_patches(z)
    # Pool domains (pre-augmentation) to range patch sizes.
    pooled_domains = pool(domains)
    # If needed, produce additional candidate source patches as augmentations of existing
    # domains, or concatenate auxiliary patches parametrized and optimized directly.
    if self.n_aug_transforms > 1:
        pooled_domains = self.generate_candidates(pooled_domains)
    pooled_domains = repeat(pooled_domains, 'b c d h w -> b c d r h w', r=self.num_ranges)
    # Apply the affine maps to source patches.
    range_domains = einsum('bcdrhw, bcdr -> bcrhw', pooled_domains, collage_weight)
    range_domains = range_domains + collage_bias[..., None, None]
    # Reconstruct data by composing the output patches back together.
    z = patches_to_img(range_domains)
    return z
```
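The pseudocode above depends on model-specific helpers (`img_to_patches`, `pool`, `generate_candidates`) that are not shown. As a self-contained illustration of the same idea, the sketch below implements a toy single-channel collage operator in NumPy and iterates it to its fixed point; the patch sizes, weight scaling, and helper implementations are assumptions for this example, not the paper's code.

```python
import numpy as np

def img_to_patches(z, size):
    """Split a square image (H, H) into non-overlapping (size, size) patches."""
    n = z.shape[0] // size
    return z.reshape(n, size, n, size).swapaxes(1, 2).reshape(n * n, size, size)

def patches_to_img(patches, size, H):
    """Inverse of img_to_patches: tile (size, size) patches back into an (H, H) image."""
    n = H // size
    return patches.reshape(n, n, size, size).swapaxes(1, 2).reshape(H, H)

def pool(patches, factor):
    """Average-pool each patch by `factor` along both spatial axes."""
    k, h, w = patches.shape
    return patches.reshape(k, h // factor, factor, w // factor, factor).mean(axis=(2, 4))

def collage_operator(z, weight, bias, domain_size, range_size):
    """One decoding step: mix pooled domain patches into every range patch."""
    domains = pool(img_to_patches(z, domain_size), domain_size // range_size)
    # weight: (n_ranges, n_domains) affine coefficients; bias: (n_ranges,)
    ranges = np.einsum('rd,dhw->rhw', weight, domains) + bias[:, None, None]
    return patches_to_img(ranges, range_size, z.shape[0])

# Decode from an arbitrary start; contractive weights guarantee a unique fixed point.
H, domain_size, range_size = 16, 8, 4
n_dom, n_rng = (H // domain_size) ** 2, (H // range_size) ** 2
rs = np.random.default_rng(0)
weight = rs.standard_normal((n_rng, n_dom))
weight *= 0.2 / np.abs(weight).sum(axis=1, keepdims=True)  # L1 row norm 0.2 -> contraction
bias = rs.standard_normal(n_rng)
z = np.zeros((H, H))
for _ in range(30):
    z = collage_operator(z, weight, bias, domain_size, range_size)
```

Because each row of `weight` has L1 norm well below 1 and pooling is an average, the operator is a contraction in the sup norm, so the iteration converges to the same image regardless of the starting `z`.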
Open Source Code | Yes | The code is available at github.com/ermongroup/self-similarity-prior.
Open Datasets | Yes | To investigate the properties of Collage VAEs, including the quality of magnified samples, we compare VDVAEs and Collage VAEs on dynamically binarized MNIST. We consider compressing images obtained from the DOTA large-scale aerial images dataset (Xia et al., 2018).
Dataset Splits | No | The paper mentions training on 80,000 crops and evaluating on 10 held-out images for DOTA, and training with different β values for MNIST, but does not explicitly describe a separate validation split.
Hardware Specification | No | The main text does not specify GPU/CPU models, memory amounts, or other hardware details. The checklist answers "Yes" to reporting the total compute and type of resources used, but those details do not appear in the visible text.
Software Dependencies | No | The paper implies PyTorch (via Torchdyn and the GitHub repository name) but does not provide version numbers for it or any other software dependencies.
Experiment Setup | Yes | We train several Collage VAEs and VDVAEs with β ranging in {0.5, 0.7, 1.0, 1.2, 1.5}, and report the rate-distortion curve in Figure 4 (right). We optimize convolutional encoder parameters θ and auxiliary sources u on the reconstruction objective J(x, z(θ, u)) = Σᵢ₌₁ᵐ (xᵢ − zᵢ(θ, u))² + ‖w‖₂².
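Read literally, this objective is a sum of squared per-pixel reconstruction errors plus an L2 penalty on the Collage weights. A minimal sketch in NumPy, where the regularization coefficient `lam` is a hypothetical parameter not stated in the excerpt:

```python
import numpy as np

def collage_objective(x, z_hat, w, lam=1.0):
    """Squared reconstruction error plus an L2 penalty on the Collage weights w.
    `lam` is a hypothetical regularization coefficient, not given in the excerpt."""
    return np.sum((x - z_hat) ** 2) + lam * np.sum(w ** 2)
```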