Representational aspects of depth and conditioning in normalizing flows

Authors: Frederic Koehler, Viraj Mehta, Andrej Risteski

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We also show that shallow affine coupling networks are universal approximators in Wasserstein distance if ill-conditioning is allowed, and experimentally investigate related phenomena involving padding. On the empirical side, we explore the effect that different types of padding have on training across various synthetic datasets. (See the padding sketch below the table.)
Researcher Affiliation | Academia | (1) Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA, USA; (2) Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA; (3) Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA, USA.
Pseudocode | No | The paper contains mathematical proofs and descriptions of methods but does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any specific links to source code or statements about its availability.
Open Datasets | No | The experiments use 'synthetic datasets' and 'synthetic training data' generated by the authors, but no access information (link, DOI, citation) is provided for a publicly available or open dataset.
Dataset Splits | No | The paper mentions 'training data' and 'test data' for its synthetic experiments but does not provide specific information about training, validation, or test dataset splits (percentages, sample counts, or citations to predefined splits).
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions the use of a 'Real NVP architecture' but does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiments.
Experiment Setup | Yes | In Figure 1 we show the performance of a simple Real NVP architecture trained via max-likelihood on a mixture of 4 Gaussians, as well as plot the condition number of the Jacobian during training for each padding method. Precisely, we generate (synthetic) training data of the form $Az$, where $z \sim N(0, I)$, for a fixed $d \times d$ square matrix $A$ with random standard Gaussian entries, and train a linear affine coupling network with $n = 1, 2, 4, 8, 16$ layers by minimizing the loss $\mathbb{E}_{z \sim N(0, I)} \|f_n(z) - Az\|^2$. (A minimal training sketch follows the table.)
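Below is a minimal sketch of the linear affine coupling experiment quoted in the Experiment Setup row. It assumes PyTorch, additive coupling layers with a bias-free linear shift, dimension d = 8, batch size 256, Adam with learning rate 1e-2, and 2,000 steps per depth; none of these specifics appear in the excerpt, so treat them as illustrative stand-ins rather than the paper's configuration.

```python
import torch

torch.manual_seed(0)
d = 8                              # data dimension (assumed; the excerpt does not state d)
A = torch.randn(d, d)              # fixed square matrix with i.i.d. standard Gaussian entries


class LinearCoupling(torch.nn.Module):
    """Additive coupling layer: one half of x passes through unchanged,
    the other half is shifted by a linear function of the first half."""

    def __init__(self, dim, flip):
        super().__init__()
        self.flip = flip
        self.shift = torch.nn.Linear(dim // 2, dim // 2, bias=False)

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=1)
        if self.flip:                      # alternate which half is transformed
            x1, x2 = x2, x1
        y1, y2 = x1, x2 + self.shift(x1)
        if self.flip:
            y1, y2 = y2, y1
        return torch.cat([y1, y2], dim=1)


def make_flow(n_layers):
    # Alternating partitions so that every coordinate is eventually updated.
    return torch.nn.Sequential(
        *[LinearCoupling(d, flip=(i % 2 == 1)) for i in range(n_layers)]
    )


for n in [1, 2, 4, 8, 16]:             # depths matching the quoted setup
    flow = make_flow(n)
    opt = torch.optim.Adam(flow.parameters(), lr=1e-2)
    for _ in range(2000):              # step count assumed for illustration
        z = torch.randn(256, d)        # z ~ N(0, I)
        loss = ((flow(z) - z @ A.T) ** 2).sum(dim=1).mean()  # E ||f_n(z) - Az||^2
        opt.zero_grad()
        loss.backward()
        opt.step()
    print(f"n = {n:2d} layers, final batch loss = {loss.item():.4f}")
```

The alternating flip matters for this sketch: with a fixed partition, half of the coordinates would pass through every layer unchanged, so no depth could fit a general linear map Az.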
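For the padding experiments referenced under Research Type, the general recipe is to lift low-dimensional data into a higher-dimensional space by appending extra coordinates before fitting a flow. The sketch below contrasts two common choices, zero padding and independent Gaussian padding; the exact schemes and the noise scale (0.1 here) compared in the paper are assumptions for illustration.

```python
import torch

torch.manual_seed(0)
n, d_extra = 1024, 2
x = torch.randn(n, 2)  # stand-in for 2-D synthetic data (e.g. a Gaussian mixture)

# Zero padding: append deterministic zero coordinates. The padded data then lies
# on a lower-dimensional subspace, which is the kind of degeneracy that shows up
# in the Jacobian condition-number plots described in the Experiment Setup row.
x_zero = torch.cat([x, torch.zeros(n, d_extra)], dim=1)

# Gaussian padding: append small independent Gaussian noise, so the padded
# distribution has full support in the lifted space.
x_gauss = torch.cat([x, 0.1 * torch.randn(n, d_extra)], dim=1)
```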