ButterflyFlow: Building Invertible Layers with Butterfly Matrices

Authors: Chenlin Meng, Linqi Zhou, Kristy Choi, Tri Dao, Stefano Ermon

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, we demonstrate that ButterflyFlows not only achieve strong density estimation results on natural images such as MNIST, CIFAR-10, and ImageNet-32×32, but also obtain significantly better log-likelihoods on structured datasets such as galaxy images and MIMIC-III patient cohorts, all while being more efficient in terms of memory and computation than relevant baselines.
Researcher Affiliation | Academia | Computer Science Department, Stanford University. Correspondence to: Chenlin Meng <chenlin@cs.stanford.edu>, Linqi Zhou <linqizhou@stanford.edu>, Kristy Choi <kechoi@cs.stanford.edu>, Stefano Ermon <ermon@cs.stanford.edu>.
Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found in the paper.
Open Source Code | No | The paper does not explicitly state that the source code for its methodology is publicly available, nor does it provide a direct link to a repository for its own implementation. It only references a GitHub page for data preprocessing used by a third party.
Open Datasets | Yes | Datasets. As in prior works (Hoogeboom et al., 2019; Lu & Huang, 2020), we evaluate our method's performance on MNIST (Deng, 2012), CIFAR-10 (Krizhevsky et al., 2009), and ImageNet-32×32 (Deng et al., 2009). MIMIC-III waveform database... More details about the dataset are available at https://physionet.org/content/mimic3wdb/1.0/.
Dataset Splits | No | The paper references test sets and uses standard datasets such as MNIST and CIFAR-10, which typically have predefined splits, but it does not explicitly state the training/validation/test splits (percentages, sample counts, or split methodology) needed for full reproducibility.
Hardware Specification | Yes | All testing is done on a TITAN XP GPU.
Software Dependencies | No | The paper mentions using the 'Adam optimizer' and 'PyTorch' but does not provide specific version numbers for these or any other software dependencies, making it difficult to fully reproduce the software environment.
Experiment Setup | Yes | For all experiments, we use the Adam optimizer with α = 0.001, β1 = 0.9, β2 = 0.999 for training. We warm up the learning rate by linearly increasing it from 0 to the initial learning rate over the first 10 iterations, and afterwards decay it exponentially with γ = 0.999997 per iteration.
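
The paper's training code is not released, so the following PyTorch sketch is only a hedged illustration of the quoted optimizer settings and learning-rate schedule. The model, loss, and training loop are placeholders; only init_lr, the betas, the 10-iteration linear warmup, and the per-iteration decay factor γ = 0.999997 come from the quoted text, and the use of LambdaLR to express the schedule is an assumption.

# Hypothetical sketch of the quoted training setup; everything beyond the stated
# hyperparameters (model, data, loss) is a placeholder assumption.
import torch

model = torch.nn.Linear(784, 784)  # placeholder, not the actual ButterflyFlow model

init_lr = 1e-3        # alpha = 0.001
warmup_iters = 10     # linear warmup from 0 to init_lr
gamma = 0.999997      # exponential decay factor per iteration after warmup

optimizer = torch.optim.Adam(model.parameters(), lr=init_lr, betas=(0.9, 0.999))

def lr_lambda(step: int) -> float:
    # Multiplier on init_lr: linear warmup for the first warmup_iters steps,
    # then exponential decay by gamma per iteration.
    if step < warmup_iters:
        return step / warmup_iters
    return gamma ** (step - warmup_iters)

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

for step in range(100):  # placeholder training loop
    optimizer.zero_grad()
    loss = model(torch.randn(32, 784)).pow(2).mean()  # placeholder loss
    loss.backward()
    optimizer.step()
    scheduler.step()  # advance the per-iteration schedule

Under this schedule the learning rate reaches 0.001 at iteration 10 and then decays very slowly (for example, by roughly a factor of e only after about 333,000 further iterations), consistent with a per-iteration decay of 0.999997.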