Training Normalizing Flows from Dependent Data

Authors: Matthias Kirchler, Christoph Lippert, Marius Kloft

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show that respecting dependencies between observations can improve empirical results on both synthetic and real-world data, and leads to higher statistical power in a downstream application to genome-wide association studies. We then apply our proposed method to three high-impact real-world settings. In all experiments, we find that adjustment for dependencies can significantly improve the model fit of normalizing flows.
Researcher Affiliation | Academia | (1) Hasso Plattner Institute for Digital Engineering, University of Potsdam, Germany; (2) University of Kaiserslautern-Landau, Germany; (3) Hasso Plattner Institute for Digital Health at the Icahn School of Medicine at Mount Sinai, New York.
Pseudocode | No | The paper includes a mathematical proposition and its proof in Appendix A, but no section or block labeled "Pseudocode" or "Algorithm" with structured steps.
Open Source Code | Yes | We release our code at https://github.com/mkirchler/dependent_data_flows.
Open Datasets | Yes | The Labeled Faces in the Wild (LFW; Huang et al., 2008) data set consists of facial images... The UK Biobank (UKB; Bycroft et al., 2018) provides rich phenotyping and genotyping... The Alzheimer's Disease Neuroimaging Initiative (ADNI; Jack Jr et al., 2008) is a longitudinal study...
Dataset Splits | Yes | We split data into train (70%), validation (15%), and test (15%) data temporally (non-randomly) to counteract information leakage. ...choose the best model for each setting based on early stopping and validation set performance. (A minimal split sketch follows the table.)
Hardware Specification | Yes | All models were trained with a batch-size of 64 and learning rate and weight decay of 0.001 on a single A100 GPU.
Software Dependencies | Yes | We used the GEMMA software version 0.98.5 (Zhou & Stephens, 2012) with score tests (option -lmm 3) and centered relatedness matrix (option -gk 1). (A sketch of these GEMMA calls follows the table.)
Experiment Setup | Yes | We train all models for 100 epochs, perform a small hyperparameter sweep over learning rate (in {0.001, 0.003, 0.01, 0.03}) and weight decay (in {0.001, 0.01, 0.1})... ADNI models were trained for 200 and LFW models for 400 epochs. All models were trained with a batch-size of 64 and learning rate and weight decay of 0.001... (A sketch of this sweep follows the table.)
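
The temporal split described under Dataset Splits can be reproduced in a few lines; the following is a minimal sketch assuming each observation carries a timestamp (the function and variable names are illustrative, not taken from the released code).

    import numpy as np

    def temporal_split(timestamps, train_frac=0.70, val_frac=0.15):
        """Split indices by time (non-randomly) to counteract information leakage."""
        order = np.argsort(timestamps)            # oldest observations first
        n = len(order)
        n_train = int(train_frac * n)
        n_val = int(val_frac * n)
        train_idx = order[:n_train]               # earliest 70%
        val_idx = order[n_train:n_train + n_val]  # next 15%
        test_idx = order[n_train + n_val:]        # most recent ~15%
        return train_idx, val_idx, test_idx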
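
The Experiment Setup row reports a grid over learning rate and weight decay with model selection by early stopping on the validation set. The sketch below shows that sweep; train_and_validate is a hypothetical stand-in for the authors' training loop and is not part of the released code.

    from itertools import product

    learning_rates = [0.001, 0.003, 0.01, 0.03]
    weight_decays = [0.001, 0.01, 0.1]

    def train_and_validate(lr, weight_decay, batch_size=64, epochs=100):
        """Placeholder for the actual training routine; returns a validation loss."""
        return 0.0  # dummy value so the sketch runs end to end

    best_loss, best_config = float("inf"), None
    for lr, wd in product(learning_rates, weight_decays):
        val_loss = train_and_validate(lr, wd)  # early stopping inside the training loop
        if val_loss < best_loss:
            best_loss, best_config = val_loss, (lr, wd)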
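
For the GEMMA dependency noted under Software Dependencies, the two calls below show how the reported options are typically combined (centered relatedness matrix with -gk 1, then score tests with -lmm 3); the PLINK file prefix and output names are placeholders, not taken from the paper.

    import subprocess

    bfile = "genotypes"  # placeholder PLINK binary prefix (.bed/.bim/.fam)

    # Centered relatedness matrix (option -gk 1); GEMMA writes output/grm.cXX.txt
    subprocess.run(["gemma", "-bfile", bfile, "-gk", "1", "-o", "grm"], check=True)

    # Linear mixed model association with score tests (option -lmm 3)
    subprocess.run(["gemma", "-bfile", bfile, "-k", "output/grm.cXX.txt",
                    "-lmm", "3", "-o", "assoc"], check=True)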