Fat-Tailed Variational Inference with Anisotropic Tail-Adaptive Flows

Authors: Feynman Liang, Michael Mahoney, Liam Hodgkinson

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on both synthetic and real-world targets confirm that ATAF is competitive with prior work while also exhibiting appropriate tail-anisotropy.
Researcher Affiliation | Collaboration | (1) Department of Statistics, University of California, Berkeley, CA; (2) Meta, Menlo Park, CA; (3) International Computer Science Institute, Berkeley, CA.
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | Yes | we have open-sourced code for reproducing experiments in Supplementary Materials.
Open Datasets | Yes | diamonds dataset (Wickham, 2011) included in posteriordb (The Stan Developers, 2021). ... The eight-schools model (Rubin, 1981; Gelman et al., 2013). ... benchmark datasets from financial (daily log returns for five industry indices during 1926-2021 (Fama & French, 2015)) and actuarial (per-patient inpatient and outpatient cumulative Medicare/Medicaid (CMS) claims during 2008-2010 (Centers for Medicare and Medicaid Services, 2010)) applications.
Dataset Splits | No | The paper mentions running trials with a certain number of descent steps and samples for ELBO estimation, and uses 'golden samples' for marginal likelihoods, but it does not specify explicit train/validation/test splits for the datasets themselves.
Hardware Specification | Yes | All experiments were performed on an Intel i7-8700K with 32GB RAM and an NVIDIA GTX 1080 running PyTorch 1.9.0 / Python 3.8.5 / CUDA 11.2 / Ubuntu Linux 20.04 via Windows Subsystem for Linux.
Software Dependencies | Yes | All experiments were performed on an Intel i7-8700K with 32GB RAM and an NVIDIA GTX 1080 running PyTorch 1.9.0 / Python 3.8.5 / CUDA 11.2 / Ubuntu Linux 20.04 via Windows Subsystem for Linux.
Experiment Setup | Yes | For all flow-transforms Φ_Flow, we used inverse autoregressive flows (Kingma et al., 2016) with a dense autoregressive conditioner consisting of two layers of either 32 or 256 hidden units depending on the problem (see code for details) and ELU activation functions. ... Models were trained using the Adam optimizer with a 10^-3 learning rate for 10000 iterations.
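
To make the reported setup concrete, here is a minimal sketch, assuming Pyro's AffineAutoregressive / AutoRegressiveNN implementation of inverse autoregressive flows (Kingma et al., 2016), of how a fat-tailed flow-based variational family along these lines could be assembled and trained. The dimensionality `dim`, the placeholder `target_log_prob`, the batch size of 64, the single flow layer, and the softplus parameterization of the per-dimension tail parameters `nu` are illustrative assumptions and not details taken from the paper or its released code; only the conditioner width/depth, the ELU activations, and the Adam schedule (learning rate 10^-3, 10000 iterations) follow the quoted setup.

```python
# Minimal sketch (not the authors' released code): an inverse-autoregressive-flow
# variational family with a per-dimension Student-t base distribution, trained by
# maximizing a Monte Carlo estimate of the ELBO.
import torch
import pyro.distributions as dist
from pyro.distributions.transforms import AffineAutoregressive
from pyro.nn import AutoRegressiveNN

dim = 8  # dimensionality of the target posterior (illustrative)

def target_log_prob(z):
    # Hypothetical stand-in for an unnormalized, fat-tailed target log-density.
    return dist.StudentT(3.0, torch.zeros(dim), torch.ones(dim)).log_prob(z).sum(-1)

# One IAF layer: dense autoregressive conditioner with two hidden layers of
# 32 units and ELU activations (the paper also reports 256-unit variants).
conditioner = AutoRegressiveNN(dim, [32, 32], nonlinearity=torch.nn.ELU())
iaf = AffineAutoregressive(conditioner)

# Unconstrained per-dimension tail parameters; softplus keeps the Student-t
# degrees of freedom positive (an illustrative parameterization).
nu = torch.full((dim,), 2.0, requires_grad=True)

optimizer = torch.optim.Adam(list(iaf.parameters()) + [nu], lr=1e-3)

for step in range(10_000):
    optimizer.zero_grad()
    # Anisotropic Student-t base: each coordinate gets its own tail index.
    base = dist.StudentT(
        torch.nn.functional.softplus(nu), torch.zeros(dim), torch.ones(dim)
    ).to_event(1)
    flow = dist.TransformedDistribution(base, [iaf])
    z = flow.rsample((64,))                        # reparameterized samples from q
    elbo = (target_log_prob(z) - flow.log_prob(z)).mean()
    (-elbo).backward()                             # maximize ELBO = minimize -ELBO
    optimizer.step()
```

Replacing the Student-t base with a standard normal recovers an ordinary normalizing-flow family (as in ADVI), while the per-dimension degrees of freedom give the anisotropic tail behavior that ATAF targets.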