Marginal Tail-Adaptive Normalizing Flows

Authors: Mike Laszkiewicz, Johannes Lederer, Asja Fischer

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | An empirical analysis shows that the proposed method improves on the accuracy, especially on the tails of the distribution, and is able to generate heavy-tailed data. We demonstrate its application on a weather and climate example, in which capturing the tail behavior is essential.
Researcher Affiliation | Academia | 1 Faculty of Mathematics, Ruhr University Bochum, Germany; 2 Center of Computer Science, Bochum, Germany.
Pseudocode | Yes | Algorithm 1: Marginal Tail Estimation (see the tail-estimation sketch after this table).
Open Source Code | Yes | We provide a PyTorch implementation and the code for all experiments, which can be accessed through our public git repository.
Open Datasets | Yes | We apply mTAF to generate new data following the distribution of the EUMETSAT Numerical Weather Prediction Satellite Application Facility (NWP-SAF) dataset (Eresmaa & McNally, 2014).
Dataset Splits | Yes | To construct the training, test, and validation sets, 15 000, 75 000, and 10 000 samples from this distribution are sampled, respectively. [...] Training, validation, and test sets consist of 50 000, 10 000, and 75 000 samples, respectively. (A split sketch follows the table.)
Hardware Specification | No | The paper does not specify any particular hardware (e.g., CPU or GPU models, memory) used for the experiments.
Software Dependencies | No | The paper mentions a 'PyTorch implementation' but does not provide specific version numbers for PyTorch or any other software dependencies.
Experiment Setup | Yes | We optimized the network using Adam with 5 000 or 20 000 train steps in the case D = 8 and D = 50, respectively, with a learning rate of 1e-5 and a weight decay of 1e-6. To fit the Gaussian copula baseline, we use the default settings of the copulas (Patki et al., 2016) library. [...] In the NSF layers, we used conditioner ResNets with 2 hidden layers with 30 or 200 hidden neurons in the case D = 8 and D = 50, respectively, and ReLU activations. Further, we used NSF layers with 3 bins and set the tail-bound to 2. [...] The conditioner networks in the NSF layers have 2 hidden layers with 100 hidden neurons in each layer, we set the tail-bounds to 2.5, and each spline uses 3 bins. We apply BatchNorm after each NSF layer. We optimize for 20 000 steps using the Adam optimizer with a learning rate of 1e-4 and a learning rate of 0.01 for the tail indices, and schedule the rates using cosine annealing. (An optimizer sketch follows the table.)
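The "Pseudocode" row refers to Algorithm 1 (Marginal Tail Estimation). As a rough illustration of what such a marginal tail-estimation step can look like, the sketch below applies a Hill estimator to every marginal and flags dimensions whose estimated tail index falls below a cutoff as heavy-tailed; the function names, the cutoff alpha_cutoff, and the order-statistic fraction k_frac are illustrative assumptions, not a line-by-line reproduction of the paper's Algorithm 1.

```python
import numpy as np

def hill_tail_index(x, k_frac=0.05):
    """Hill estimate of the tail index from the k largest absolute values."""
    x = np.sort(np.abs(np.asarray(x, dtype=float)))[::-1]  # descending order statistics
    k = max(int(k_frac * len(x)), 2)
    top = x[:k]
    # Hill estimator: mean log-excess of the top order statistics over the k-th one.
    gamma = np.mean(np.log(top[:-1]) - np.log(top[-1]))
    return 1.0 / gamma  # tail index alpha; smaller values indicate heavier tails

def classify_marginals(data, alpha_cutoff=3.0, k_frac=0.05):
    """Estimate a tail index per marginal and flag heavy-tailed dimensions."""
    tail_indices = np.array([hill_tail_index(data[:, d], k_frac)
                             for d in range(data.shape[1])])
    is_heavy = tail_indices < alpha_cutoff  # illustrative cutoff, not the paper's rule
    return tail_indices, is_heavy

# Toy example: three marginals, the middle one Student-t (heavy-tailed).
rng = np.random.default_rng(0)
samples = np.column_stack([rng.normal(size=10_000),
                           rng.standard_t(df=2.0, size=10_000),
                           rng.normal(size=10_000)])
print(classify_marginals(samples))
```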
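For the "Dataset Splits" row, the quoted synthetic split (15 000 training, 10 000 validation, and 75 000 test samples) could be produced along the lines below; the sampler, its dimensionality, and the degrees of freedom are placeholders, and only the sample counts are taken from the quote.

```python
import numpy as np

rng = np.random.default_rng(42)

def draw_splits(sampler, n_train, n_val, n_test):
    """Draw independent training, validation, and test sets from a sampler."""
    return sampler(n_train), sampler(n_val), sampler(n_test)

# Placeholder target distribution: an 8-dimensional Student-t with 2 degrees of freedom.
train, val, test = draw_splits(lambda n: rng.standard_t(df=2.0, size=(n, 8)),
                               n_train=15_000, n_val=10_000, n_test=75_000)
print(train.shape, val.shape, test.shape)  # (15000, 8) (10000, 8) (75000, 8)
```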
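For the "Experiment Setup" row, the quoted optimization of the weather model (Adam, learning rate 1e-4 for the flow, 0.01 for the tail indices, cosine annealing over 20 000 steps) could be wired up roughly as follows; the flow module, the way the tail indices are exposed as a single parameter tensor, and the training objective are placeholders rather than the repository's actual API.

```python
import torch

# Placeholders standing in for the flow's conditioner networks and the trainable
# tail indices (degrees of freedom) of the heavy-tailed base components.
flow = torch.nn.Sequential(torch.nn.Linear(50, 100), torch.nn.ReLU(), torch.nn.Linear(100, 50))
log_tail_indices = torch.nn.Parameter(torch.zeros(50))

n_steps = 20_000
optimizer = torch.optim.Adam([
    {"params": flow.parameters(), "lr": 1e-4},    # NSF / conditioner parameters
    {"params": [log_tail_indices], "lr": 1e-2},   # tail-index parameters
])
# One scheduler anneals both learning rates jointly over the full run.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=n_steps)

for step in range(n_steps):
    x = torch.randn(256, 50)                      # placeholder batch; the real data is NWP-SAF
    loss = flow(x).pow(2).mean() + log_tail_indices.exp().mean()  # placeholder objective, not the NLL
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```

Keeping the flow weights and the tail indices in two parameter groups of a single Adam instance lets one cosine-annealing scheduler drive both learning rates, which matches the two rates quoted above.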