Automatic variational inference with cascading flows

Authors: Luca Ambrogioni, Gianluigi Silvestri, Marcel van Gerven

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the performance of the new variational programs in a series of structured inference problems. We find that cascading flows have much higher performance than both normalizing flows and ASVI in a large set of structured inference problems.
Researcher Affiliation | Academia | 1 Donders Centre for Cognition, Radboud University, Netherlands. 2 OnePlanet Research Center, imec-the Netherlands, Wageningen, Netherlands.
Pseudocode | Yes | The code for the special case without amortization and backward auxiliary coupling is shown in Figure 5.
Open Source Code | No | We provide an open-source implementation of this algorithm in TensorFlow Probability (Dillon et al., 2017). The code for the special case without amortization and backward auxiliary coupling is shown in Figure 5.
Open Datasets | No | Ground-truth multivariate timeseries x = (x_1, ..., x_T) were sampled from the generative model together with simulated first-half observations y_{1:T/2} = (y_1, ..., y_{T/2}) and second-half observations y_{T/2:T} = (y_{T/2+1}, ..., y_T).
Dataset Splits | No | All the variational models were trained conditioned only on the first-half observations. Performance was assessed using two metrics. The first metric is the average marginal log-probability of the ground truth given the variational posterior... Our second metric is log p(y_{T/2:T} | y_{1:T/2}): the predictive log-probability of the ground-truth observations in the second half of the timeseries given the observations in the first half... (see the metric sketch below the table).
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications) used for running the experiments.
Software Dependencies | No | The paper mentions using “TensorFlow Probability” but does not specify a version number or other software dependencies with version details.
Experiment Setup | Yes | In all experiments, the CF architectures comprised three highway flow blocks with softplus activation functions in each block except for the last, which had linear activations. ... Weights and biases were initialized from centered normal distributions with scale 0.01. The λ variable was defined independently for each network as the logistic sigmoid of a learnable parameter l, which was initialized as 4 in order to keep the variational program close to the input program. ... In each repetition, all the variational programs were re-trained for 8000 iterations (enough to ensure convergence in all methods). (A simplified block sketch follows the table.)
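
The two metrics quoted in the Dataset Splits row can be made concrete with a short sketch. The snippet below is a minimal illustration under simplifying assumptions (Gaussian posterior marginals, known observation noise, a toy random-walk generative model); names such as q_loc and q_scale are hypothetical stand-ins, not the paper's code.

```python
# Minimal sketch of the two evaluation metrics; all model details here
# are hypothetical stand-ins for the trained variational program.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
T = 20

# Toy ground-truth latent timeseries x = (x_1, ..., x_T) and noisy
# observations, split into first and second halves as in the paper.
x_true = np.cumsum(rng.normal(size=T))
y = x_true + rng.normal(scale=0.5, size=T)
y_second = y[T // 2:]

# Stand-in variational posterior: per-timestep Gaussian marginals
# (assumed trained on the first-half observations only).
q_loc, q_scale = y.copy(), np.full(T, 0.7)

# Metric 1: average marginal log-probability of the ground truth
# under the variational posterior marginals.
metric1 = norm.logpdf(x_true, loc=q_loc, scale=q_scale).mean()

# Metric 2: predictive log-probability log p(y_{T/2:T} | y_{1:T/2}),
# estimated by Monte Carlo over posterior samples of the latents.
S = 1000
x_samp = rng.normal(q_loc[T // 2:], q_scale[T // 2:], size=(S, T - T // 2))
log_lik = norm.logpdf(y_second, loc=x_samp, scale=0.5).sum(axis=1)
m = log_lik.max()
metric2 = m + np.log(np.mean(np.exp(log_lik - m)))  # stable log-mean-exp

print(f"avg marginal log-prob: {metric1:.3f}")
print(f"predictive log-prob:   {metric2:.3f}")
```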
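
The Experiment Setup row likewise lends itself to a code-level reading. The sketch below shows the quoted initialization scheme on a deliberately simplified highway block of the form y = λ·x + (1 − λ)·a(Wx + b); this block form is an assumption for illustration, not the paper's Figure 5 implementation (which uses structured highway flow bijectors in TensorFlow Probability).

```python
# Sketch of the quoted setup on a simplified highway block; the form
# y = lam*x + (1-lam)*act(W@x + b) is an illustrative assumption.
import tensorflow as tf

def make_highway_block(dim, activation):
    # Weights and biases from centered normals with scale 0.01.
    w = tf.Variable(tf.random.normal([dim, dim], stddev=0.01))
    b = tf.Variable(tf.random.normal([dim], stddev=0.01))
    # lambda = sigmoid(l), with l initialized to 4 so that lam ~ 0.98
    # and the block stays close to the identity, keeping the
    # variational program close to the input program.
    l = tf.Variable(4.0)

    def block(x):
        lam = tf.sigmoid(l)  # learnable gate, defined per network
        return lam * x + (1.0 - lam) * activation(tf.linalg.matvec(w, x) + b)

    return block

dim = 5
# Three blocks: softplus activations, except the last, which is linear.
blocks = [make_highway_block(dim, tf.nn.softplus),
          make_highway_block(dim, tf.nn.softplus),
          make_highway_block(dim, tf.identity)]

x = tf.random.normal([dim])
for blk in blocks:
    x = blk(x)  # forward pass through the three-block stack
print(x.numpy())
```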