Liouville Flow Importance Sampler

Authors: Yifeng Tian, Nishant Panda, Yen Ting Lin

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the effectiveness of LFIS through its application to a range of benchmark problems, on many of which LFIS achieved state-of-the-art performance.
Researcher Affiliation | Academia | Information Sciences Group (CCS-3), Computational and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA. Correspondence to: Yifeng Tian <yifengtian@lanl.gov>, Yen Ting Lin <yentingl@lanl.gov>.
Pseudocode | Yes | Algorithm 1 provides a more detailed description of the implementation of LFIS.
Open Source Code | Yes | The code for LFIS and the results of numerical experiments have been deposited at https://github.com/lanl/LFIS.
Open Datasets | Yes | Log Gaussian Cox Process (type-2): ... modeling the positions of Finland pine saplings (Møller et al., 1998). Latent space of Variational Autoencoder (type-2): In this experiment, we investigate sampling in the latent space of a pre-trained Variational Autoencoder (VAE) on the binary MNIST dataset.
Dataset Splits | No | The paper discusses training criteria and uses a large number of samples for estimation and smaller batches for gradient descent, but it does not specify explicit validation dataset splits (e.g., percentages or counts for a distinct validation set) or a cross-validation setup for hyperparameter tuning.
Hardware Specification | Yes | All the experiments are performed using a single NVIDIA A100 GPU with 40GB of RAM.
Software Dependencies | No | The paper mentions PyTorch and JAX but does not provide specific version numbers for these software dependencies.
Experiment Setup | Yes | For all the numerical experiments, we use the Adam optimizer with an initial learning rate of 5 × 10⁻³. We employ an optimizer schedule that reduces the learning rate by 50% after 200 epochs without any improvement in the loss. At each discrete time step, we used a separate feed-forward NN with a structure similar to those in Vargas et al. (2023a) and Zhang & Chen (2022) (two hidden layers, each with 64 nodes) to model the discrete-time velocity field. Except for the first (t = 0) NN, which was initialized randomly, we instantiated the NN at t = k/T using the weights of the trained NN at the previous time t = (k − 1)/T to amortize the training cost. We initialized the weights of the last layer of the NN to zero, which was observed to expedite the training process empirically.
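The Experiment Setup row pins down most of the reported training configuration. Below is a minimal PyTorch sketch of that setup, assuming standard torch.optim components; the class and function names (VelocityNet, train_all_steps), the activation function, the loss, and the epoch count are illustrative assumptions, not the authors' code (the released implementation is at https://github.com/lanl/LFIS).

```python
# Hypothetical sketch of the quoted setup: Adam at lr 5e-3, halve the LR after
# 200 epochs without loss improvement, one two-hidden-layer (64-node) MLP per
# discrete time step, zero-initialized last layer, and warm-starting each
# step's network from the previously trained one.
import torch
import torch.nn as nn

class VelocityNet(nn.Module):  # illustrative name, not from the paper
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, 64), nn.ReLU(),   # activation choice is an assumption
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, dim),
        )
        # Zero-initialize the last layer, reported to expedite training.
        nn.init.zeros_(self.net[-1].weight)
        nn.init.zeros_(self.net[-1].bias)

    def forward(self, x):
        return self.net(x)

def train_all_steps(dim, T, loss_fn, epochs_per_step):
    """Train one velocity-field NN per discrete time t = k/T."""
    nets, prev_state = [], None
    for k in range(T + 1):
        net = VelocityNet(dim)               # t = 0: random init + zeroed last layer
        if prev_state is not None:
            net.load_state_dict(prev_state)  # warm-start from the trained step k-1
        opt = torch.optim.Adam(net.parameters(), lr=5e-3)
        # Reduce the LR to 50% after 200 epochs with no improvement in the loss.
        sched = torch.optim.lr_scheduler.ReduceLROnPlateau(
            opt, factor=0.5, patience=200)
        for _ in range(epochs_per_step):
            opt.zero_grad()
            loss = loss_fn(net, k)           # placeholder for the LFIS loss at t = k/T
            loss.backward()
            opt.step()
            sched.step(loss)                 # plateau scheduler monitors the loss
        prev_state = net.state_dict()
        nets.append(net)
    return nets
```

The plateau scheduler and the per-step warm start mirror the quoted description; the loss function and the number of epochs per step are not specified in this excerpt and are left as placeholders.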