On Contrastive Learning for Likelihood-free Inference

Authors: Conor Durkan, Iain Murray, George Papamakarios

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Each entry below gives a reproducibility variable, its assessed result, and the supporting LLM response.

Research Type: Experimental
In this section we compare SRE and SNPE-C experimentally, and discuss practical details of their implementation. For completeness we also consider Sequential Neural Likelihood (SNL; Papamakarios et al., 2019b), another sequential method for likelihood-free inference which fits a neural density estimator as a surrogate likelihood. We compare likelihood-free methods using a testbed of three simulators, namely the Nonlinear Gaussian simulator with tractable likelihood described by Papamakarios et al. (2019b), along with the Lotka-Volterra predator-prey and M/G/1 queue models whose setups are detailed by Papamakarios & Murray (2016).

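As a concrete illustration of the testbed, below is a minimal sketch of the Nonlinear Gaussian simulator, assuming the five-parameter, four-sample parameterization described by Papamakarios et al. (2019b); the prior and other details are omitted, and this is not the authors' code.

    import numpy as np

    def nonlinear_gaussian_simulator(theta, rng):
        # Mean is given directly by the first two parameters.
        mean = np.array([theta[0], theta[1]])
        # Standard deviations are squared parameters; the correlation
        # is squashed into (-1, 1) by tanh (assumed parameterization).
        s1, s2 = theta[2] ** 2, theta[3] ** 2
        rho = np.tanh(theta[4])
        cov = np.array([[s1 ** 2, rho * s1 * s2],
                        [rho * s1 * s2, s2 ** 2]])
        # Observation x is four i.i.d. 2D draws, flattened to an 8D vector.
        return rng.multivariate_normal(mean, cov, size=4).reshape(-1)

For example, x = nonlinear_gaussian_simulator(theta, np.random.default_rng(0)) produces one simulated observation for a given parameter vector theta.
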
Researcher Affiliation: Collaboration
School of Informatics, University of Edinburgh, United Kingdom; DeepMind, London, United Kingdom. Correspondence to: Conor Durkan <conor.durkan@ed.ac.uk>.

Pseudocode: Yes
Algorithm 1, "(Sequential) Contrastive Likelihood-free Inference".

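Algorithm 1 itself is not reproduced in this report, but the sequential loop it describes can be sketched as follows. Here train_contrastive and mcmc_posterior_sampler are hypothetical placeholders standing in for the classifier fit and the persistent MCMC chain from the paper, not functions from the authors' repository.

    def sequential_contrastive_lfi(prior, simulate, x_observed,
                                   num_rounds, sims_per_round=1000):
        # Hedged sketch of a sequential contrastive likelihood-free
        # inference loop; helpers below are hypothetical.
        dataset = []
        proposal = prior  # round 1 draws parameters from the prior
        model = None
        for _ in range(num_rounds):
            thetas = proposal.sample(sims_per_round)
            dataset += [(theta, simulate(theta)) for theta in thetas]
            # Refit on all simulations aggregated so far using the
            # contrastive (classification-based) objective.
            model = train_contrastive(dataset)
            # The next proposal is the current posterior estimate at the
            # observed data, sampled with MCMC since it is unnormalized.
            proposal = mcmc_posterior_sampler(model, prior, x_observed)
        return model
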
Open Source Code: Yes
All experiments are repeated across 10 random seeds using a single GPU, and code is available at https://github.com/conormdurkan/lfi.

Open Datasets: No
The paper uses data generated by simulators (the Nonlinear Gaussian, Lotka-Volterra, and M/G/1 queue models) described in other cited papers; it does not provide direct access information (link, DOI, or repository) for a pre-existing, publicly available dataset.

Dataset Splits: Yes
To prevent overfitting, we perform early stopping based on a held-out validation set of ten percent of the training data aggregated so far, stopping training when validation performance does not improve over 20 epochs.

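A minimal sketch of that early-stopping rule, assuming hypothetical train_epoch and validation_loss helpers and a PyTorch-style model; whether the authors restore the best-scoring parameters afterwards is not stated and is an assumption here.

    import copy

    def fit_with_early_stopping(model, all_data, train_epoch,
                                validation_loss, patience=20):
        # Hold out ten percent of the data aggregated so far for validation.
        n_val = max(1, len(all_data) // 10)
        val_data, train_data = all_data[:n_val], all_data[n_val:]
        best_loss, best_state, epochs_since_best = float("inf"), None, 0
        while epochs_since_best < patience:
            train_epoch(model, train_data)        # one pass over training data
            loss = validation_loss(model, val_data)
            if loss < best_loss:                  # validation improved
                best_loss = loss
                best_state = copy.deepcopy(model.state_dict())
                epochs_since_best = 0
            else:                                 # no improvement this epoch
                epochs_since_best += 1
        if best_state is not None:                # assumption: restore best weights
            model.load_state_dict(best_state)
        return model
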
Hardware Specification: No
The paper states 'All experiments are repeated across 10 random seeds using a single GPU', but it does not specify the exact model or type of GPU, CPU, or any other hardware component used.

Software Dependencies: No
The paper mentions software components such as MAF, MADEs, and the Adam optimizer, but it does not provide version numbers for any software libraries or dependencies, such as Python, PyTorch, or TensorFlow.

Experiment Setup: Yes
A single MCMC chain persists across rounds for each method, where we perform burn-in of 200 iterations whenever the target distribution changes, and retain every tenth accepted sample. In each round, the parameters of each method are fit using stochastic gradient descent with the Adam optimizer (Kingma & Ba, 2015), a learning rate of 5e-4, and a minibatch size of 100. To prevent overfitting, we perform early stopping based on a held-out validation set of ten percent of the training data aggregated so far, stopping training when validation performance does not improve over 20 epochs. For all tasks we acquire 1000 new simulations per round, running the Nonlinear Gaussian and M/G/1 tasks for 25 rounds, and Lotka-Volterra for 20.
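The stated hyperparameters can be wired up as in the sketch below. The model, train_data, and chain objects (the last a persistent MCMC chain exposing hypothetical set_target and step members) are placeholders, not the authors' implementation; only the numbers are taken from the paper.

    import torch

    # Optimizer and minibatching as stated: Adam, learning rate 5e-4,
    # minibatch size 100.
    optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)
    loader = torch.utils.data.DataLoader(train_data, batch_size=100,
                                         shuffle=True)

    def draw_proposal_samples(chain, new_target, n_samples,
                              burn_in=200, thin=10):
        # The single chain persists across rounds; burn in for 200
        # iterations whenever its target distribution changes, then keep
        # every tenth accepted sample. chain.step() is assumed to return
        # the state after one accepted MCMC transition.
        chain.set_target(new_target)
        for _ in range(burn_in):
            chain.step()
        samples = []
        while len(samples) < n_samples:
            for _ in range(thin):
                state = chain.step()
            samples.append(state)
        return samples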