GATSBI: Generative Adversarial Training for Simulation-Based Inference
Authors: Poornima Ramesh, Jan-Matthis Lueckmann, Jan Boelts, Álvaro Tejero-Cantero, David S. Greenberg, Pedro J. Gonçalves, Jakob H. Macke
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate GATSBI on two SBI benchmark problems and on two high-dimensional simulators. On a model for wave propagation on the surface of a shallow water body, we show that GATSBI can return well-calibrated posterior estimates even in high dimensions. On a model of camera optics, it infers a high-dimensional posterior given an implicit prior, and performs better than a state-of-the-art SBI approach. |
| Researcher Affiliation | Academia | Poornima Ramesh (University of Tübingen); Jan-Matthis Lueckmann (University of Tübingen); Jan Boelts (TU Munich); Álvaro Tejero-Cantero (University of Tübingen); David S. Greenberg (Helmholtz Centre Hereon); Pedro J. Gonçalves (University of Tübingen); Jakob H. Macke (University of Tübingen) |
| Pseudocode | Yes | Appendix B, Training Algorithms: "Algorithm 1 GATSBI. Input: prior π(θ), simulator p(x\|θ), generator f_φ, discriminator D_ψ, learning rate λ. Output: trained GAN networks f_φ and D_ψ" |
| Open Source Code | Yes | Code implementing the method and experiments described in the manuscript is available at https://github.com/mackelab/gatsbi. |
| Open Datasets | Yes | We chose the EMNIST dataset (Cohen et al., 2017) with 800k 28×28-dimensional images as the implicit prior. |
| Dataset Splits | Yes | For each simulation budget, 100 samples were held out for validation. |
| Hardware Specification | Yes | We ran the high-dimensional experiments (camera model and shallow water model) on Tesla V100 GPUs: the shallow water model required training to be parallelised across 2 GPUs at a time and took about 4 days to converge, while the camera model took about 1.5 days on one Tesla V100. We used RTX 2080 Tis for the benchmark problems: the amortised GATSBI runs lasted a maximum of 1.5 days for the 100k budget; the sequential GATSBI runs took longer, with the maximum being 8 days for the energy-based correction with a budget of 100k. |
| Software Dependencies | No | The paper mentions software like PyTorch (Paszke et al., 2019), Weights and Biases (Biewald, 2020), Adam optimiser (Kingma and Ba, 2015), scipy fft2 package (Virtanen et al., 2020), Fortran (F90), and scikit-learn (Pedregosa et al., 2011), but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | The generator and discriminator were trained in parallel for 1k, 10k and 100k simulations, with batch size = min(10% of the simulation budget, 1000). For each simulation budget, 100 samples were held out for validation. We used 10 discriminator updates per generator update for the 1k and 10k simulation budgets, and 100 for the 100k budget. Note that the increase in discriminator updates for 100k simulations is intended to compensate for the reduced relative batch size: a batch of 1000 is only 1% of the 100k budget. The networks were optimised with the cross-entropy loss. We used the Adam optimiser (Kingma and Ba, 2015) with learning rate = 0.0001, β1 = 0.9 and β2 = 0.99 for both networks. We trained the networks for 10k, 20k and 20k epochs for the three simulation budgets respectively. To ensure stable training, we used spectral normalisation (Miyato et al., 2018) for the discriminator network weights. |
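The training scheme quoted above (alternating discriminator/generator updates, Adam with lr = 1e-4 and β = (0.9, 0.99), cross-entropy loss, spectral normalisation on the discriminator) can be sketched in PyTorch. This is a minimal toy illustration, not the authors' implementation: the network sizes, the linear-Gaussian `simulator`, and all dimensions are made-up placeholders, and only a handful of epochs are run.

```python
import torch
import torch.nn as nn

# Illustrative dimensions (hypothetical, not from the paper's code)
THETA_DIM, X_DIM, NOISE_DIM, BATCH = 2, 3, 2, 100

def simulator(theta):
    # Toy stand-in for p(x|theta): linear map plus Gaussian noise
    A = torch.ones(THETA_DIM, X_DIM)
    return theta @ A + 0.1 * torch.randn(theta.shape[0], X_DIM)

# Conditional generator f_phi(z, x) -> posterior sample theta
generator = nn.Sequential(nn.Linear(NOISE_DIM + X_DIM, 32), nn.ReLU(),
                          nn.Linear(32, THETA_DIM))

# Discriminator D_psi(theta, x) with spectral normalisation, as reported
discriminator = nn.Sequential(
    nn.utils.spectral_norm(nn.Linear(THETA_DIM + X_DIM, 32)), nn.ReLU(),
    nn.utils.spectral_norm(nn.Linear(32, 1)))

# Adam with the reported hyperparameters for both networks
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-4, betas=(0.9, 0.99))
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-4, betas=(0.9, 0.99))
bce = nn.BCEWithLogitsLoss()  # cross-entropy loss

D_UPDATES_PER_G = 10  # 10 for the 1k/10k budgets, 100 for 100k

for epoch in range(5):  # the paper trains for 10k-20k epochs
    for _ in range(D_UPDATES_PER_G):
        theta = torch.randn(BATCH, THETA_DIM)   # prior pi(theta)
        x = simulator(theta)
        z = torch.randn(BATCH, NOISE_DIM)
        theta_fake = generator(torch.cat([z, x], dim=1)).detach()
        d_real = discriminator(torch.cat([theta, x], dim=1))
        d_fake = discriminator(torch.cat([theta_fake, x], dim=1))
        d_loss = (bce(d_real, torch.ones_like(d_real))
                  + bce(d_fake, torch.zeros_like(d_fake)))
        d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # One generator update per D_UPDATES_PER_G discriminator updates
    theta = torch.randn(BATCH, THETA_DIM)
    x = simulator(theta)
    z = torch.randn(BATCH, NOISE_DIM)
    theta_fake = generator(torch.cat([z, x], dim=1))
    d_fake = discriminator(torch.cat([theta_fake, x], dim=1))
    g_loss = bce(d_fake, torch.ones_like(d_fake))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```

The sketch mirrors the update ratio described in the table (many discriminator steps per generator step) but omits details such as the validation hold-out, learning-rate scheduling, and the paper's actual architectures.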