Neural Approximate Sufficient Statistics for Implicit Models
Authors: Yanzhi Chen, Dinghuai Zhang, Michael U. Gutmann, Aaron Courville, Zhanxing Zhu
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply our approach to both traditional approximate Bayesian computation and recent neural likelihood methods, boosting their performance on a range of tasks. ... Baselines. We apply the proposed statistics to two aforementioned likelihood-free inference methods: (i) SMC-ABC (Beaumont et al., 2009) and (ii) SNL (Papamakarios et al., 2019). We compare the performance of the algorithms augmented with our neural statistics (dubbed SMC-ABC+ and SNL+) to their original versions as well as the versions based on expert-designed statistics (details presented later; we call the corresponding methods SMC-ABC and SNL). We also compare to the sequential neural posterior estimate (SNPE) method, which needs no statistic design, as well as the sequential ratio estimate (SRE) method (Hermans et al., 2020), which is closely related to our MI-based method. All methods are run for 10 rounds with 1,000 simulations each. |
| Researcher Affiliation | Academia | Yanzhi Chen1, Dinghuai Zhang2, Michael U. Gutmann1, Aaron Courville2, Zhanxing Zhu3; 1The University of Edinburgh, 2MILA, 3Beijing Institute of Big Data Research |
| Pseudocode | Yes | Algorithm 1 SMC-ABC+... Algorithm 2 SNL+ |
| Open Source Code | Yes | Codes available at: https://github.com/cyz-ai/neural-approx-ss-lfi. |
| Open Datasets | No | The paper describes generating data from models (Ising, Gaussian copula, OU process) for experiments rather than using or providing access to pre-existing publicly available datasets. |
| Dataset Splits | Yes | We take 20% of the data for validation, and stop training if the validation error does not improve after 100 epochs. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'Adam' for optimization and 'Masked Autoregressive Flow (MAF)' as a density estimator, but does not provide specific version numbers for software libraries or dependencies (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | For the statistic network S in our method (for both JSD and DC estimators), we adopt a D-100-100-d fully-connected architecture with D being the dimensionality of the input data and d the dimensionality of the statistic. For the network H used to extract the representation of θ, we adopt a K-100-100-K fully-connected architecture with K being the dimensionality of the model parameters θ. For the critic network, we adopt a (d + K)-100-1 fully-connected architecture. ReLU is adopted as the non-linearity in all networks. All these neural networks are trained with Adam (Kingma & Ba, 2014) with a learning rate of 1 × 10⁻⁴ and a batch size of 200. No weight decay is applied. We take 20% of the data for validation, and stop training if the validation error does not improve after 100 epochs. ... For the neural density estimator in SNL/SNPE, which is realized by a Masked Autoregressive Flow (MAF) (Papamakarios et al., 2017), we adopt 5 autoregressive layers, each of which has two hidden layers with 50 tanh units. These are the same settings as in SNL. The MAF is trained with Adam with a learning rate of 5 × 10⁻⁴, a batch size of 500, and a slight weight decay (1 × 10⁻⁴). |
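
The quoted setup amounts to a compact training recipe for the three fully-connected networks. The sketch below shows how that recipe could be assembled, assuming a PyTorch implementation; the names `statistic_net`, `param_net`, `critic_net`, and the placeholder dimensions `D`, `K`, `d` are illustrative assumptions, not the authors' released code, and the MAF density estimator used in SNL/SNPE is omitted.

```python
import torch
import torch.nn as nn

# Placeholder dimensions (assumptions; set to match the simulator at hand):
# D = data dimensionality, K = parameter dimensionality, d = statistic dimensionality.
D, K, d = 100, 5, 5

# D-100-100-d fully-connected statistic network S(x), ReLU non-linearities.
statistic_net = nn.Sequential(
    nn.Linear(D, 100), nn.ReLU(),
    nn.Linear(100, 100), nn.ReLU(),
    nn.Linear(100, d),
)

# K-100-100-K fully-connected network H(theta) for the parameter representation.
param_net = nn.Sequential(
    nn.Linear(K, 100), nn.ReLU(),
    nn.Linear(100, 100), nn.ReLU(),
    nn.Linear(100, K),
)

# (d + K)-100-1 fully-connected critic, fed the concatenation [S(x), H(theta)].
critic_net = nn.Sequential(
    nn.Linear(d + K, 100), nn.ReLU(),
    nn.Linear(100, 1),
)

# Adam with learning rate 1e-4, no weight decay, batch size 200 (as quoted above).
params = (
    list(statistic_net.parameters())
    + list(param_net.parameters())
    + list(critic_net.parameters())
)
optimizer = torch.optim.Adam(params, lr=1e-4, weight_decay=0.0)
batch_size = 200

# Early stopping as described: hold out 20% of the simulated data for validation
# and stop once the validation loss has not improved for 100 consecutive epochs.
patience = 100
```

The critic here is only a structural sketch; the actual training objective (JSD or distance-correlation mutual-information estimator) is defined in the paper and its repository, and would drive the loss computed from `critic_net`'s output.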