Optimal Preconditioning and Fisher Adaptive Langevin Sampling

Authors: Michalis Titsias

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In several experiments we show that the proposed algorithm significantly outperforms all other methods. We compute effective sample size (ESS) scores for each method by using the 2 × 10^4 samples from the collection phase. We estimate ESS across each dimension of the state vector x, and we report maximum, median and minimum values, by using the built-in method in the TensorFlow Probability Python package. Also, we show visualizations that indicate sampling efficiency or effectiveness in estimating the preconditioner (when the ground truth preconditioner is known). (See the ESS sketch after the table.)
Researcher Affiliation | Industry | Michalis K. Titsias, Google DeepMind, mtitsias@google.com
Pseudocode | Yes | Algorithm 1 Fisher adaptive MALA (blue lines are omitted when not adapting (R, σ²)). (See the proposal-step sketch after the table.)
Open Source Code | No | The paper does not provide an explicit statement or a link indicating the availability of source code for the described methodology.
Open Datasets | Yes | We consider three examples of multivariate Gaussian targets of the form π(x) = N(x|µ, Σ), where the optimal preconditioner (up to any positive scaling) is the covariance matrix Σ since the inverse Fisher is I^{-1} = Σ. We consider six binary classification datasets (Australian Credit, Heart, Pima Indian, Ripley, German Credit and Caravan) with a number of data ranging from n = 250 to n = 5822 and dimensionality of θ ranging from 3 to 87. We also consider a much higher 785-dimensional example on MNIST for classifying the digits 5 and 6, that has 11339 training examples. (See the inverse-Fisher check after the table.)
Dataset Splits | No | The paper describes 'burn-in iterations' and 'collecting samples' phases for the MCMC chain, but it does not specify explicit training/validation/test dataset splits for the data used in the models (e.g., percentages or sample counts for different data subsets).
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions the 'TensorFlow Probability Python package' but does not specify a version number. No other specific software components with version numbers are provided.
Experiment Setup | Yes | For all experiments and samplers we consider 2 × 10^4 burn-in iterations and 2 × 10^4 iterations for collecting samples. We set λ = 10 in Fisher MALA and Ada MALA. Adaptation of the proposal distributions, i.e. the parameter σ², the preconditioning or the step size of HMC, occurs only during burn-in, and at the collection-of-samples stage the proposal parameters are kept fixed. For all three MALA schemes the global step size σ² is adapted to achieve an acceptance rate around 0.574 (see Algorithm 1), while the corresponding parameter for HMC is adapted towards a 0.651 rate [9]. (See the step-size adaptation sketch after the table.)
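
ESS sketch. The ESS figures referenced in the Research Type row are per-dimension estimates (maximum, median and minimum) over the 2 × 10^4 collected samples, computed with the built-in estimator in TensorFlow Probability. The following is a minimal sketch of that computation; the chain contents, shapes and variable names are illustrative assumptions, not the paper's code.

    import numpy as np
    import tensorflow as tf
    import tensorflow_probability as tfp

    # Illustrative placeholder chain: 2 x 10^4 collected samples of a
    # d-dimensional state vector x, shaped [num_samples, d].
    num_samples, d = 20_000, 10
    chain = tf.constant(np.random.randn(num_samples, d), dtype=tf.float32)

    # ESS estimated independently for each of the d coordinates.
    ess = tfp.mcmc.effective_sample_size(chain).numpy()  # shape [d]

    print("ESS max / median / min:", ess.max(), np.median(ess), ess.min())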
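
Proposal-step sketch. Algorithm 1 itself is not reproduced here. The Pseudocode row refers to a preconditioned MALA update of the general form x' = x + (σ²/2) A ∇log π(x) + σ A^{1/2} ε with preconditioner A. The code below is a hedged sketch of one Metropolis-adjusted step of that generic form with a fixed, user-supplied preconditioner; it deliberately omits the paper's adaptation of R and σ², and the names precond_mala_step, log_prob and grad_log_prob are illustrative assumptions.

    import numpy as np

    def precond_mala_step(x, log_prob, grad_log_prob, A, sigma2, rng):
        # Proposal: x' = x + (sigma^2 / 2) A grad + sigma A^{1/2} eps,
        # where A is an assumed symmetric positive-definite preconditioner.
        L = np.linalg.cholesky(A)                       # A^{1/2} via Cholesky
        mean_x = x + 0.5 * sigma2 * A @ grad_log_prob(x)
        x_new = mean_x + np.sqrt(sigma2) * L @ rng.standard_normal(x.shape)

        # Gaussian proposal log-densities (normalizing constants cancel).
        A_inv = np.linalg.inv(A)
        def log_q(z, mean):
            diff = z - mean
            return -0.5 / sigma2 * diff @ A_inv @ diff

        mean_new = x_new + 0.5 * sigma2 * A @ grad_log_prob(x_new)
        log_alpha = (log_prob(x_new) - log_prob(x)
                     + log_q(x, mean_new) - log_q(x_new, mean_x))

        accept = np.log(rng.uniform()) < log_alpha
        return (x_new if accept else x), accept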
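
Inverse-Fisher check. The Open Datasets row states that for a Gaussian target π(x) = N(x|µ, Σ) the inverse Fisher equals Σ: since ∇log π(x) = -Σ^{-1}(x - µ), the Fisher matrix is E[∇log π(x) ∇log π(x)^T] = Σ^{-1} Σ Σ^{-1} = Σ^{-1}, hence I^{-1} = Σ. The snippet below is a small Monte Carlo sanity check of this identity; the dimension and sample count are arbitrary assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    d = 5
    B = rng.standard_normal((d, d))
    Sigma = B @ B.T + d * np.eye(d)            # arbitrary SPD covariance
    Sigma_inv = np.linalg.inv(Sigma)
    mu = rng.standard_normal(d)

    # Monte Carlo estimate of the Fisher matrix E[score score^T] under pi.
    x = rng.multivariate_normal(mu, Sigma, size=200_000)
    scores = -(x - mu) @ Sigma_inv             # rows: grad log pi at each sample
    fisher = scores.T @ scores / x.shape[0]

    # The inverse of the estimated Fisher matrix should be close to Sigma.
    print(np.max(np.abs(np.linalg.inv(fisher) - Sigma)))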
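
Step-size adaptation sketch. The 0.574 (MALA) and 0.651 (HMC) values quoted in the Experiment Setup row are target acceptance rates pursued during burn-in. A standard way to drive a global step size toward such a target is a Robbins-Monro update on log σ²; the sketch below shows that generic recipe under an assumed decaying-gain schedule and is not necessarily the exact rule used in Algorithm 1.

    import numpy as np

    def adapt_log_step_size(log_sigma2, accept_prob, iteration,
                            target_rate=0.574, c0=1.0, kappa=0.6):
        # Robbins-Monro update with a decaying gain (assumed schedule):
        # raise sigma^2 when the chain accepts more often than the target,
        # lower it when it accepts less often.
        gain = c0 / (iteration + 1) ** kappa
        return log_sigma2 + gain * (accept_prob - target_rate)

    # Example usage inside a burn-in loop, per iteration t:
    #   log_sigma2 = adapt_log_step_size(log_sigma2, alpha_t, t)
    #   sigma2 = np.exp(log_sigma2)   # step size for the next proposal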