Preferential Normalizing Flows
Authors: Petrus Mikkola, Luigi Acerbi, Arto Klami
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We introduce a method for eliciting the expert's belief density as a normalizing flow based solely on preferential questions such as comparing or ranking alternatives. This allows eliciting in principle arbitrarily flexible densities, but flow estimation is susceptible to the challenge of collapsing or diverging probability mass that makes it difficult in practice. We tackle this problem by introducing a novel functional prior for the flow, motivated by a decision-theoretic argument, and show empirically that the belief density can be inferred as the function-space maximum a posteriori estimate. We demonstrate our method by eliciting multivariate belief densities of simulated experts, including the prior belief of a general-purpose large language model over a real-world dataset. |
| Researcher Affiliation | Academia | Petrus Mikkola, Luigi Acerbi, Arto Klami; Department of Computer Science, University of Helsinki; first.last@helsinki.fi |
| Pseudocode | Yes | Algorithm 1 (Full algorithm): require preferential data D_full; while not converged do: sample mini-batch D ⊂ D_full; ϕ ← ∇ϕ FS-Posterior(ϕ|D); end while. Algorithm 2 (FS-Posterior(ϕ|D)): require precision s; input: flow parameters ϕ, mini-batch D; X = design matrix of D; X⋆ = winner points of X; loglik = Σ log L(D | fϕ(X), s); logprior = Σ fϕ(X⋆); return loglik + logprior. (A hedged training-loop sketch follows the table.) |
| Open Source Code | Yes | Code for reproducing all experiments is available at https://github.com/petrus-mikkola/prefflow. |
| Open Datasets | Yes | We first fit a flow model to the continuous covariates of the regression data abalone [Nash et al., 1995], and then use the fitted flow as a ground-truth belief density in the elicitation experiment. ... California housing dataset [Pace and Barry, 1997] |
| Dataset Splits | No | The paper describes how it generates data (synthetic or LLM queries) for learning the flow, but it does not specify explicit training/validation/test splits of that data in the conventional sense. |
| Hardware Specification | Yes | Models are trained and evaluated on a server with nodes of two Intel Xeon processors, code name Cascade Lake, with 20 cores each running at 2.1 GHz. |
| Software Dependencies | No | The paper mentions using Real NVP, Neural Spline Flow, PyTorch (via the normflows package), and the Adamax optimizer, but it does not specify version numbers for these software components. |
| Experiment Setup | Yes | In all the experiments, we use the value s = 1 in the preferential likelihood regardless of how misspecified it is with respect to the ground-truth model. Neural Spline Flow models have 2 hidden layers and 128 hidden units; the number of flows is 6, 8, or 10 depending on the problem complexity. Real NVP models have 4 hidden layers and 2 hidden units; the number of flows is 36 when the number of rankings is more than 50, and 8 otherwise. Models are trained for a varying number of iterations from 10^5 to 5 * 10^5 with the Adamax optimizer [Kingma and Ba, 2014] and a batch size varying from 2 to 8. The learning rate varies from 10^-5 to 5 * 10^-5 depending on the problem dimensionality, with higher learning rates for higher-dimensional problems. A small weight decay of 10^-6 was applied. (A hedged normflows architecture sketch follows the table.) |
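The pseudocode row above compresses Algorithms 1 and 2 of the paper into a single cell. The sketch below shows how that loop could look in PyTorch, assuming a flow object that exposes `log_prob` as in the normflows package. The softmax-style preferential likelihood and the winner-point prior term are placeholders standing in for L(D | fϕ(X), s) and the paper's functional prior; `fs_posterior`, `train`, and `dataset.sample_minibatch` are hypothetical names, not the authors' implementation (see the linked repository for the real code).

```python
# Hedged sketch of Algorithms 1-2 (FS-MAP training loop); not the authors' code.
import torch

def fs_posterior(flow, X, winner_idx, s=1.0):
    """Unnormalized function-space posterior for one mini-batch (Algorithm 2 sketch).

    X          : (n_queries, k, d) tensor of the k alternatives in each query
    winner_idx : (n_queries,) long tensor, index of the preferred alternative
    s          : precision of the preferential likelihood
    """
    n, k, d = X.shape
    log_f = flow.log_prob(X.reshape(-1, d)).reshape(n, k)  # log f_phi at all design points
    # Placeholder preferential likelihood: softmax over log-densities scaled by s
    loglik = (s * log_f).log_softmax(dim=1).gather(1, winner_idx[:, None]).sum()
    # Placeholder functional prior evaluated at the winner points X*
    logprior = log_f.gather(1, winner_idx[:, None]).sum()
    return loglik + logprior

def train(flow, dataset, n_iters=100_000, batch_size=4, lr=5e-5, weight_decay=1e-6):
    """Algorithm 1 sketch: stochastic gradient ascent on the FS posterior."""
    opt = torch.optim.Adamax(flow.parameters(), lr=lr, weight_decay=weight_decay)
    for _ in range(n_iters):
        X, winner_idx = dataset.sample_minibatch(batch_size)  # hypothetical data loader
        opt.zero_grad()
        loss = -fs_posterior(flow, X, winner_idx)  # minimize the negative posterior
        loss.backward()
        opt.step()
```

The default hyperparameters mirror the ranges quoted in the Experiment Setup row (Adamax, learning rate around 5e-5, weight decay 1e-6), but they are illustrative rather than the paper's exact per-experiment settings.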
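The Experiment Setup row quotes flow hyperparameters but no pinned library versions. Below is one plausible construction of a Neural Spline Flow with those hyperparameters (2 hidden layers, 128 hidden units, 6 to 10 flow layers) using the normflows package; `build_neural_spline_flow` and its defaults are an assumption for illustration, not code from the paper.

```python
# Hedged sketch of the Neural Spline Flow architecture described above (normflows package).
import normflows as nf

def build_neural_spline_flow(dim, num_flows=8, hidden_layers=2, hidden_units=128):
    """Neural Spline Flow matching the hyperparameters quoted in the Experiment Setup row."""
    flows = []
    for _ in range(num_flows):
        # Rational-quadratic spline layer followed by an invertible linear permutation
        flows.append(nf.flows.AutoregressiveRationalQuadraticSpline(dim, hidden_layers, hidden_units))
        flows.append(nf.flows.LULinearPermute(dim))
    base = nf.distributions.DiagGaussian(dim, trainable=False)
    return nf.NormalizingFlow(q0=base, flows=flows)

# Usage (hypothetical): flow = build_neural_spline_flow(dim=7, num_flows=8),
# then optimize it with the training loop sketched above.
```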