Active preference learning for ordering items in- and out-of-sample
Authors: Herman Bergström, Emil Carlsson, Devdatt Dubhashi, Fredrik D. Johansson
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate GURO (Algorithm 1) and GURO Hybrid (see Section 5.1) in four image ordering tasks, one with logistic (synthetic) preference feedback, and three tasks based on real-world feedback from human annotators. [...] Our results demonstrate superior sample efficiency and generalization compared to non-contextual ranking approaches and active preference learning baselines. |
| Researcher Affiliation | Collaboration | Herman Bergström (Chalmers University of Technology and University of Gothenburg, hermanb@chalmers.se); Emil Carlsson (Sleep Cycle AB, Chalmers University of Technology and University of Gothenburg); Devdatt Dubhashi (Chalmers University of Technology and University of Gothenburg); Fredrik D. Johansson (Chalmers University of Technology and University of Gothenburg) |
| Pseudocode | Yes | Algorithm 1 Greedy Uncertainty Reduction for Ordering (GURO), [Bayes GURO] [...] Algorithm 2 Uniform sampling algorithm [...] Algorithm 3 BALD bandit |
| Open Source Code | Yes | Our code is available at: https://github.com/Healthy-AI/GURO |
| Open Datasets | Yes | ImageClarity Data available at https://dbgroup.cs.tsinghua.edu.cn/ligl/crowdtopk. [...] WiscAds Data available at https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/0ZRGEE (license: CC0 1.0). [...] IMDB-WIKI-SbS Data available at https://github.com/Toloka/IMDB-WIKI-SbS (license: CC BY). [...] X-ray Age Prediction Challenge (Felipe Kitamura, 2023) |
| Dataset Splits | Yes | Next, we split these into two sets, with one (ID) containing the youngest 50% and the other (IE) the oldest 50%. [...] For every seed, 10% of comparisons were used for the holdout set. |
| Hardware Specification | Yes | The longest trajectory (single seed) for any algorithm took less than 35hrs to complete on one core of an Intel Xeon Gold 6130 CPU and required at most 10 GB of memory. |
| Software Dependencies | No | GURO, CoLSTIM, and Uniform use Logistic Regression from Scikit-learn (Pedregosa et al., 2011) with default Ridge regularization (C = 1) and the lbfgs optimizer. |
| Experiment Setup | Yes | For Bayes GURO and BALD, the posterior $p(\theta \mid D_t)$ is estimated using the Laplace approximation as described in Bishop and Nasrabadi (2006, Chapter 4). [...] For both methods, the priors $\theta_{B,0} = 0_d$ and $H^{-1}_{B,0} = I_d$ were used, and sequential updates were performed every iteration. [...] For Bayes GURO, 50 posterior samples were used to estimate $\hat{V}_{\theta \mid D_t}[\sigma(\theta^\top z_{ij})]$ for every $z_{ij}$. [...] GURO, CoLSTIM, and Uniform use Logistic Regression from Scikit-learn (Pedregosa et al., 2011) with default Ridge regularization (C = 1) and the lbfgs optimizer. |
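
The Experiment Setup row can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes that each comparison is encoded as a difference vector z_ij = x_i - x_j, and that a Gaussian (for example Laplace) posterior over θ is given by a mean and covariance. The helper names `fit_preference_model` and `sampled_variance`, the toy data, and `theta_true` are hypothetical and introduced only for illustration. Only C = 1, the lbfgs solver, the 50 posterior samples, and the zero-mean, identity-covariance prior come from the quoted text.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def sigmoid(x):
    """Logistic function sigma(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))


def fit_preference_model(Z, y):
    """Fit logistic regression on comparison vectors Z (n, d) with labels y.

    C=1.0 and solver="lbfgs" are the settings quoted in the table (they are
    also scikit-learn's defaults); all other options are left at their
    defaults, which is an assumption of this sketch.
    """
    return LogisticRegression(C=1.0, solver="lbfgs").fit(Z, y)


def sampled_variance(theta_mean, theta_cov, Z, n_samples=50, seed=0):
    """Monte Carlo estimate of Var[sigma(theta^T z)] for every row z of Z,
    where theta ~ N(theta_mean, theta_cov)."""
    rng = np.random.default_rng(seed)
    thetas = rng.multivariate_normal(theta_mean, theta_cov, size=n_samples)
    probs = sigmoid(thetas @ Z.T)  # shape (n_samples, n_pairs)
    return probs.var(axis=0)       # one variance estimate per pair


# Hypothetical toy data: 6 items with 3 features each (not from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
pairs = [(i, j) for i in range(6) for j in range(6) if i != j]
Z = np.array([X[i] - X[j] for i, j in pairs])  # assumed encoding z_ij = x_i - x_j

# Noise-free labels from a hypothetical ground-truth parameter; using both
# orderings of every pair guarantees that both classes are present.
theta_true = np.array([1.0, -2.0, 0.5])
y = (Z @ theta_true > 0).astype(int)

model = fit_preference_model(Z, y)

# Variance of the predicted preference under the quoted prior
# (mean 0_d, covariance I_d), estimated from 50 samples.
variances = sampled_variance(np.zeros(3), np.eye(3), Z, n_samples=50)
```

In an active-learning loop, per-pair variances like these could be used to rank candidate comparisons. The exact selection rule is defined by Algorithm 1 in the paper and is not reproduced here.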