reproducibilityindex.ai

Active preference learning for ordering items in- and out-of-sample

Authors: Herman Bergström, Emil Carlsson, Devdatt Dubhashi, Fredrik D. Johansson

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate GURO (Algorithm 1) and GURO Hybrid (see Section 5.1) in four image ordering tasks, one with logistic (synthetic) preference feedback, and three tasks based on real-world feedback from human annotators. [...] Our results demonstrate superior sample efficiency and generalization compared to non-contextual ranking approaches and active preference learning baselines.
Researcher Affiliation	Collaboration	Herman Bergström (Chalmers University of Technology and University of Gothenburg, hermanb@chalmers.se); Emil Carlsson (Sleep Cycle AB; Chalmers University of Technology and University of Gothenburg); Devdatt Dubhashi (Chalmers University of Technology and University of Gothenburg); Fredrik D. Johansson (Chalmers University of Technology and University of Gothenburg)
Pseudocode	Yes	Algorithm 1 Greedy Uncertainty Reduction for Ordering (GURO), [Bayes GURO] [...] Algorithm 2 Uniform sampling algorithm [...] Algorithm 3 BALD bandit
Open Source Code	Yes	Our code is available at: https://github.com/Healthy-AI/GURO
Open Datasets	Yes	Image Clarity Data available at https://dbgroup.cs.tsinghua.edu.cn/ligl/crowdtopk. [...] WiscAds Data available at https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/0ZRGEE (license: CC0 1.0). [...] IMDB-WIKI-SbS Data available at https://github.com/Toloka/IMDB-WIKI-SbS (license: CC BY). [...] X-ray Age Prediction Challenge (Felipe Kitamura, 2023)
Dataset Splits	Yes	Next, we split these into two sets, with one (ID) containing the youngest 50% and the other (IE) the oldest 50%. [...] For every seed, 10% of comparisons were used for the holdout set.
Hardware Specification	Yes	The longest trajectory (single seed) for any algorithm took less than 35hrs to complete on one core of an Intel Xeon Gold 6130 CPU and required at most 10 GB of memory.
Software Dependencies	No	GURO, CoLSTIM, and Uniform use Logistic Regression from Scikit-learn (Pedregosa et al., 2011) with default Ridge regularization (C = 1) and the lbfgs optimizer.
Experiment Setup	Yes	For Bayes GURO and BALD, the posterior $p(\theta \mid D_t)$ is estimated using the Laplace approximation as described in Bishop and Nasrabadi (2006, Chapter 4). [...] For both methods, the priors $\theta_{B,0} = 0_d$ and $H_{B,0}^{-1} = I_d$ were used, and sequential updates were performed every iteration. [...] For Bayes GURO, 50 posterior samples were used to estimate $\hat{V}_{\theta \mid D_t}[\sigma(\theta^\top z_{ij})]$ for every $z_{ij}$. [...] GURO, CoLSTIM, and Uniform use Logistic Regression from Scikit-learn (Pedregosa et al., 2011) with default Ridge regularization (C = 1) and the lbfgs optimizer.
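
The setup quoted in the last table row can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes synthetic item features, pairwise difference vectors z_ij = x_i - x_j as model inputs, and a batch Laplace approximation rather than the sequential updates described in the paper. The function names, the synthetic data, and the choice of `fit_intercept=False` are all hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def fit_preference_model(Z, y):
    """Fit L2-regularized logistic regression (C = 1, lbfgs) on difference vectors Z."""
    # Assumption: no intercept, so the model is symmetric in the two compared items.
    model = LogisticRegression(C=1.0, solver="lbfgs", fit_intercept=False)
    model.fit(Z, y)
    return model.coef_.ravel()


def laplace_covariance(theta_hat, Z, prior_precision):
    """Covariance of a Gaussian (Laplace) approximation centered at theta_hat.

    The Hessian of the negative log posterior for logistic regression is
    prior_precision + sum_k p_k (1 - p_k) z_k z_k^T.
    """
    p = sigmoid(Z @ theta_hat)
    hessian = prior_precision + (Z * (p * (1.0 - p))[:, None]).T @ Z
    cov = np.linalg.inv(hessian)
    return (cov + cov.T) / 2.0  # symmetrize against round-off


def sampled_variance(theta_hat, cov, Z_candidates, n_samples=50, seed=None):
    """Monte Carlo estimate of Var[sigmoid(theta^T z)] for each candidate z."""
    rng = np.random.default_rng(seed)
    thetas = rng.multivariate_normal(theta_hat, cov, size=n_samples)
    probs = sigmoid(thetas @ Z_candidates.T)  # shape (n_samples, n_candidates)
    return probs.var(axis=0)


# Synthetic demonstration data (purely illustrative).
rng = np.random.default_rng(0)
d, n_items = 5, 20
X = rng.normal(size=(n_items, d))
theta_true = rng.normal(size=d)
pairs = rng.integers(0, n_items, size=(200, 2))
pairs = pairs[pairs[:, 0] != pairs[:, 1]]  # drop self-comparisons
Z = X[pairs[:, 0]] - X[pairs[:, 1]]
y = (rng.random(len(Z)) < sigmoid(Z @ theta_true)).astype(int)

theta_hat = fit_preference_model(Z, y)
cov = laplace_covariance(theta_hat, Z, prior_precision=np.eye(d))
variances = sampled_variance(theta_hat, cov, Z, n_samples=50, seed=1)
print(variances.shape)
```

With scikit-learn's objective, C = 1 and no intercept correspond to a zero-mean Gaussian prior with identity covariance, which is why the sketch passes `np.eye(d)` as the prior precision.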