Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Collective Intelligence in Decision-Making with Non-Stationary Experts
Authors: Axel Abels, Vito Trianni, Ann Nowé, Tom Lenaerts
JAIR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct an extensive empirical evaluation of our novel method and compare it to a range of baselines in Section 6.1. Through this evaluation we demonstrate that our proposed method provides a significant improvement in performance over previous adaptive algorithms for a wide variety of configurations. We further show that unlike previous algorithms, which require their adaptiveness to be tuned to match changes in expertise, our novel approach is more robust in terms of its adaptiveness parameter. We conclude by applying our methods to active learning in Section 6.2, demonstrating improved performance on a concrete real-world problem. |
| Researcher Affiliation | Academia | AXEL ABELS, Machine Learning Group, Université Libre de Bruxelles, Belgium; AI Lab, Vrije Universiteit Brussel, Belgium; and FARI Institute, Université Libre de Bruxelles-Vrije Universiteit Brussel, Belgium. VITO TRIANNI, Institute of Cognitive Sciences and Technologies, National Research Council, Italy. ANN NOWÉ, AI Lab, Vrije Universiteit Brussel, Belgium; and FARI Institute, Université Libre de Bruxelles-Vrije Universiteit Brussel, Belgium. TOM LENAERTS, Machine Learning Group, Université Libre de Bruxelles, Belgium; AI Lab, Vrije Universiteit Brussel, Belgium; Center for Human-Compatible AI, UC Berkeley, USA; and FARI Institute, Université Libre de Bruxelles-Vrije Universiteit Brussel, Belgium. |
| Pseudocode | Yes | Algorithm 1 CORVAL |
| Open Source Code | Yes | Code to reproduce these results is available at https://github.com/axelabels/CDM_NONSTAT. |
| Open Datasets | Yes | Following previous works [30, 12, 23, 22], we evaluate performance on 20 data sets which have previously been used to evaluate active learning approaches, namely bank-marketing, calhousing, cod-rna, credit-g, diabetes, eeg-eye-state, electricity, ibn-sina, ijcnn1, kc2, kdd99_10perc, magic Telescope, mozilla4, musk, ozone-level-8hr, qsar-biodeg, steel-plates-fault, svmguide3, tic-tac-toe, and zebra. |
| Dataset Splits | Yes | For each data set, we set aside 1/3rd of the data points as a test set and run the active learning set-up for 100 steps on the remaining data. |
| Hardware Specification | No | The resources and services used in this work were provided by the VSC (Flemish Supercomputer Center), funded by the Research Foundation Flanders (FWO) and the Flemish Government. |
| Software Dependencies | No | Following Chu and Lin 2016, we train a Logistic Regression classifier [21]. |
| Experiment Setup | Yes | We evaluate the performance of the chosen algorithms in terms of reward over T = 5000 steps, averaged over 200 simulations. For a given period (τ ∈ {100, 500, 2500}), we generate non-stationary expertise by averaging 100 randomly sampled sine waves, each with period τ̂ ∼ N(τ, τ/2). For each data set, we set aside 1/3rd of the data points as a test set and run the active learning set-up for 100 steps on the remaining data. |
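The non-stationary expertise signal described in the experiment setup (an average of 100 sine waves with periods drawn from N(τ, τ/2)) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the random phases, the clipping of non-positive sampled periods, and the amplitude scaling are all assumptions.

```python
import numpy as np

def nonstationary_expertise(tau, T=5000, n_waves=100, seed=None):
    """Sketch: average of `n_waves` sine waves with periods ~ N(tau, tau/2).

    Phases are randomized and non-positive sampled periods are clipped
    to 1 (both assumptions; the paper does not specify these details).
    """
    rng = np.random.default_rng(seed)
    t = np.arange(T)
    periods = np.clip(rng.normal(tau, tau / 2, size=n_waves), 1.0, None)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=n_waves)
    waves = np.sin(2.0 * np.pi * t[None, :] / periods[:, None] + phases[:, None])
    return waves.mean(axis=0)  # one value per step, bounded in [-1, 1]

curve = nonstationary_expertise(tau=500, T=5000, seed=0)
```

Averaging many out-of-phase waves with randomized periods yields a smooth but aperiodic drift, which is presumably why this construction is used to stress-test adaptive expert-weighting algorithms.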