Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

AI-Assisted Scientific Data Collection with Iterative Human Feedback

Authors: Travis Mandel, James Boyd, Sebastian J. Carter, Randall H. Tanaka, Taishi Nammoto
Pages: 5957-5966

AAAI 2021 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We examine the empirical performance of TESA, finding that it is superior to previously-proposed approaches in simulations based on real-world datasets, as well as in a human subject experiment.
Researcher Affiliation	Academia	Travis Mandel, James Boyd, Sebastian J. Carter, Randall H. Tanaka, Taishi Nammoto University of Hawaiʻi at Hilo, Hilo, HI EMAIL
Pseudocode	Yes	Algorithm 1 Threshold Estimating Sampling Algorithm (TESA)
Open Source Code	Yes	Source code used to generate the graphs is available at https://datadrivengame.science/aaai21/.
Open Datasets	Yes	Economics: We replayed data collected by the Hass Avocado Board, relating price to weekly organic Avocado sales from 2015-2018 (Kiggins 2018). Mental Health: We replayed data from the Open Sourcing Mental Illness 2016 Mental Health in Tech Survey (OSMI 2016). Cognitive Psychology: We replayed data from cognitive psychology experiments by Petzold et al. (2004).
Dataset Splits	No	The paper describes using 'replayed data' and sampling from it, but does not provide specific train/validation/test dataset splits or refer to any standard predefined splits.
Hardware Specification	No	The paper does not provide specific hardware details such as GPU/CPU models or memory specifications used for running experiments.
Software Dependencies	No	The paper does not provide specific software dependencies with version numbers.
Experiment Setup	Yes	We used TESA with ϵ_p = 0.1 (to match epsilon-greedy) and a = 1. We try various values of b_0, which equates to an initial guess of the threshold n_max.
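
The Experiment Setup entry sets the exploration parameter to 0.1 "to match epsilon-greedy". For readers unfamiliar with that baseline, here is a minimal sketch of a generic epsilon-greedy selection rule on a toy bandit problem. It is not TESA and is not taken from the paper's source code; the function names, the Gaussian reward model, and the toy means are all assumptions chosen for illustration.

```python
import random


def epsilon_greedy_choice(estimates, epsilon=0.1, rng=random):
    """Return an option index: explore uniformly at random with
    probability epsilon, otherwise pick the highest current estimate."""
    if rng.random() < epsilon:
        return rng.randrange(len(estimates))
    return max(range(len(estimates)), key=lambda i: estimates[i])


def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy on a toy problem (hypothetical setup:
    each option returns a unit-variance Gaussian reward)."""
    rng = random.Random(seed)
    counts = [0] * len(true_means)
    estimates = [0.0] * len(true_means)
    for _ in range(steps):
        arm = epsilon_greedy_choice(estimates, epsilon, rng)
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental update of the running mean for the chosen option.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return counts, estimates


if __name__ == "__main__":
    counts, estimates = run_bandit([0.2, 0.5, 0.8])
    print("pull counts:", counts)
    print("estimated means:", [round(e, 2) for e in estimates])
```

With epsilon = 0.1, roughly one choice in ten is a uniform random exploration step and the rest follow the current best estimate.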