Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

AI-Assisted Scientific Data Collection with Iterative Human Feedback

Authors: Travis Mandel, James Boyd, Sebastian J. Carter, Randall H. Tanaka, Taishi Nammoto
Pages: 5957-5966

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We examine the empirical performance of TESA, finding that it is superior to previously-proposed approaches in simulations based on real-world datasets, as well as in a human subject experiment.
Researcher Affiliation | Academia | Travis Mandel, James Boyd, Sebastian J. Carter, Randall H. Tanaka, Taishi Nammoto University of Hawai'i at Hilo, Hilo, HI EMAIL
Pseudocode | Yes | Algorithm 1 Threshold Estimating Sampling Algorithm (TESA)
Open Source Code | Yes | Source code used to generate the graphs is available at https://datadrivengame.science/aaai21/.
Open Datasets | Yes | Economics: We replayed data collected by the Hass Avocado Board, relating price to weekly organic Avocado sales from 2015-2018 (Kiggins 2018). Mental Health: We replayed data from the Open Sourcing Mental Illness 2016 Mental Health in Tech Survey (OSMI 2016). Cognitive Psychology: We replayed data from cognitive psychology experiments by Petzold et al. (2004).
Dataset Splits | No | The paper describes using 'replayed data' and sampling from it, but does not provide specific train/validation/test dataset splits or refer to any standard predefined splits.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models or memory specifications used for running experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | We used TESA with ϵ_p = 0.1 (to match epsilon-greedy) and a = 1. We try various values of b_0, which equates to an initial guess of the threshold n_max.
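
The setup above matches TESA's exploration rate to an epsilon-greedy baseline and evaluates on replayed data. The following is a minimal sketch of that baseline idea only, not of TESA itself and not the authors' code. Everything named here is an assumption for illustration: the `logged` dictionary layout, the function names, and the use of a running mean as the greedy score (the paper's actual objective may differ).

```python
import random
from collections import defaultdict


def make_replay_sampler(logged, rng):
    """Return a function that 'replays' one logged outcome for a condition.

    `logged` is a hypothetical layout: {condition: [observed outcomes]}.
    """
    def sample(condition):
        return rng.choice(logged[condition])
    return sample


def epsilon_greedy_collect(logged, n_rounds, epsilon=0.1, seed=0):
    """Choose which condition to sample next with epsilon-greedy.

    With probability `epsilon` a condition is picked uniformly at random;
    otherwise the condition with the highest running mean is picked.
    The running mean is a placeholder score (an assumption).
    """
    rng = random.Random(seed)
    sample = make_replay_sampler(logged, rng)
    conditions = sorted(logged)
    counts = defaultdict(int)
    totals = defaultdict(float)

    for _ in range(n_rounds):
        unexplored = [c for c in conditions if counts[c] == 0]
        if unexplored:
            # Sample every condition once before comparing means.
            choice = unexplored[0]
        elif rng.random() < epsilon:
            # Explore: uniform random condition.
            choice = rng.choice(conditions)
        else:
            # Exploit: best running mean so far.
            choice = max(conditions, key=lambda c: totals[c] / counts[c])
        counts[choice] += 1
        totals[choice] += sample(choice)

    return dict(counts)


if __name__ == "__main__":
    # Toy stand-in for a replayed dataset; the values are made up.
    toy_log = {"low_price": [3.0, 4.0, 5.0], "high_price": [1.0, 2.0]}
    print(epsilon_greedy_collect(toy_log, n_rounds=200, epsilon=0.1))
```

Setting `epsilon=0.1` mirrors the ϵ_p = 0.1 value quoted in the table; the parameters a and b_0 belong to TESA and have no counterpart in this baseline sketch.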