reproducibilityindex.ai

Collective Intelligence in Human-AI Teams: A Bayesian Theory of Mind Approach

Authors: Samuel Westby, Christoph Riedl

AAAI 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We use data collected from an online experiment in which 145 individuals in 29 human-only teams of five communicate through a chat-based system to solve a cognitive task. We find that humans (a) struggle to fully integrate information from teammates into their decisions, especially when communication load is high, and (b) have cognitive biases which lead them to underweight certain useful, but ambiguous, information. Our theory of mind ability measure predicts both individual- and team-level performance.
Researcher Affiliation	Academia	1 Network Science Institute, Northeastern University, Boston, MA 2 Khoury College of Computer Sciences, Northeastern University, Boston, MA
Pseudocode	No	The paper contains mathematical equations (e.g., Equations 1, 2, and 3) and describes the model's steps, but it does not include any pseudocode or clearly labeled algorithm blocks.
Open Source Code	Yes	We use data collected from an IRB approved online experiment... (data and code available at https://github.com/riedlc/HumanAITeamsAndCI).
Open Datasets	Yes	We use data from an IRB approved online experiment conducted on the Volunteer Science platform (Radford et al. 2016) in which 145 individuals in 29 human-only teams of five solved a Hidden Profile task (data and code available at https://github.com/riedlc/HumanAITeamsAndCI).
Dataset Splits	No	The paper does not explicitly provide training/test/validation dataset splits (e.g., specific percentages or sample counts) for the human subject data. While model parameters are learned and evaluated, a formal dataset splitting strategy for reproducibility is not detailed.
Hardware Specification	No	The paper does not provide any specific hardware details such as CPU models, GPU models, or cloud computing specifications used for running the experiments or simulations.
Software Dependencies	No	The paper does not list specific software dependencies with their version numbers (e.g., Python version, specific library versions like PyTorch or TensorFlow).
Experiment Setup	Yes	The paper provides details on the experimental setup, including: "145 individuals in 29 human-only teams of five", "After a five minute discussion phase", "Subjects were recruited from Amazon Mechanical Turk... and paid a $0.75 flat fee for participation as well as a $0.25 performance bonus for each correct answer. The entire task took about seven minutes to complete." It also describes the model's parameters and how they were derived: "four information weights SN, MN, MY, SY determining the likelihood distribution of observations under inferred beliefs, and the theory of mind ability αD which modulates the relative weighting of the self vs. partner beliefs... information weights (SN, MN, SY, MY) are learned from the data through a grid search...". Specific parameter values are shown in Table 1, for example, "MLE (0.1, 1, 1.45, 2)" for the weights and "0.95" for αD.
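
The grid search mentioned in the Experiment Setup row can be illustrated with a minimal sketch. This is not the authors' code: the candidate values, the function names (grid_search, toy_log_likelihood), and the toy objective are all hypothetical stand-ins, since this summary gives neither the paper's likelihood nor the grid it searched. The sketch shows only the general technique of scoring every combination of four weights and keeping the best one.

```python
import itertools


def grid_search(log_likelihood, grids):
    """Score every combination of grid values and return the best one.

    `grids` holds one list of candidate values per parameter;
    `log_likelihood` maps a tuple of parameter values to a score.
    """
    best_point, best_score = None, float("-inf")
    for point in itertools.product(*grids):
        score = log_likelihood(point)
        if score > best_score:
            best_point, best_score = point, score
    return best_point, best_score


# Hypothetical candidate values, reused for each of the four weights
# (SN, MN, MY, SY). The grid used in the paper is not given here.
candidates = [0.0, 0.5, 1.0, 1.5, 2.0]
grids = [candidates] * 4

# Toy stand-in objective (an assumption, NOT the paper's likelihood):
# it peaks at a made-up point so the search has something to find.
TOY_OPTIMUM = (0.5, 1.0, 1.5, 2.0)


def toy_log_likelihood(weights):
    return -sum((w - t) ** 2 for w, t in zip(weights, TOY_OPTIMUM))


best_weights, best_score = grid_search(toy_log_likelihood, grids)
print("best weights (SN, MN, MY, SY):", best_weights)
```

With five candidates per weight the search evaluates 5^4 = 625 combinations. To approximate the actual procedure, one would replace toy_log_likelihood with the likelihood defined in the paper and the linked repository.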