Cognitive model priors for predicting human decisions

Authors: David D. Bourgin, Joshua C. Peterson, Daniel Reichman, Stuart J. Russell, Thomas L. Griffiths

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We find that fine-tuning these networks on small datasets of real human decisions results in unprecedented state-of-the-art improvements on two benchmark datasets. Using this new methodology we greatly improve state-of-the-art performance on two recent human choice prediction competition datasets (Erev et al., 2017; Plonsky et al., 2019). We introduce a new benchmark dataset for human choice prediction in machine learning that is an order of magnitude larger than any previous datasets, comprising more than 240,000 human judgments over 13,000 unique decision problems. Figure 3A: Validation MSE (20% of the full choices13k human dataset) as a function of training on proportions of the training set (80% of the full choices13k human dataset), for sparse MLPs trained from a random initialization (blue) and with a cognitive model prior (red).
Researcher Affiliation | Academia | University of California, Berkeley; Princeton University.
Pseudocode | No | The paper describes the proposed method in detail in the main text, but it does not include any structured pseudocode blocks or algorithm listings.
Open Source Code | No | The paper does not state that the source code for its own methodology is publicly available, nor does it provide a link to a code repository for the authors' contributions; it only references source code for a third-party baseline model (BEAST).
Open Datasets | Yes | In total, the new dataset contained 242,879 human judgments on 13,006 gamble selection problems, making it the largest public dataset of human risky choice behavior to date. We collected a dataset more than ten times the size of CPC18 using Amazon Mechanical Turk.
Dataset Splits | Yes | The training sets we varied were a proportion (from 0.01 to 1.0) of our full choices13k training set (80% of the overall dataset). The remaining 20% of choices13k was used as a constant validation set and its size did not vary. We repeated this process ten times. (A minimal sketch of this split procedure appears after the table.)
Hardware Specification | No | The paper does not provide specific details regarding the hardware (e.g., GPU models, CPU types, memory) used to conduct the experiments.
Software Dependencies | No | The paper mentions using a 'multilayer perceptron (MLP)', 'SReLU activation functions', 'layer-wise dropout', and an 'RMSProp optimizer', along with the 'SET algorithm'. However, it does not specify version numbers for these software components or the underlying libraries (e.g., TensorFlow, PyTorch, scikit-learn) that would be needed for replication.
Experiment Setup | Yes | We grid-searched approximately 20,000 hyperparameter settings and found the best multilayer perceptron (MLP) overall for estimating both variants of BEAST (as well as the other datasets/tasks in this paper) had three layers with 200, 275, and 100 units respectively, SReLU activation functions, layer-wise dropout rates of 0.15, and an RMSProp optimizer with a 0.001 learning rate. (A minimal sketch of this configuration appears after the table.)
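
Below is a minimal sketch of the split procedure quoted in the Dataset Splits row: a fixed 80/20 train/validation split of the choices13k problems, training subsets drawn at proportions from 0.01 to 1.0 of the training portion, and ten repetitions. The function name, seeding, and indexing scheme are illustrative assumptions, not the authors' code.

```python
import numpy as np

def training_proportion_splits(n_problems, proportions, n_repeats=10, seed=0):
    """Fixed 80/20 train/validation split of the choices13k problems, then
    random subsets of the training portion at each proportion, repeated
    n_repeats times. Names and seeding are illustrative."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_problems)
    n_train = int(0.8 * n_problems)                     # 80% of the dataset for training
    train_idx, val_idx = idx[:n_train], idx[n_train:]   # constant 20% validation set
    splits = []
    for repeat in range(n_repeats):
        for p in proportions:                           # proportions from 0.01 to 1.0
            k = max(1, int(p * n_train))
            subset = rng.choice(train_idx, size=k, replace=False)
            splits.append({"repeat": repeat, "proportion": p,
                           "train": subset, "val": val_idx})
    return splits

# Example usage over the 13,006 choices13k problems
splits = training_proportion_splits(13006, proportions=[0.01, 0.1, 0.5, 1.0])
```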
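The Experiment Setup row specifies the best architecture and optimizer found by the grid search; the sketch below instantiates that configuration in PyTorch, assuming PyTorch as the framework (the paper does not name one). SReLU is not a built-in PyTorch activation, so a learnable S-shaped ReLU is hand-rolled here; the input and output sizes and all names are placeholders.

```python
import torch
import torch.nn as nn

class SReLU(nn.Module):
    """S-shaped ReLU with learnable per-unit thresholds and slopes
    (hand-rolled, since PyTorch has no built-in SReLU)."""
    def __init__(self, n_units):
        super().__init__()
        self.t_left = nn.Parameter(torch.zeros(n_units))
        self.a_left = nn.Parameter(torch.zeros(n_units))
        self.t_right = nn.Parameter(torch.ones(n_units))
        self.a_right = nn.Parameter(torch.ones(n_units))

    def forward(self, x):
        below = self.t_left + self.a_left * (x - self.t_left)
        above = self.t_right + self.a_right * (x - self.t_right)
        return torch.where(x <= self.t_left, below,
                           torch.where(x >= self.t_right, above, x))

def make_mlp(n_inputs, n_outputs):
    """Three hidden layers of 200, 275, and 100 units, SReLU activations,
    and layer-wise dropout of 0.15, as quoted in the Experiment Setup row."""
    sizes = [n_inputs, 200, 275, 100]
    layers = []
    for i in range(1, len(sizes)):
        layers += [nn.Linear(sizes[i - 1], sizes[i]),
                   SReLU(sizes[i]),
                   nn.Dropout(p=0.15)]
    layers.append(nn.Linear(sizes[-1], n_outputs))
    return nn.Sequential(*layers)

# Placeholder feature/output sizes; the quoted learning rate is 0.001
model = make_mlp(n_inputs=100, n_outputs=1)
optimizer = torch.optim.RMSprop(model.parameters(), lr=0.001)
```

Note that the quoted setup also mentions sparse MLPs trained with the SET algorithm; the dense sketch above covers only the layer sizes, activation, dropout, and optimizer named in the row.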