Deep Learning for Predicting Human Strategic Behavior

Authors: Jason S. Hartford, James R. Wright, Kevin Leyton-Brown

NeurIPS 2016

Reproducibility Assessment (Variable / Result / LLM Response)
Research Type: Experimental
LLM Response: "In Section 4 we evaluate our model's ability to predict distributions of play given normal form descriptions of games on a dataset of experimental data from a variety of experiments, and find that our feature-free deep learning model significantly exceeds the performance of the current state-of-the-art model, which has access to hand-tuned features based on expert knowledge [23]." "Experimental Setup: We used a dataset combining observations from 9 human-subject experimental studies conducted by behavioral economists in which subjects were paid to select actions in normal-form games." "Experimental Results: Figure 4 (left) shows a performance comparison between a model built using our deep learning architecture with only a single action response layer (i.e. no iterative reasoning; details below) and the previous state of the art, quantal cognitive hierarchy (QCH) with hand-crafted features (shown as a blue line)."
Researcher Affiliation: Academia
LLM Response: "Jason Hartford, James R. Wright, Kevin Leyton-Brown; Department of Computer Science, University of British Columbia; {jasonhar, jrwright, kevinlb}@cs.ubc.ca"
Pseudocode: No
LLM Response: The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code: No
LLM Response: The paper does not provide concrete access to source code for the methodology described.
Open Datasets: No
LLM Response: "Experimental Setup: We used a dataset combining observations from 9 human-subject experimental studies conducted by behavioral economists in which subjects were paid to select actions in normal-form games. Their payment depended on the subject's actions and the actions of their unseen opposition, who chose an action simultaneously (see Section 1 of the supplementary material for further details on the experiments and data)."
Dataset Splits: Yes
LLM Response: "The error bars represent 95% confidence intervals across 10 rounds of 10-fold cross-validation." "They were all trained until there was no training set improvement up to a maximum of 25,000 epochs, and the parameters from the iteration with the best training set performance were returned."
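The paper's error bars come from 10 rounds of 10-fold cross-validation. As a minimal sketch of that evaluation protocol (an assumption for illustration, not the authors' code), the splits can be generated in pure Python by reshuffling the data each round and partitioning it into ten folds:

```python
# Sketch (assumption, not the authors' code): index generation for
# 10 rounds of 10-fold cross-validation, as used for the error bars.
import random

def repeated_kfold(n_samples, n_folds=10, n_rounds=10, seed=0):
    """Yield (round, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for r in range(n_rounds):
        idx = list(range(n_samples))
        rng.shuffle(idx)  # reshuffle each round so the folds differ
        fold_size = n_samples // n_folds
        for f in range(n_folds):
            start = f * fold_size
            # the last fold absorbs any remainder
            end = (f + 1) * fold_size if f < n_folds - 1 else n_samples
            test_idx = idx[start:end]
            train_idx = idx[:start] + idx[end:]
            yield r, f, train_idx, test_idx

# 10 rounds x 10 folds = 100 train/test splits in total
splits = list(repeated_kfold(100))
```

Each of the 100 splits holds out a disjoint tenth of the data within its round; per-fold test scores are then aggregated to form the 95% confidence intervals.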
Hardware Specification: No
LLM Response: The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies: No
LLM Response: "All the models presented in the experimental section were optimized using Adam [8] with an initial learning rate of 0.0002, β1 = 0.9, β2 = 0.999 and ϵ = 10^-8." The paper mentions the Adam optimizer but does not specify software names with version numbers for other key dependencies (e.g., programming languages, libraries, frameworks).
Experiment Setup: Yes
LLM Response: "All the models presented in the experimental section were optimized using Adam [8] with an initial learning rate of 0.0002, β1 = 0.9, β2 = 0.999 and ϵ = 10^-8. The models were all regularized using Dropout with drop probability = 0.2 and L1 regularization with parameter = 0.01. They were all trained until there was no training set improvement up to a maximum of 25,000 epochs, and the parameters from the iteration with the best training set performance were returned."
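The reported setup pins down the full Adam update: learning rate 0.0002, β1 = 0.9, β2 = 0.999, ϵ = 10^-8, with an L1 penalty of 0.01 folded into the gradient. As a sketch of a single-parameter Adam step under those reported hyperparameters (the function and its L1 handling are an illustrative assumption, not the authors' implementation):

```python
# Sketch (assumption, not the authors' implementation): one Adam update
# with the paper's reported hyperparameters (lr = 0.0002, beta1 = 0.9,
# beta2 = 0.999, eps = 1e-8) and the L1 penalty (lambda = 0.01)
# contributing lambda * sign(theta) to the gradient.
def adam_step(theta, grad, m, v, t,
              lr=0.0002, beta1=0.9, beta2=0.999, eps=1e-8, l1=0.01):
    """Update one scalar parameter; returns (theta, m, v)."""
    sign = 1.0 if theta > 0 else -1.0 if theta < 0 else 0.0
    g = grad + l1 * sign                 # gradient plus L1 subgradient
    m = beta1 * m + (1 - beta1) * g      # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * g * g  # second-moment estimate
    m_hat = m / (1 - beta1 ** t)         # bias correction for step t
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (v_hat ** 0.5 + eps)
    return theta, m, v

# usage: first update (t = 1) of a parameter initialized at 1.0
theta, m, v = adam_step(theta=1.0, grad=0.5, m=0.0, v=0.0, t=1)
```

After bias correction, the first step's magnitude is approximately the learning rate itself (here about 0.0002), which is the usual Adam behavior regardless of the raw gradient scale.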