Deep Learning for Predicting Human Strategic Behavior
Authors: Jason S. Hartford, James R. Wright, Kevin Leyton-Brown
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 4 we evaluate our model's ability to predict distributions of play given normal form descriptions of games on a dataset of experimental data from a variety of experiments, and find that our feature-free deep learning model significantly exceeds the performance of the current state-of-the-art model, which has access to hand-tuned features based on expert knowledge [23]. Experimental Setup: We used a dataset combining observations from 9 human-subject experimental studies conducted by behavioral economists in which subjects were paid to select actions in normal-form games. Experimental Results: Figure 4 (left) shows a performance comparison between a model built using our deep learning architecture with only a single action response layer (i.e. no iterative reasoning; details below) and the previous state of the art, quantal cognitive hierarchy (QCH) with hand-crafted features (shown as a blue line). |
| Researcher Affiliation | Academia | Jason Hartford, James R. Wright, Kevin Leyton-Brown Department of Computer Science University of British Columbia {jasonhar, jrwright, kevinlb}@cs.ubc.ca |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | No | Experimental Setup: We used a dataset combining observations from 9 human-subject experimental studies conducted by behavioral economists in which subjects were paid to select actions in normal-form games. Their payment depended on the subject's actions and the actions of their unseen opposition, who chose an action simultaneously (see Section 1 of the supplementary material for further details on the experiments and data). |
| Dataset Splits | Yes | The error bars represent 95% confidence intervals across 10 rounds of 10-fold cross-validation. The models were all trained until there was no training-set improvement, up to a maximum of 25,000 epochs, and the parameters from the iteration with the best training-set performance were returned. A sketch of this cross-validation protocol appears after the table. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | All the models presented in the experimental section were optimized using Adam [8] with an initial learning rate of 0.0002, β1 = 0.9, β2 = 0.999 and ϵ = 10^-8. The paper names the Adam optimizer but does not specify software names with version numbers for other key dependencies (e.g., programming languages, libraries, frameworks). |
| Experiment Setup | Yes | All the models presented in the experimental section were optimized using Adam [8] with an initial learning rate of 0.0002, β1 = 0.9, β2 = 0.999 and ϵ = 10^-8. The models were all regularized using Dropout with drop probability 0.2 and L1 regularization with parameter 0.01. They were all trained until there was no training-set improvement, up to a maximum of 25,000 epochs, and the parameters from the iteration with the best training-set performance were returned. A hedged sketch of this training configuration appears after the table. |
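
The evaluation protocol in the Dataset Splits row (10 rounds of 10-fold cross-validation, with 95% confidence intervals across the fold scores) can be mirrored in a few lines of scikit-learn. The sketch below is an illustration only: the paper does not release code, and `train_and_score` is a hypothetical stand-in for fitting the network on the training folds and scoring negative log-likelihood on the held-out fold.

```python
import numpy as np
from sklearn.model_selection import RepeatedKFold

# Placeholder indices standing in for the pooled normal-form games.
games = np.arange(400)
rng = np.random.default_rng(0)

def train_and_score(train_games, test_games):
    # Hypothetical stand-in: fit the model on the training folds and
    # return the held-out negative log-likelihood. A dummy value is
    # returned here so the sketch runs end to end.
    return rng.normal(loc=1.0, scale=0.05)

# 10 rounds of 10-fold cross-validation, as described in the paper.
rkf = RepeatedKFold(n_splits=10, n_repeats=10, random_state=0)
scores = np.array([
    train_and_score(games[train_idx], games[test_idx])
    for train_idx, test_idx in rkf.split(games)
])

# 95% confidence interval across the 100 fold scores (normal approximation).
mean = scores.mean()
half_width = 1.96 * scores.std(ddof=1) / np.sqrt(len(scores))
print(f"NLL: {mean:.4f} ± {half_width:.4f}")
```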
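
The Experiment Setup row fixes every optimizer and regularization hyperparameter but does not name a framework. The sketch below restates those settings in PyTorch purely as an assumption for illustration; the two-layer network and tensor shapes are hypothetical stand-ins, not the paper's pooling architecture, and the excerpt does not say whether the L1 penalty covers biases.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in network; the paper's architecture differs.
model = nn.Sequential(
    nn.Linear(9, 50),
    nn.ReLU(),
    nn.Dropout(p=0.2),          # drop probability 0.2, as stated above
    nn.Linear(50, 3),
)

# Adam with the stated hyperparameters: lr=0.0002, betas=(0.9, 0.999), eps=1e-8.
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4,
                             betas=(0.9, 0.999), eps=1e-8)
loss_fn = nn.CrossEntropyLoss()  # negative log-likelihood of observed actions

x = torch.randn(64, 9)           # placeholder payoff inputs
y = torch.randint(0, 3, (64,))   # placeholder observed actions

for epoch in range(100):         # the paper trains for up to 25,000 epochs
    optimizer.zero_grad()
    nll = loss_fn(model(x), y)
    l1_penalty = sum(p.abs().sum() for p in model.parameters())
    loss = nll + 0.01 * l1_penalty  # L1 regularization with parameter 0.01
    loss.backward()
    optimizer.step()
```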