Apparently Irrational Choice as Optimal Sequential Decision Making

Authors: Haiyang Chen, Hyung Jin Chang, Andrew Howes

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The paper states: 'Reported results are averaged over 10 runs, each with a different seed, after training on 3 million samples.' and 'In order to test the model, we designed three different agents', including an integrated agent that 'could use both calculation and comparison selectively.'
Researcher Affiliation | Academia | '1 School of Computer Science, University of Birmingham; 2 Department of Communications and Networking, Aalto University. {hxc797, h.j.chang, howesa}@bham.ac.uk'
Pseudocode | No | The paper describes the model formulation and training process but does not include any explicit pseudocode or algorithm blocks.
Open Source Code | No | The paper states: 'For all reported experiments, we built the environments within Open AI Gym (Brockman et al. 2016) and used the Baselines implementation of the deep RL algorithms.' and links to the OpenAI Baselines GitHub repository (https://github.com/openai/baselines), but this is a third-party library used by the authors, not their own source code for the methodology described in the paper.
Open Datasets | No | The paper states: 'We assumed that probabilities p were sampled from a β distribution and values v were sampled from a t distribution. These distributions represented the ecological distributions experienced by participants in the human behaviour experiments reported by (Wedell 1991).' This describes how the experimental environment was generated from prior work, but it does not provide access information for a publicly available dataset.
Dataset Splits | No | The paper mentions 'training on 3 million samples' and describes how the environment's distributions were fitted to prior human experiments, but it does not specify explicit training, validation, or test splits.
Hardware Specification | No | The paper does not provide any details about the hardware used to run the experiments.
Software Dependencies | No | The paper states: 'we built the environments within Open AI Gym (Brockman et al. 2016) and used the Baselines implementation of the deep RL algorithms', but it does not specify version numbers for OpenAI Gym or Baselines.
Experiment Setup | Yes | 'The fitted parameter values were: calculation noise σ_calc = 4, comparison error P(error_f) = 0.1, probability weighting parameter α = 1, the perceived cost of comparison C_comparison = 0.01 and the calculation cost C_calc = 0.1.'
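The environment and fitted parameters quoted above can be sketched in a few lines. This is a minimal illustrative reconstruction, not the authors' code: the β and t distribution parameters are placeholders (this excerpt reports that they were fitted to Wedell 1991 but not what the fitted values are), and the function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fitted parameter values as reported in the paper's experiment setup.
SIGMA_CALC = 4.0     # calculation noise σ_calc
P_ERROR_F = 0.1      # comparison error P(error_f)
ALPHA = 1.0          # probability weighting parameter α
C_COMPARISON = 0.01  # perceived cost of one comparison
C_CALC = 0.1         # perceived cost of one calculation

def sample_gamble(a=2.0, b=2.0, df=5.0, scale=10.0):
    """Sample one (probability, value) gamble.

    The β and t distribution parameters here are placeholders;
    the paper fits them to the ecological distributions of
    Wedell (1991), whose fitted values are not given in this excerpt.
    """
    p = rng.beta(a, b)              # probability p ~ Beta(a, b)
    v = rng.standard_t(df) * scale  # value v ~ scaled t distribution
    return p, v

def noisy_expected_value(p, v):
    """A 'calculation': expected value corrupted by Gaussian noise σ_calc."""
    return p * v + rng.normal(0.0, SIGMA_CALC)

p, v = sample_gamble()
ev_hat = noisy_expected_value(p, v)
```

Under this sketch, an agent that calculates pays C_CALC per noisy estimate, while an agent that compares attributes directly pays C_COMPARISON per comparison and errs with probability P_ERROR_F, which is the trade-off the integrated agent resolves selectively.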