Reinforcement Learning with Non-Exponential Discounting

Authors: Matthias Schultheis, Constantin A. Rothkopf, Heinz Koeppl

NeurIPS 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We validate the applicability of our proposed approach on two simulated problems." |
| Researcher Affiliation | Academia | All three authors (Matthias Schultheis, Constantin A. Rothkopf, Heinz Koeppl) are affiliated with the Centre for Cognitive Science, Technische Universität Darmstadt. |
| Pseudocode | Yes | "Algorithm 1: Computation of the optimal value function and policy for non-exp. discounting" and "Algorithm 2: Computation of the gradient of F w.r.t. θ for inferring the discount function" (see the first sketch after the table). |
| Open Source Code | Yes | "All proposed methods were implemented in Python using the PyTorch framework [71] and are available online" (footnote 1: https://git.rwth-aachen.de/bcs/nonexp-rl). |
| Open Datasets | No | The paper generates its own simulated data and gives no access information for a publicly available dataset: "For generating data, we randomly sampled starting states and determined the time points at which subjects would switch their action. Afterward, the determined time points were distorted by Gaussian noise." (see the second sketch after the table). |
| Dataset Splits | No | The paper does not specify training/validation/test splits; it describes generating simulated data but not how those data were partitioned for model development or evaluation. |
| Hardware Specification | No | The acknowledgments mention the "high-performance computer Lichtenberg at the NHR Centers NHR4CES at TU Darmstadt", but no CPU/GPU models, memory sizes, or other hardware details are given. |
| Software Dependencies | No | "All proposed methods were implemented in Python using the PyTorch framework [71]", but no version numbers for Python or PyTorch are stated, which a fully reproducible description would require (see the third sketch after the table). |
| Experiment Setup | Yes | "The used hyperparameters are listed in Appendix F." For both simulated problems, Appendix F gives the number of hidden layers, number of units per layer, learning rate, optimizer, activation function, number of samples for collocation points, batch size, and number of epochs (see the last sketch after the table). |
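
The paper's Algorithms 1 and 2 are not restated here. As a rough illustration of what "non-exponential discounting" changes, below is a minimal discrete-time sketch on a hypothetical toy MDP (not from the paper, which works in its own continuous setting): with a hyperbolic discount d(t) as one example of a non-exponential choice, the optimal policy becomes time-dependent and is found by backward induction on the precommitted objective E[Σ_t d(t) r_t].

```python
import numpy as np

# Toy MDP (hypothetical, not from the paper): 3 states, 2 actions.
n_states, n_actions, horizon = 3, 2, 50
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a, s']
R = rng.uniform(size=(n_states, n_actions))                       # r(s, a)

def discount(t, k=0.1):
    # Hyperbolic discount d(t) = 1 / (1 + k t): one common
    # non-exponential choice; any positive decreasing d(t) works here.
    return 1.0 / (1.0 + k * t)

# Backward induction for the precommitted objective E[sum_t d(t) r_t].
# Because d(t+1)/d(t) is not constant, the optimal policy is time-dependent,
# so we store one greedy action per (time step, state).
V = np.zeros(n_states)                           # V_T = 0
policy = np.zeros((horizon, n_states), dtype=int)
for t in reversed(range(horizon)):
    Q = discount(t) * R + P @ V                  # Q_t[s, a]
    policy[t] = Q.argmax(axis=1)
    V = Q.max(axis=1)                            # V_t
print(V)  # value of each starting state under the time-varying policy
```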
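
The Open Datasets row quotes the paper's data-generation recipe. The sketch below follows that quoted recipe literally; the switching rule itself is a placeholder, since in the paper it is derived from the learned policy, which is not restated here.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_trials(n_trials=200, t_max=10.0, noise_std=0.1):
    """Quoted recipe: sample starting states, determine the time point at
    which the subject would switch its action, then distort that time
    point with Gaussian noise. The nominal switching rule below is a
    placeholder for the paper's policy-derived rule."""
    start_states = rng.uniform(-1.0, 1.0, size=n_trials)
    nominal_switch = t_max * (start_states + 1.0) / 2.0   # placeholder rule
    observed_switch = nominal_switch + rng.normal(0.0, noise_std, n_trials)
    return start_states, np.clip(observed_switch, 0.0, t_max)

states, switch_times = simulate_trials()
```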
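
Since the Software Dependencies row flags missing version numbers, a snippet like the following can record the environment when re-running the released code. This is generic practice, not a procedure documented by the authors.

```python
# Snapshot the interpreter and framework versions before running experiments.
import platform
import torch

print("python  :", platform.python_version())
print("pytorch :", torch.__version__)
print("cuda    :", torch.version.cuda)  # None on CPU-only builds
```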
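
Finally, the Experiment Setup row lists the hyperparameter categories reported in Appendix F. The dictionary below mirrors those categories only; every value is a hypothetical placeholder, not a number taken from the paper.

```python
# Keys follow the Appendix F categories; all values are placeholders.
hparams = {
    "num_hidden_layers": 2,           # placeholder
    "units_per_layer": 64,            # placeholder
    "learning_rate": 1e-3,            # placeholder
    "optimizer": "Adam",              # placeholder
    "activation": "tanh",             # placeholder
    "num_collocation_samples": 1000,  # placeholder
    "batch_size": 128,                # placeholder
    "num_epochs": 500,                # placeholder
}
```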