reproducibilityindex.ai

Value Function in Frequency Domain and the Characteristic Value Iteration Algorithm

Authors: Amir-massoud Farahmand

Venue: NeurIPS 2019

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	This paper considers the problem of estimating the distribution of returns in reinforcement learning, i.e., distributional RL problem. It presents a new representational framework to maintain the uncertainty of returns and provides mathematical tools to compute it. We show that instead of representing a probability distribution function of returns, one can represent their characteristic function, the Fourier transform of their distribution. ... We analyze CVI and its approximate variant and show how approximation errors affect the quality of the computed CVF. ... This paper is only the first step towards understanding CVFs and their properties. ... Finally, empirically evaluating this approach for return uncertainty representation may lead to better understanding of its strengths and weaknesses.
Researcher Affiliation	Academia	Amir-massoud Farahmand Vector Institute & University of Toronto Toronto, Canada farahmand@vectorinstitute.ai
Pseudocode	No	The paper describes the Characteristic Value Iteration (CVI) and Approximate CVI (ACVI) procedures iteratively, but does not present them in a structured pseudocode or algorithm block.
Open Source Code	No	The paper does not include any statement or link indicating the provision of open-source code for the described methodology.
Open Datasets	No	The paper is theoretical and does not report on any experiments involving datasets, thus no information about public dataset access for training is provided.
Dataset Splits	No	The paper is theoretical and does not report on any experiments or dataset usage, therefore no information regarding training, validation, or test dataset splits is provided.
Hardware Specification	No	The paper is theoretical and does not describe any experimental setup or hardware used for running experiments.
Software Dependencies	No	The paper is theoretical and does not detail any specific software dependencies with version numbers needed to replicate its theoretical contributions.
Experiment Setup	No	The paper is theoretical and does not describe any experimental setup, including hyperparameter values or system-level training settings.