Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration

Authors: Andrea Zanette, Alessandro Lazaric, Mykel J. Kochenderfer, Emma Brunskill

NeurIPS 2020

Reproducibility Variable | Result | LLM Response

Research Type | Theoretical | There has been growing progress on theoretical analyses for provably efficient learning in MDPs with linear function approximation... This work makes two contributions. It presents a statistically and computationally efficient online PAC algorithm... Before presenting the main result it is useful to define the average feature φ̄_{π,t} = E_{x_t ∼ π}[φ_t(x_t, π_t(x_t))] encountered at timestep t upon following a certain policy π. In addition, we need a way to measure how explorable the space is... Theorem 4.1.

Researcher Affiliation | Collaboration | Andrea Zanette (Stanford University, zanette@stanford.edu); Alessandro Lazaric (Facebook Artificial Intelligence Research, lazaric@fb.com); Mykel J. Kochenderfer (Stanford University, mykel@stanford.edu); Emma Brunskill (Stanford University, ebrun@cs.stanford.edu)

Pseudocode | Yes | Algorithm 1: Forward Reward Agnostic Navigation with Confidence by Injecting Stochasticity (FRANCIS)

Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository for the methodology described.

Open Datasets | No | The paper is theoretical and does not use a concrete dataset; therefore, no information about public availability of training data is provided.

Dataset Splits | No | The paper is theoretical and does not describe empirical experiments, so no train/validation/test splits are reported.

Hardware Specification | No | The paper is theoretical and does not describe any experimental setup or hardware used.

Software Dependencies | No | The paper is theoretical and does not specify any software dependencies with version numbers.

Experiment Setup | No | The paper describes a theoretical algorithm and provides proofs, but does not detail an experimental setup with specific hyperparameters or training configurations.
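The average feature φ̄_{π,t} = E_{x_t ∼ π}[φ_t(x_t, π_t(x_t))] quoted in the Research Type row is an expectation over states reached at timestep t under policy π. It can be approximated by Monte Carlo rollouts; the sketch below assumes a generic, hypothetical environment interface (`reset`, `step`), a `policy(x, h)` function, and a featurizer `phi(x, a)`, none of which come from the paper:

```python
import numpy as np

def average_feature(reset, step, policy, phi, t, d, n_rollouts=1000):
    """Monte Carlo estimate of phi_bar_{pi,t} = E[phi_t(x_t, pi_t(x_t))].

    reset() -> initial state; step(x, a) -> next state (hypothetical interface).
    policy(x, h) -> action taken at timestep h; phi(x, a) -> length-d feature vector.
    """
    total = np.zeros(d)
    for _ in range(n_rollouts):
        x = reset()
        for h in range(t):              # roll the policy forward to timestep t
            x = step(x, policy(x, h))
        total += phi(x, policy(x, t))   # feature of (state, action) at timestep t
    return total / n_rollouts
```

For example, on a deterministic 5-state chain with one-hot state features, the estimate at t = 2 is exactly the indicator of state 2, since every rollout lands there.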