reproducibilityindex.ai

Sample Complexity Bounds for Iterative Stochastic Policy Optimization

Authors: Marin Kobilarov

NeurIPS 2015

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The approach is illustrated with a simple robot control scenario and initial steps towards applications to challenging aerial vehicle navigation problems are presented. ... We next illustrate the application of these bounds using the simple scenario introduced in 3. ... Section heading: Application to Aerial Vehicle Navigation.
Researcher Affiliation	Academia	Marin Kobilarov Department of Mechanical Engineering Johns Hopkins University Baltimore, MD 21218 marin@jhu.edu
Pseudocode	Yes	Iterative Stochastic Policy Optimization (ISPO): 0. Start with initial hyper-parameters ν_0 (i.e. a prior), set i = 0. 1. Sample M trajectories (ξ_j, τ_j) ∼ p(·|ν_i) for j = 1, ..., M. 2. Compute new policy ν_{i+1} using observed costs J(τ_j). 3. Compute bound on expected cost and stop if below threshold, else set i = i + 1 and go to 1.
Open Source Code	No	The paper does not include any explicit statements about releasing the source code for the methodology described, nor does it provide a link to a code repository.
Open Datasets	No	The paper describes experiments in a 'simulated environment' and a 'campus-like environment' using an 'experimentally identified model', but it does not provide concrete access information (link, DOI, repository, or formal citation) for any publicly available or open dataset used for training.
Dataset Splits	No	The paper mentions sample sizes and iteration windows for computing bounds but does not provide specific train/validation/test dataset splits or cross-validation details as would be typical for machine learning experiments.
Hardware Specification	No	The paper mentions an 'Asc Tec quadrotor' as the subject of the experiment but provides no specific details about the computing hardware (e.g., GPU/CPU models, memory) used to run the simulations or analysis.
Software Dependencies	No	The paper mentions the use of a 'high-fidelity open-source physics engine' but does not specify its name or version, nor does it list any other software dependencies with version numbers.
Experiment Setup	Yes	We used a window of maximum L = 10 previous iterations to compute the bounds, i.e. to compute ν_{i+1} all samples from densities ν_{i−L+1}, ν_{i−L+2}, ..., ν_i were used. ... using M = 200 samples (Figure 1) at each iteration. At each iteration M = 200 samples are taken with 1 − δ = 0.95 confidence level. A window of L = 5 past iterations were used for the bounds.
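
The ISPO loop quoted in the Pseudocode row, combined with the M = 200 samples and L = 10 iteration window quoted in the Experiment Setup row, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the policy ν is assumed to be an independent Gaussian over a parameter vector ξ, `toy_cost` is a hypothetical quadratic cost, the policy update is a plain elite-refit (cross-entropy style) stand-in without importance weighting, and the stopping test uses the empirical mean cost in place of the paper's probabilistic bound (the 1 − δ confidence level is not modeled). All function and parameter names are illustrative.

```python
import random
import statistics
from collections import deque


def toy_cost(xi):
    """Hypothetical cost J: squared distance of xi from a fixed goal."""
    goal = (1.0, -2.0)
    return sum((x - g) ** 2 for x, g in zip(xi, goal))


def ispo_sketch(cost, nu0, M=200, L=10, elite_frac=0.1,
                threshold=0.05, max_iters=50, seed=0):
    """Sample / update / check loop; returns (nu, mean_cost_history)."""
    rng = random.Random(seed)
    mean, std = list(nu0[0]), list(nu0[1])
    window = deque(maxlen=L)  # samples from at most the L most recent iterations
    history = []
    for _ in range(max_iters):
        # Step 1: draw M parameter vectors from the current Gaussian policy
        # and evaluate their costs.
        batch = []
        for _ in range(M):
            xi = [rng.gauss(m, s) for m, s in zip(mean, std)]
            batch.append((cost(xi), xi))
        window.append(batch)

        # Step 2 (assumed stand-in): refit the Gaussian to the lowest-cost
        # samples pooled over the window.
        pooled = sorted((s for b in window for s in b), key=lambda s: s[0])
        n_elite = max(2, int(elite_frac * len(pooled)))
        elite = [xi for _, xi in pooled[:n_elite]]
        mean = [statistics.fmean(col) for col in zip(*elite)]
        std = [max(statistics.stdev(col), 1e-3) for col in zip(*elite)]

        # Step 3 (assumed stand-in): the empirical mean cost of the latest
        # batch replaces the paper's bound on expected cost.
        avg = statistics.fmean(c for c, _ in batch)
        history.append(avg)
        if avg < threshold:
            break
    return (mean, std), history
```

A call such as `ispo_sketch(toy_cost, ((0.0, 0.0), (2.0, 2.0)))` starts from a broad prior centered at the origin and returns the final policy parameters together with the per-iteration mean cost. The `deque(maxlen=L)` mirrors the quoted setup in which samples from the L most recent densities are reused when computing ν_{i+1}.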