reproducibilityindex.ai

Non-Stochastic Control with Bandit Feedback

Authors: Paula Gradu, John Hallman, Elad Hazan

NeurIPS 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We now provide empirical results of our algorithms' performance on different dynamical systems and under various noise distributions. In all figures, we average the results obtained over 25 runs and include the corresponding confidence intervals.
Researcher Affiliation	Collaboration	Paula Gradu (1,3), John Hallman (1,3), Elad Hazan (2,3). (1) Department of Mathematics, Princeton University; (2) Department of Computer Science, Princeton University; (3) Google AI Princeton. {pgradu,hallman,ehazan}@princeton.edu
Pseudocode	Yes	Algorithm 1 BCO with Memory; Algorithm 2 Bandit Perturbation Controller; Algorithm 3 System identification via random inputs; Algorithm 4 BPC with system identification
Open Source Code	Yes	Our algorithm implementation is available at [26]. [26] Google AI Princeton. Deluca. https://github.com/MinRegret/deluca, 2020.
Open Datasets	No	The paper uses linear dynamical systems defined by matrices A and B and various synthetic noise specifications (i.i.d. Gaussian noise, sinusoidal noise, Gaussian random walk). It does not provide access information for a publicly available or open dataset.
Dataset Splits	No	The paper mentions averaging results over 25 runs but does not specify training, validation, or test dataset splits, nor does it discuss cross-validation or other data partitioning methods.
Hardware Specification	No	The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies	No	The paper mentions that 'Our algorithm implementation is available at [26]', but it does not specify any software names with version numbers (e.g., Python version, specific libraries like PyTorch or TensorFlow versions).
Experiment Setup	Yes	For both BPC and GPC we initialize K to be the infinite-horizon LQR solution given dynamics A and B in all of the settings below in order to observe the improvement provided by the two perturbation controllers relative to the classical approach.
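
The setup described in the table (K initialized to the infinite-horizon LQR solution for A and B, three synthetic noise types, results averaged over 25 runs with confidence intervals) can be illustrated with a minimal sketch. This is not the paper's or the deluca library's implementation. The double-integrator system, the cost matrices Q and R, the noise scale, the sign convention u = -Kx, the normal-approximation confidence interval, and the helper names lqr_gain, make_noise and rollout_cost are all assumptions chosen for illustration. The gain is obtained here by plain Riccati fixed-point iteration in NumPy.

```python
import numpy as np


def lqr_gain(A, B, Q, R, iters=10_000, tol=1e-10):
    """Infinite-horizon discrete-time LQR gain K for the policy u_t = -K x_t.

    Iterates the Riccati recursion until P stops changing (assumed setup:
    (A, B) stabilizable, Q and R positive definite).
    """
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P_next = Q + A.T @ P @ (A - B @ K)
        if np.max(np.abs(P_next - P)) < tol:
            P = P_next
            break
        P = P_next
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)


def make_noise(kind, T, dim, rng, scale=0.1):
    """Return a (T, dim) disturbance sequence; the scale is an assumed value."""
    if kind == "gaussian":
        return scale * rng.standard_normal((T, dim))
    if kind == "sinusoidal":
        t = np.arange(T)[:, None]
        return scale * np.sin(0.1 * t + np.arange(dim)[None, :])
    if kind == "random_walk":
        return np.cumsum(scale * rng.standard_normal((T, dim)), axis=0)
    raise ValueError(f"unknown noise kind: {kind}")


def rollout_cost(A, B, K, Q, R, w):
    """Average quadratic cost of u_t = -K x_t on x_{t+1} = A x_t + B u_t + w_t."""
    x = np.zeros(A.shape[0])
    total = 0.0
    for w_t in w:
        u = -K @ x
        total += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u + w_t
    return total / len(w)


# Hypothetical double-integrator system, not taken from the paper.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)
K = lqr_gain(A, B, Q, R)

# Average over 25 independent runs, as in the reported experiments.
rng = np.random.default_rng(0)
costs = [
    rollout_cost(A, B, K, Q, R, make_noise("gaussian", 200, 2, rng))
    for _ in range(25)
]
mean_cost = float(np.mean(costs))
# Normal-approximation 95% interval half-width (an assumed choice of interval).
half_width = float(1.96 * np.std(costs, ddof=1) / np.sqrt(len(costs)))
```

Swapping "gaussian" for "sinusoidal" or "random_walk" in the make_noise call gives the fixed-gain LQR baseline under the other two noise types; an adaptive controller such as BPC or GPC would start from this K and add its learned perturbation terms on top.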