Optimizing Energy Production Using Policy Search and Predictive State Representations

Authors: Yuri Grinberg, Doina Precup, Michel Gendreau

NeurIPS 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We compare our solution to a DP-based solution developed by Hydro-Québec based on historical inflow data, and show both quantitative and qualitative improvement." "Sec. 5 presents a quantitative and qualitative analysis of the results."
Researcher Affiliation | Collaboration | Yuri Grinberg, Doina Precup (School of Computer Science, McGill University, Montreal, QC, Canada; {ygrinb,dprecup}@cs.mcgill.ca); Michel Gendreau (École Polytechnique de Montréal, Montreal, QC, Canada; michel.gendreau@cirrelt.ca); NSERC/Hydro-Québec Industrial Research Chair on the Stochastic Optimization of Electricity Generation, CIRRELT and Département de Mathématiques et de Génie Industriel, École Polytechnique de Montréal.
Pseudocode | Yes | "Algorithm 1 Policy search algorithm"
Open Source Code | No | The paper does not provide any statement or link regarding the public availability of its source code.
Open Datasets | No | "Historical data suggests that it is safe to assume that the inflows at different sites in the same period t are just scaled values of each other. However, there is relatively little data available to optimize the problem through simulation: there are only 54 years of inflow data, which translates into 2808 values (one value per week; see Fig. 1). Hydro-Québec uses this data to learn a generative model for inflows." (A hedged sketch of this scaled-inflow setup appears after the table.)
Dataset Splits | No | "The estimate of the expected reward of a policy is calculated by running the simulator on a single 2000-year-long trajectory obtained from the generative model described in Sec. 2. All solutions are evaluated on the original historical data." No specific train/validation/test splits are mentioned. (A hedged sketch of this evaluation protocol appears after the table.)
Hardware Specification | No | The paper does not provide any specific hardware details for the experiments.
Software Dependencies | No | The paper mentions 'the SAMS software [11]' and that an initial version of the simulator was ported to Java, but no specific version numbers are provided for these or any other software dependencies.
Experiment Setup | Yes | "Parameters: N, the maximum number of iterations; θ = {θ_R2, θ_R3, θ_R4} = {θ_1, ..., θ_m} ∈ ℝ^m, the initial parameter vector; n, the number of parallel policy evaluations; Threshold, the significance threshold; γ, the sampling variance." "The estimate of the expected reward of a policy is calculated by running the simulator on a single 2000-year-long trajectory obtained from the generative model described in Sec. 2. Since the algorithm depends on the initialization of the parameter vector, we sample the initial parameter vector uniformly at random and repeat the search 50 times." (A hedged sketch of such a random-restart search loop appears after the table.)
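The Open Datasets row notes two concrete facts: 54 years of weekly inflow data (2808 values), and the modeling assumption that inflows at different sites in the same week are scaled copies of one another. The following minimal sketch illustrates what that assumption implies for the data layout; the reference series, scale factors, and number of sites are made up for the example and are not the Hydro-Québec data or generative model.

```python
import numpy as np

# Scaled-inflows assumption: weekly inflows at every site are fixed multiples
# of a single reference inflow series. All numbers here are illustrative.
rng = np.random.default_rng(0)
years, weeks_per_year = 54, 52
reference = rng.lognormal(mean=0.0, sigma=0.5, size=years * weeks_per_year)  # 54 * 52 = 2808 values
site_scales = np.array([1.0, 0.6, 0.3, 0.15])        # one hypothetical scale factor per site
inflows = np.outer(reference, site_scales)            # shape (2808, n_sites)
print(inflows.shape)                                  # (2808, 4)
```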
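The evaluation protocol quoted in the Dataset Splits row, running the simulator on a single 2000-year trajectory sampled from the generative inflow model and then reporting results on the historical record, is essentially Monte Carlo policy evaluation on one long simulated sequence. The sketch below shows that protocol under assumed interfaces: `simulator_step`, the linear release policy, the reward, and the synthetic inflow series are placeholders, not the paper's simulator or inflow model.

```python
import numpy as np

# Hypothetical one-step reservoir simulator under a parameterized release policy.
# This is an illustrative stand-in, not the authors' Java port of the SAMS simulator.
def simulator_step(state, theta, inflow):
    release = max(0.0, float(theta @ state))          # illustrative linear-in-state policy
    reward = release                                  # placeholder for energy produced
    next_state = 0.9 * state + inflow - 0.1 * release
    return next_state, reward

def evaluate_policy(theta, inflows):
    """Average weekly reward over one long inflow trajectory, e.g. a single
    2000-year (2000 * 52 week) sample drawn from a generative inflow model."""
    state = np.zeros(len(theta))
    total = 0.0
    for inflow in inflows:
        state, reward = simulator_step(state, theta, inflow)
        total += reward
    return total / len(inflows)

rng = np.random.default_rng(0)
theta = rng.uniform(size=4)
simulated = rng.lognormal(0.0, 0.5, size=(2000 * 52, 4))   # stand-in generative model
historical = rng.lognormal(0.0, 0.5, size=(54 * 52, 4))    # stand-in for the 2808 weekly values
print(evaluate_policy(theta, simulated))    # score used inside the search
print(evaluate_policy(theta, historical))   # final report on historical data
```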
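The Experiment Setup row names the knobs of Algorithm 1 (N, θ, n, Threshold, γ) and the 50 random restarts, but not the exact update rule. The sketch below shows one plausible random-restart local search driven by those parameters; the Gaussian perturbation scheme, the acceptance test, and the toy objective (which would be the simulator-based reward estimate in the paper) are assumptions for illustration, not the authors' Algorithm 1.

```python
import numpy as np

def evaluate(theta):
    # Placeholder objective standing in for the simulator-based estimate of the
    # expected reward (one long simulated trajectory per policy evaluation).
    return -np.sum((theta - 1.0) ** 2) + np.random.default_rng().normal(scale=0.01)

def policy_search(theta0, N=100, n=8, threshold=1e-3, gamma=0.1, rng=None):
    """Local random search using the parameters named in the paper: N (max
    iterations), n (policy evaluations per iteration, done in parallel in the
    paper), a significance threshold for accepting an improvement, and
    sampling variance gamma. The actual update rule of Algorithm 1 may differ."""
    rng = rng or np.random.default_rng()
    theta, best = theta0.copy(), evaluate(theta0)
    for _ in range(N):
        # Sample n perturbed candidates around the current parameter vector.
        candidates = theta + rng.normal(scale=np.sqrt(gamma), size=(n, theta.size))
        scores = np.array([evaluate(c) for c in candidates])
        i = int(np.argmax(scores))
        if scores[i] - best > threshold:   # accept only significant improvements
            theta, best = candidates[i], scores[i]
    return theta, best

# The search depends on initialization, so restart it 50 times from parameter
# vectors sampled uniformly at random and keep the best result.
rng = np.random.default_rng(0)
results = [policy_search(rng.uniform(size=6), rng=rng) for _ in range(50)]
best_theta, best_score = max(results, key=lambda r: r[1])
print(best_score)
```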