End-to-End Learning for Stochastic Optimization: A Bayesian Perspective

Authors: Yves Rychener, Daniel Kuhn, Tobias Sutter

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical results for a synthetic newsvendor problem illustrate the key differences between alternative training schemes. We also investigate an economic dispatch problem based on real data to showcase the impact of the neural network architecture of the decision maps on their test performance.
Researcher Affiliation | Academia | (1) Risk Analytics and Optimization Chair, École Polytechnique Fédérale de Lausanne, Switzerland; (2) Department of Computer and Information Science, University of Konstanz, Germany.
Pseudocode | Yes | Algorithm 1 (End-to-End Learning): for k = 1, ..., K do: g_k ← ∇_w ℓ(Y_k, m_w(X_k)) |_{w = w_{k-1}}; w_k ← w_{k-1} - η_k g_k; end for. (A runnable sketch of this loop follows the table.)
Open Source Code | Yes | Implementation details are given in Appendix C, and the code underlying all experiments is provided on GitHub.
Open Datasets | Yes | We use historical wind power production and weather records as samples from P(X,Y). ... Dataset: https://www.kaggle.com/datasets/theforcecoder/wind-power-forecasting
Dataset Splits | No | The dataset covers the period from 1 January 2018 to 30 March 2020. After removing corrupted samples, the period from 1 January 2018 to 31 December 2019 comprises 59,532 records, which we use as the training set. The remaining records are used for testing. No explicit mention of a validation set is found. (A chronological-split sketch follows the table.)
Hardware Specification | No | The paper does not provide specific details on the hardware used for running the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper mentions using the Adam optimizer (Kingma & Ba, 2015) and neural network architectures, but it does not specify version numbers for any software dependencies such as Python, machine learning frameworks (e.g., PyTorch, TensorFlow), or other libraries.
Experiment Setup | Yes | We solve the resulting instance of (5) using Algorithm 1 with K = 5 × 10^6 training samples. The Batch-SGD algorithm runs over 50,000 iterations with 100 samples per batch to reduce the variance of the gradient updates. ... The neural network-based predictions μ̂_NN are compared against the sample mean μ̂_ERM and the posterior mean μ̂_MMSE. ... (CAL): The CAL architecture consists of a feature extractor that maps the observation X to a 6-dimensional feature R and a prescriptor that maps R into the feasible set A. The feature extractor involves one hidden layer with 64 neurons and ReLU activation functions and an output layer with 6 neurons and Sigmoid activation functions, which determine the output of each generator as a percentage of its capacity. (A code sketch of this setup follows the table.)
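
The pseudocode row condenses Algorithm 1, which is plain stochastic gradient descent over i.i.d. training pairs. Below is a minimal PyTorch sketch of that loop, assuming the decision map `m_w`, the loss `ell`, the sampler `sample_pair`, and the step-size schedule `eta` as placeholders; these names are illustrative and not taken from the released code.

```python
import torch

def end_to_end_sgd(m_w, ell, sample_pair, K, eta):
    """Sketch of Algorithm 1: plain SGD for end-to-end learning.

    m_w         -- torch.nn.Module implementing the decision map m_w(X)
    ell         -- loss ell(Y, a) comparing the outcome Y with the decision a
    sample_pair -- callable returning one fresh training pair (X_k, Y_k)
    K           -- number of SGD iterations (one sample per iteration)
    eta         -- callable k -> step size eta_k
    """
    for k in range(1, K + 1):
        X_k, Y_k = sample_pair()
        loss = ell(Y_k, m_w(X_k))          # evaluate ell(Y_k, m_w(X_k)) at w = w_{k-1}
        m_w.zero_grad()
        loss.backward()                    # g_k = grad_w ell(Y_k, m_w(X_k))
        with torch.no_grad():
            for p in m_w.parameters():
                p -= eta(k) * p.grad       # w_k = w_{k-1} - eta_k * g_k
    return m_w
```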
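The dataset-splits row reports a purely chronological train/test split (training on 1 January 2018 through 31 December 2019, testing on the remaining records) with no validation set. A minimal pandas sketch of such a split is shown below; the CSV filename and the `Timestamp` column name are assumptions and may not match the Kaggle dataset's actual schema.

```python
import pandas as pd

# Load the Kaggle wind-power records; file name and timestamp column are assumed.
df = pd.read_csv("wind_power_forecasting.csv", parse_dates=["Timestamp"])
df = df.dropna()                             # drop corrupted/missing records

cutoff = pd.Timestamp("2020-01-01")          # train: 1 Jan 2018 through 31 Dec 2019
train = df[df["Timestamp"] < cutoff]
test = df[df["Timestamp"] >= cutoff]         # remaining records used for testing
```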
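The experiment-setup row describes both the Batch-SGD settings (50,000 iterations, 100 samples per batch, 5 × 10^6 samples in total) and the CAL architecture for the economic dispatch experiment. The sketch below combines the two for illustration only: `obs_dim`, the `capacities` vector, and the Adam learning rate are assumptions, and the prescriptor here merely rescales the sigmoid outputs by the generator capacities rather than mapping into the full feasible set A described in the paper.

```python
import torch
import torch.nn as nn

class CALDecisionMap(nn.Module):
    """Sketch of the CAL architecture from the experiment-setup row.

    The feature extractor maps the observation X to a 6-dimensional feature R
    (one hidden layer with 64 ReLU units; 6 sigmoid outputs interpreted as the
    fraction of each generator's capacity).  The simplified prescriptor below
    only rescales by the capacities; the paper's prescriptor maps R into the
    feasible set A of the economic dispatch problem, whose constraints are not
    reproduced here.
    """

    def __init__(self, obs_dim, capacities):
        super().__init__()
        self.feature_extractor = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, 6), nn.Sigmoid(),
        )
        # capacities: tensor of shape (6,) holding each generator's maximum output
        self.register_buffer("capacities", capacities)

    def forward(self, x):
        r = self.feature_extractor(x)        # feature R in [0, 1]^6
        return r * self.capacities           # dispatch as a fraction of capacity


# Training loop matching the reported Batch-SGD setup: 50,000 iterations with
# batches of 100 fresh samples (5e6 samples total); the Adam step size is an
# assumption, as the row does not state it.
def train(model, sample_batch, ell, iterations=50_000, batch_size=100, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(iterations):
        X, Y = sample_batch(batch_size)      # fresh i.i.d. batch per iteration
        loss = ell(Y, model(X)).mean()       # batch averaging reduces gradient variance
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model
```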