A Surrogate Objective Framework for Prediction+Programming with Soft Constraints

Authors: Kai Yan, Jie Yan, Chuan Luo, Liting Chen, Qingwei Lin, Dongmei Zhang

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our method in three applications extended with soft constraints: synthetic linear programming, portfolio optimization, and resource provisioning, demonstrating that our method outperforms traditional two-staged methods and other decision-focused approaches.
Researcher Affiliation | Collaboration | Kai Yan, Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, kaiyan3@illinois.edu; Jie Yan, Microsoft Research, Beijing, China, jiey@microsoft.com
Pseudocode | Yes | The sketch of our algorithm is outlined in Appendix C.
Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes]
Open Datasets | Yes | The prediction dataset is daily price data of the S&P 500 from 2004 to 2017, downloaded via the Quandl API [24] with the same settings as in [3]. (An illustrative download sketch appears after this table.)
Dataset Splits | No | We train each method for 40 epochs, and early stop when validation performance degrades for 4 consecutive epochs. This implies the use of a validation set, but the paper does not state its size or percentage, or how it was drawn from the full dataset, apart from mentioning the test set for the ERCOT dataset.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. It only discusses software and training parameters.
Software Dependencies | No | The paper mentions software such as PyTorch [17], Gurobi [6], CPLEX [7], AdaGrad [23], and ReLU [22], but does not give version numbers for these dependencies, so the software setup is not fully reproducible down to specific versions.
Experiment Setup | Yes | All five methods use the same prediction model: a fully connected neural network with two hidden layers of 128 neurons each and ReLU [22] activations. We use AdaGrad [23] as the optimizer, with learning rate 0.01 and gradients clipped at 1e-4. We train each method for 40 epochs, and early stop when validation performance degrades for 4 consecutive epochs. (A minimal training sketch appears after this table.)
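
For the Open Datasets row, the following is a hedged sketch of fetching daily price data through the Quandl API over the paper's 2004-2017 window. The dataset code "WIKI/AAPL", the column name, and the API-key handling are illustrative assumptions; the paper only states that it uses the Quandl API with the same settings as [3], not these exact calls.

```python
# Illustrative Quandl fetch; requires the `quandl` package and a valid API key.
import quandl

quandl.ApiConfig.api_key = "YOUR_API_KEY"  # placeholder, not from the paper

# Example: daily prices for one constituent over 2004-2017.
# "WIKI/AAPL" and the "Adj. Close" column are assumptions for illustration.
prices = quandl.get("WIKI/AAPL", start_date="2004-01-01", end_date="2017-12-31")
print(prices[["Adj. Close"]].head())
```

For the Experiment Setup row, the following is a minimal PyTorch sketch of the described configuration: a two-hidden-layer fully connected network with 128 ReLU units per layer, AdaGrad with learning rate 0.01, gradient clipping at 1e-4, 40 training epochs, and early stopping after 4 consecutive epochs of degrading validation performance. Names such as `train_loader`, `val_loader`, `n_features`, and `n_outputs` are placeholders, and whether the clipping is by norm or by value is an assumption; this is not the authors' released code.

```python
import copy
import torch
import torch.nn as nn

def build_model(n_features: int, n_outputs: int) -> nn.Module:
    # Two hidden layers with 128 neurons each and ReLU activations.
    return nn.Sequential(
        nn.Linear(n_features, 128), nn.ReLU(),
        nn.Linear(128, 128), nn.ReLU(),
        nn.Linear(128, n_outputs),
    )

def train(model, train_loader, val_loader, loss_fn, epochs=40, patience=4):
    optimizer = torch.optim.Adagrad(model.parameters(), lr=0.01)
    best_val, best_state, bad_epochs = float("inf"), None, 0
    for epoch in range(epochs):
        model.train()
        for x, y in train_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            # Gradient clipping at 1e-4 (norm clipping assumed here).
            torch.nn.utils.clip_grad_norm_(model.parameters(), 1e-4)
            optimizer.step()
        model.eval()
        with torch.no_grad():
            val_loss = sum(loss_fn(model(x), y).item() for x, y in val_loader)
        if val_loss < best_val:
            best_val, best_state, bad_epochs = val_loss, copy.deepcopy(model.state_dict()), 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:  # early stop after 4 consecutive degradations
                break
    if best_state is not None:
        model.load_state_dict(best_state)  # restore the best validation checkpoint
    return model
```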
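A usage note on the sketch above: `loss_fn` would be whatever prediction loss the two-stage baseline uses (e.g., mean squared error via `nn.MSELoss()`), while the decision-focused variants in the paper replace it with their surrogate objective; the training loop structure (40 epochs, patience of 4) is the same in either case.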