Decision-Focused Learning with Directional Gradients

Authors: Michael Huang, Vishal Gupta

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide numerical evidence showing that minimizing our surrogate loss performs comparably to other surrogates when the hypothesis class is well-specified, and substantively outperforms them when the hypothesis class is misspecified.
Researcher Affiliation | Academia | Vishal Gupta, USC Marshall School of Business, Los Angeles, CA 90029, guptavis@usc.edu; Michael Huang, CUNY Baruch, Zicklin School of Business, New York, NY 10010, michael.huang@baruch.cuny.edu
Pseudocode | No | The paper describes methods in text and mathematical formulas but does not provide structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our supplemental materials provide Python code that leverages the (public) package PyEPO (https://github.com/khalil-research/PyEPO). With it, one can generate the data used in our experiments and run both our algorithm and each of the benchmarks. All experiments are also described in detail in the main body (Section 4), with some implementation-specific details relegated to the appendix.
Open Datasets | Yes | We generate synthetic data as Y = f(X) + ε_α. We define ε_α = α(ζ − 0.5) + √(1 − α²)·γ, where α ∈ [0, 1], ζ is an exponential random variable with mean 0.5, and γ ∼ N(0, 0.25). By construction ε_α is mean-zero noise with variance 0.25. The value of α controls how asymmetric the noise is. Note, when α ≠ 0, the theoretical performance guarantees on SPO+ from [19] do not apply.
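As a sanity check on this construction, a minimal NumPy sketch of the noise model (the function name, sample size, and seed are ours, not the paper's; it uses the √(1 − α²) weighting above, which is the scaling consistent with the stated variance of 0.25):

```python
import numpy as np

def sample_noise(alpha, n, rng):
    """Draw n samples of eps_alpha = alpha*(zeta - 0.5) + sqrt(1 - alpha**2)*gamma,
    with zeta ~ Exponential(mean 0.5) and gamma ~ N(0, 0.25)."""
    zeta = rng.exponential(scale=0.5, size=n)       # mean 0.5, variance 0.25
    gamma = rng.normal(loc=0.0, scale=0.5, size=n)  # N(0, 0.25): std dev 0.5
    return alpha * (zeta - 0.5) + np.sqrt(1.0 - alpha**2) * gamma

rng = np.random.default_rng(0)
for alpha in (0.0, 0.5, 1.0):
    eps = sample_noise(alpha, n=1_000_000, rng=rng)
    print(f"alpha={alpha}: mean={eps.mean():+.4f}, var={eps.var():.4f}")
# Every alpha yields mean ~0 and variance ~0.25; only the skew changes.
# alpha = 0 gives symmetric Gaussian noise; alpha = 1 gives skewed exponential noise.
```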
Dataset Splits | Yes | All methods are trained for a total of 100 epochs, and we select the best model found in those 100 epochs based on a validation set of size 200. For PG losses, we initialized at the SPO+ solution and chose h from a small grid of values based on validation set performance.
Hardware Specification | Yes | A significant portion of the experiments in the paper (those that did not require multiple Monte Carlo runs) were run on a MacBook Pro with an Apple M3 Max chip and 96 GB of memory.
Software Dependencies | Yes | For our numerical experiments we leverage the PyEPO framework, which was developed using PyTorch.
Experiment Setup | Yes | We optimize each surrogate using Adam via the PyEPO framework. All methods are trained for a total of 100 epochs, and we select the best model found in those 100 epochs based on a validation set of size 200. For PG losses, we initialized at the SPO+ solution and chose h from a small grid of values based on validation set performance. Future computational experiments might explore the effect of alternate initializations. We do not add additional regularization or smoothing to any of the surrogates. See Appendix C for other implementation details.
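This selection protocol can be summarized in a short, hedged PyTorch sketch. The loss constructors, data loader, validation metric, SPO+ warm start, and the h grid below are all illustrative placeholders: the paper's actual runs go through PyEPO, and the exact grid of h values is not reported.

```python
import copy
import torch

def train_surrogate(model, loss_fn, train_loader, val_decision_loss, epochs=100):
    """Run Adam for `epochs` epochs and keep the model with the best
    validation decision loss seen in any epoch, as in the paper's setup."""
    opt = torch.optim.Adam(model.parameters())
    best_val, best_state = float("inf"), copy.deepcopy(model.state_dict())
    for _ in range(epochs):
        model.train()
        for x, c in train_loader:        # features x, true cost vectors c
            opt.zero_grad()
            loss_fn(model(x), c).backward()
            opt.step()
        val = val_decision_loss(model)   # evaluated on the 200 validation points
        if val < best_val:
            best_val, best_state = val, copy.deepcopy(model.state_dict())
    model.load_state_dict(best_state)
    return model, best_val

def select_h(make_pg_loss, spo_plus_state, model, train_loader, val_decision_loss,
             h_grid=(0.1, 0.5, 1.0)):
    """For PG losses: warm-start at a pre-trained SPO+ solution and pick h
    from a small grid by validation performance (grid values are ours)."""
    best = (float("inf"), None, None)
    for h in h_grid:
        model.load_state_dict(copy.deepcopy(spo_plus_state))  # SPO+ warm start
        trained, val = train_surrogate(model, make_pg_loss(h),
                                       train_loader, val_decision_loss)
        if val < best[0]:
            best = (val, h, copy.deepcopy(trained.state_dict()))
    return best
```

Consistent with the quoted setup, no extra regularization or smoothing is applied inside the loop; the only model selection is the per-epoch validation checkpoint and, for PG losses, the outer grid over h.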