Generalization Bounds in the Predict-then-Optimize Framework

Authors: Othman El Balghiti, Adam N. Elmachtoub, Paul Grigas, Ambuj Tewari

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this work, we provide an assortment of generalization bounds for the SPO loss function. In particular, we derive bounds based on the Natarajan dimension that, in the case of a polyhedral feasible region, scale at most logarithmically in the number of extreme points, but, in the case of a general convex set, have poor dependence on the dimension. By exploiting the structure of the SPO loss function and an additional strong convexity assumption on the feasible region, we can dramatically improve the dependence on the dimension via an analysis and corresponding bounds that are akin to the margin guarantees in classification problems. (For a concrete illustration of the SPO loss these bounds control, see the sketch after this table.)
Researcher Affiliation | Collaboration | Othman El Balghiti (Rayens Capital, Chicago, IL 60606; oe2161@columbia.edu); Adam N. Elmachtoub (Columbia University, New York, NY 10027; adam@ieor.columbia.edu); Paul Grigas (University of California, Berkeley, Berkeley, CA 94720; pgrigas@berkeley.edu); Ambuj Tewari (University of Michigan, Ann Arbor, MI 48109; tewaria@umich.edu)
Pseudocode | No | The paper contains mathematical derivations, theorems, and proofs, but no pseudocode or algorithm blocks.
Open Source Code | No | The paper does not mention releasing open-source code for the described methodology.
Open Datasets | No | The paper is theoretical and conducts no experiments involving training data or datasets.
Dataset Splits | No | The paper is theoretical and involves no dataset splits for training, validation, or testing.
Hardware Specification | No | The paper is theoretical and reports no experiments that would require specific hardware.
Software Dependencies | No | The paper is theoretical and reports no experiments that would require versioned software dependencies.
Experiment Setup | No | The paper is theoretical and does not discuss experimental setups, hyperparameters, or training configurations.
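For readers unfamiliar with the quantity these bounds control: in the predict-then-optimize framework, a model predicts the cost vector c of a downstream problem min_{w in S} c^T w, and the SPO (Smart Predict-then-Optimize) loss charges the excess true cost incurred by acting on the prediction, l_SPO(c_hat, c) = c^T w*(c_hat) - c^T w*(c), where w*(c) denotes an optimal decision under cost c. The sketch below is a minimal, illustrative implementation under the assumption of a polyhedral feasible region S = {w : A w <= b}, using scipy.optimize.linprog; the function and variable names are ours, not the paper's, and ties among optimal decisions are broken arbitrarily by the solver rather than by any worst-case convention from the paper.

```python
import numpy as np
from scipy.optimize import linprog


def optimal_decision(cost, A_ub, b_ub):
    """Solve the downstream LP: min_w cost^T w  s.t.  A_ub @ w <= b_ub.

    Illustrative helper: the paper's setting is min_{w in S} c^T w over a
    compact feasible region S, specialized here to a polytope.
    """
    res = linprog(c=cost, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * len(cost))
    assert res.success, res.message
    return res.x


def spo_loss(c_pred, c_true, A_ub, b_ub):
    """SPO loss: true cost of the decision induced by the prediction,
    minus the best achievable true cost:
        l_SPO(c_pred, c_true) = c_true^T w*(c_pred) - c_true^T w*(c_true).
    Nonnegative by construction, since w*(c_true) is optimal for c_true.
    """
    w_pred = optimal_decision(c_pred, A_ub, b_ub)
    w_star = optimal_decision(c_true, A_ub, b_ub)
    return float(c_true @ w_pred - c_true @ w_star)


# Toy example: the unit box [0, 1]^2 as the polyhedral feasible region,
# encoded as w <= 1 and -w <= 0.
A_ub = np.vstack([np.eye(2), -np.eye(2)])
b_ub = np.concatenate([np.ones(2), np.zeros(2)])
c_true = np.array([1.0, -1.0])   # true costs
c_pred = np.array([-1.0, -1.0])  # prediction with the wrong sign on w_1
print(spo_loss(c_pred, c_true, A_ub, b_ub))
```

Running the toy example prints an SPO loss of 1.0: the mispredicted sign on the first coordinate drives the decision to w = (1, 1), whose true cost is 0, while the best achievable true cost is -1. The paper's generalization bounds control how far the empirical average of this loss over a sample can deviate from its expectation, uniformly over a hypothesis class of cost predictors.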