SurCo: Learning Linear SURrogates for COmbinatorial Nonlinear Optimization Problems

Authors: Aaron M Ferber, Taoan Huang, Daochen Zha, Martin Schubert, Benoit Steiner, Bistra Dilkina, Yuandong Tian

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate SurCo in three settings: embedding table sharding (Zha et al., 2022a), photonic inverse design (Schubert et al., 2022), and nonlinear route planning (Fan et al., 2005). In the on-the-fly setting, SurCo-zero achieves higher-quality solutions in comparable or less runtime, thanks to the help of an efficient combinatorial solver. In SurCo-prior, our method obtains better solutions on held-out problems compared to other methods that require training (e.g., reinforcement learning).
Researcher Affiliation | Collaboration | Aaron Ferber (1), Taoan Huang (1), Daochen Zha (2), Martin Schubert (3), Benoit Steiner (4), Bistra Dilkina (1), Yuandong Tian (3) ... (1) Center for AI in Society, University of Southern California; (2) Rice University; (3) Meta AI, FAIR; (4) Anthropic.
Pseudocode | Yes | C. Pseudocode. Here is the pseudocode for the different variants of our algorithm. Each of these leverages a differentiable optimization solver to differentiate through the surrogate optimization problem. Algorithm 1: SurCo-zero. Algorithm 2: SurCo-prior Training. Algorithm 3: SurCo-prior Deployment. Algorithm 4: SurCo-hybrid.
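The core idea shared by these variants (learning linear surrogate costs by differentiating through the surrogate solver) can be illustrated on a toy problem. The sketch below is not the paper's implementation: the objective, weights, and the tempered-softmax relaxation of the top-k surrogate solver are all illustrative stand-ins for the exact differentiable solvers the authors use.

```python
import math

# Toy nonlinear objective over a relaxed binary selection x in [0, 1]^n:
# pick K = 2 of 4 items so the selected weights sum to a target.
# W, TARGET, K, and TAU are illustrative, not from the paper.
W, TARGET, K, TAU = [1.0, 2.0, 3.0, 4.0], 5.0, 2, 0.5

def f(x):
    """Nonlinear cost of a (possibly soft) solution."""
    return (sum(w * xi for w, xi in zip(W, x)) - TARGET) ** 2

def grad_f(x):
    """Gradient df/dx_i of the toy objective."""
    r = sum(w * xi for w, xi in zip(W, x)) - TARGET
    return [2.0 * r * w for w in W]

def soft_solver(c):
    """Differentiable stand-in for the linear surrogate solver
    min_x c.x s.t. sum(x) = K: a tempered softmax over -c, scaled to K."""
    e = [math.exp(-ci / TAU) for ci in c]
    z = sum(e)
    return [K * ei / z for ei in e]

def surco_zero(steps=500, lr=0.02):
    """Gradient-descend on surrogate costs c so the surrogate solver's
    solution minimizes the nonlinear cost f (SurCo-zero-style loop)."""
    c = [0.1 * (i + 1) for i in range(len(W))]  # fixed init, breaks symmetry
    for _ in range(steps):
        x = soft_solver(c)
        p = [xi / K for xi in x]          # underlying softmax probabilities
        g = grad_f(x)
        gp = sum(gi * pi for gi, pi in zip(g, p))
        # Chain rule through the softmax relaxation: df/dc_j = (K p_j / tau)(gp - g_j)
        c = [cj - lr * (K * pj / TAU) * (gp - gj)
             for cj, pj, gj in zip(c, p, g)]
    # Hard combinatorial solve of the learned surrogate: K smallest costs.
    hard = sorted(sorted(range(len(W)), key=lambda j: c[j])[:K])
    return c, hard
```

With this crude softmax relaxation the hardened solution can still be suboptimal for f; in the paper the relaxation is replaced by an exact differentiable optimization layer over the true combinatorial feasible region, which is what makes the surrogate's hard solutions meaningful.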
Open Source Code | No | The paper mentions a project page at "https://sites.google.com/usc.edu/surco/" and provides a link to Dreamshard's code (a baseline) at "https://github.com/daochenzha/dreamshard". However, it does not explicitly state that the source code for the SurCo method described in this paper is openly available at a specific repository or in supplementary materials.
Open Datasets | Yes | We evaluate SurCo on the public Deep Learning Recommendation Model (DLRM) dataset (Naumov et al., 2019). ... Researchers also develop the Ceviche Challenges (Schubert et al., 2022), a standard benchmark of inverse photonic design problems.
Dataset Splits | No | The paper specifies training and test splits for the datasets (e.g., "50 train, 50 test" for DLRM and "25 training/25 test" for photonic design) but does not explicitly mention a separate validation split for hyperparameter tuning or early stopping.
Hardware Specification | Yes | Experiments are performed on a cluster of identical machines, each with 4 Nvidia A100 GPUs and 32 CPU cores, with 1 TB of RAM and 40 GB of GPU memory.
Software Dependencies | No | The paper mentions several software components, such as Python, PyTorch, CVXPYLayers, SCIP, the Adam optimizer, JAX, and PyGAD, and cites their original papers/sources. However, it does not provide specific version numbers for these libraries or for the programming language itself (e.g., "Python 3.x" or "PyTorch 1.x").
Experiment Setup | Yes | For embedding table placement, the nonlinear cost estimator is trained for 200 iterations, and the offline-trained models of Dreamshard and SurCo-prior are trained against the pretrained cost estimator for 200 iterations. ... The architecture is trained with the Adam optimizer with learning rate 0.0005. ... The architecture is trained with the Adam optimizer with learning rate 0.001. ... Impromptu solvers stop when they either find a valid solution or reach a maximum of 200 steps.