Dual Lagrangian Learning for Conic Optimization
Authors: Mathieu Tanneau, Pascal Van Hentenryck
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This paper presents Dual Lagrangian Learning (DLL), a principled learning methodology for dual conic optimization proxies. [...] The effectiveness of DLL is demonstrated on linear and nonlinear conic optimization problems. The proposed methodology significantly outperforms a state-of-the-art learning-based method, and achieves 1000x speedups over commercial interior-point solvers with optimality gaps under 0.5% on average. [...] Section 5 reports numerical results. |
| Researcher Affiliation | Academia | Mathieu Tanneau, Pascal Van Hentenryck, H. Milton Stewart School of Industrial and Systems Engineering, NSF AI Institute for Advances in Optimization, Georgia Institute of Technology, {mathieu.tanneau,pascal.vanhentenryck}@isye.gatech.edu |
| Pseudocode | No | No explicit pseudocode or algorithm block found. |
| Open Source Code | Yes | The code used for experiments is available under an open-source license: https://github.com/AI4OPT/DualLagrangianLearning |
| Open Datasets | Yes | For each number of items n ∈ {100, 200, 500} and number of resources m ∈ {5, 10, 30}, a total of 16384 instances are generated using the same procedure as the MIPLearn library [SXQG+23]. [...] This dataset is split in training, validation and testing sets, which contain 8192, 4096 and 4096 instances, respectively. (An illustrative split sketch appears after the table.) |
| Dataset Splits | Yes | This dataset is split in training, validation and testing sets, which contain 8192, 4096 and 4096 instances, respectively. |
| Hardware Specification | Yes | All experiments are conducted on the Phoenix cluster [PAC17] with Intel Xeon Gold 6226@2.70GHz + Tesla V100 GPU nodes; each job was allocated 1 GPU, 12 CPU cores and 64GB of RAM. |
| Software Dependencies | Yes | All ML models are formulated and trained using Flux [ISF+18]; unless specified otherwise, all (sub)gradients are computed using the auto-differentiation backend Zygote [Inn18]. All linear problems are solved with Gurobi v10 [GO18]. All nonlinear conic problems are solved with Mosek [MOS23b]. |
| Experiment Setup | Yes | All ML models are trained in a self-supervised fashion following the training scheme outlined in Section 4.3, and training is performed using the Adam optimizer [KB15]. The training scheme uses a patience mechanism where the learning rate η is decreased by a factor of 2 if the validation loss does not improve for more than N_p epochs. The initial learning rate is η = 10⁻⁴. Training is stopped if either the learning rate reaches η_min = 10⁻⁷, or a maximum of N_max epochs is reached. [...] For the output layer, a negated softplus activation ensures y ≤ 0. The dual completion procedure follows Example (1). Hyperparameters: the patience parameter is N_p = 32, and the maximum number of training epochs is N_max = 1024. (A sketch of this training scheme appears after the table.) |
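
The Open Datasets row above describes 16384 instances per problem size, split 8192 / 4096 / 4096 into training, validation, and test sets. The sketch below shows only that split logic in Julia; the instance generator is a random placeholder standing in for the MIPLearn procedure cited in the paper, and the function name `make_splits` is hypothetical.

```julia
# Illustrative sketch of the dataset layout: 16384 placeholder instances,
# split 8192 / 4096 / 4096 into train / validation / test.
# The instances generated here are random stand-ins, NOT the MIPLearn procedure.
using Random

function make_splits(; n_items = 100, m_resources = 5, n_total = 16_384, seed = 42)
    rng = MersenneTwister(seed)
    # Placeholder multi-knapsack-style data: random weights and capacities.
    instances = [(w = rand(rng, Float32, m_resources, n_items),
                  b = rand(rng, Float32, m_resources)) for _ in 1:n_total]
    perm = randperm(rng, n_total)
    train = instances[perm[1:8192]]
    val   = instances[perm[8193:12288]]
    test  = instances[perm[12289:16384]]
    return train, val, test
end

train, val, test = make_splits(n_items = 100, m_resources = 5)
@assert length(train) == 8192 && length(val) == 4096 && length(test) == 4096
```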
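
The Experiment Setup row quotes the paper's training scheme: self-supervised training with Adam, a negated softplus output layer enforcing y ≤ 0, and a patience-based schedule that halves the learning rate when the validation loss stalls for N_p = 32 epochs, stopping at η_min = 10⁻⁷ or N_max = 1024 epochs. The Flux sketch below is a minimal illustration of that scheme only; the MLP sizes, the stand-in dual objective `dual_bound`, and the random features are assumptions for illustration, not the authors' DLL architecture, loss, or dual completion step.

```julia
# Minimal Flux sketch of the training scheme described above (illustrative only).
using Flux
using Statistics: mean

n_feat, n_dual = 16, 5
X_train = randn(Float32, n_feat, 256)   # placeholder instance features
X_val   = randn(Float32, n_feat, 128)

# Dual proxy: the output layer uses a negated softplus so predictions satisfy y <= 0.
model = Chain(
    Dense(n_feat => 64, relu),
    Dense(64 => n_dual, x -> -softplus(x)),
)

# Self-supervised loss: DLL maximizes the dual bound obtained after dual completion;
# sum(y) is used here as a stand-in for that bound, not the paper's objective.
dual_bound(Y) = sum(Y; dims = 1)
loss(m, X) = -mean(dual_bound(m(X)))

# Adam training with the patience schedule: halve the learning rate when the
# validation loss stalls for N_p epochs; stop at eta_min or after N_max epochs.
function train!(model, X_train, X_val; eta = 1f-4, eta_min = 1f-7, N_p = 32, N_max = 1024)
    opt_state = Flux.setup(Adam(eta), model)
    best_val, stall = Inf32, 0
    for _ in 1:N_max
        grads = Flux.gradient(m -> loss(m, X_train), model)   # Zygote under the hood
        Flux.update!(opt_state, model, grads[1])
        val = loss(model, X_val)
        if val < best_val
            best_val, stall = val, 0
        else
            stall += 1
            if stall > N_p
                eta /= 2                     # patience exhausted: halve the learning rate
                eta < eta_min && break       # stop once the learning rate reaches eta_min
                Flux.adjust!(opt_state, eta)
                stall = 0
            end
        end
    end
    return model
end

train!(model, X_train, X_val)
```

The explicit-state Flux API (`Flux.setup`, `Flux.update!`, `Flux.adjust!`) is used so the learning-rate adjustment in the patience branch is a single call; replacing `dual_bound` with the actual completed dual objective would turn this skeleton into the paper's self-supervised setup.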