Predicting Lagrangian Multipliers for Mixed Integer Linear Programs

Authors: Francesco Demelas, Joseph Le Roux, Mathieu Lacroix, Axel Parmentier

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on two widely known problems, Multi-Commodity Network Design and Generalized Assignment, show that our approach closes up to 85% of the gap between the continuous relaxation and the best Lagrangian bound, and provides a high-quality warm-start for descent-based Lagrangian methods. (A formalization of this gap-closure metric is sketched after this table.)
Researcher Affiliation | Academia | (1) Laboratoire d'Informatique de Paris-Nord, Université Sorbonne Paris Nord, CNRS, France; (2) CERMICS, École des Ponts, France. Correspondence to: Francesco Demelas <demelas@lipn.fr>, Joseph Le Roux <leroux@lipn.fr>, Mathieu Lacroix <lacroix@lipn.fr>.
Pseudocode | No | The paper includes architectural diagrams (Figure 1, Figure 2) and describes algorithms in prose, but it does not contain clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code in Julia at https://github.com/FDemelas/Learning_Lagrangian_Multipliers.jl
Open Datasets | Yes | The Canad dataset is the standard and well-established dataset of instances for evaluating MC solvers (Crainic et al., 2001), and instances e10100 and e20400 are GA instances generated by Yagiura et al. (1999), available at http://www.al.cm.is.nagoya-u.ac.jp/~yagiura/gap/.
Dataset Splits | Yes | We divide each dataset of 2000 instances in train (80%), validation (10%) and test (10%). (A Julia sketch of such a split follows this table.)
Hardware Specification | Yes | For training on the datasets MC-SML-40, MC-BIG-40, GA-10-100 and GA-20-400 we use Nvidia Quadro RTX 5000 GPUs with 16 GB of RAM. To train on the datasets MC-SML-VAR and MC-BIG-VAR we use Nvidia A40 GPU accelerators with 48 GB of RAM. To test performance we use Nvidia A40 GPU accelerators with 48 GB of RAM for all models and all datasets on validation and test. Warm-starting the proximal Bundle in SMS++ needs only a CPU; these experiments are run on an Intel Core i7-8565U CPU @ 1.80GHz × 8.
Software Dependencies | No | The paper mentions 'Julia', 'RAdam', 'CPLEX' (with a link to its product page), 'SMS++' and 'NearestNeighbors.jl', but does not provide specific version numbers for these software components.
Experiment Setup | Yes | Model Architecture For all datasets, the MLP F from initial features to a high-dimensional representation is implemented as a linear transformation (8 to 250) followed by a non-linear activation. Then, we consider a linear transformation to the size of the internal representation of nodes for the GNN. For MC we use 5 blocks, while for GA we use only 3. ... The hidden layer of the MLP in the second sub-layer of each block has a size of 1000. The decoder is an MLP with one hidden layer of 250 nodes. All non-linear activations are implemented as ReLU; only the one for the output of the GA is a softplus. The dropout rate is set to 0.25. Optimiser Specifications We use RAdam as optimizer, with learning rate 0.0001 for MC and 0.00001 for GA, a ClipNorm (to 5), exponential decay 0.9 with step size 100000, and a minimum learning rate of 10^-10. (A Flux.jl sketch assembling these settings follows this table.)
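
The 85% figure quoted in the Research Type row refers to the fraction of the relaxation gap closed by the bound obtained from the predicted multipliers. The excerpt does not spell out the formula; a formalization consistent with the quoted claim (the symbols below are illustrative, not taken from the paper) is:

    % Illustrative definition, not quoted from the paper:
    % fraction of the gap between the continuous relaxation and the best
    % Lagrangian bound that is closed by the predicted multipliers.
    \[
      \mathrm{GapClosed}(\hat{\pi}) \;=\;
        \frac{\theta(\hat{\pi}) - z_{\mathrm{LP}}}{z_{\mathrm{LD}} - z_{\mathrm{LP}}}
    \]

where z_LP denotes the continuous-relaxation bound, z_LD the best Lagrangian (dual) bound, and θ(π̂) the Lagrangian bound evaluated at the predicted multipliers π̂; "closes up to 85% of the gap" means this ratio reaches 0.85.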
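
The Dataset Splits row reports an 80/10/10 split of 2000 instances. Below is a minimal Julia sketch of such a split; the function name, the seeding and the use of the standard Random library are illustrative assumptions, not taken from the released code.

    using Random

    # Illustrative split of n instance indices into 80% train / 10% validation / 10% test.
    function split_dataset(n::Int; train=0.8, val=0.1, seed=0)
        idx  = shuffle(MersenneTwister(seed), 1:n)   # random permutation of 1..n
        ntr  = round(Int, train * n)
        nval = round(Int, val * n)
        return idx[1:ntr], idx[ntr+1:ntr+nval], idx[ntr+nval+1:end]
    end

    train_idx, val_idx, test_idx = split_dataset(2000)   # 1600 / 200 / 200 instances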
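
The Experiment Setup row lists the architecture and optimiser hyper-parameters. The sketch below assembles them in Flux.jl purely for illustration: the framework choice, the variable names, the node-embedding size h and the decoder output size are assumptions, and the authors' released Julia code is authoritative.

    using Flux

    h = 250                       # assumed GNN node-embedding size (not stated in the excerpt)

    # Encoder F: 8 initial features -> 250 with ReLU, then a linear map to the GNN size.
    F = Chain(Dense(8 => 250, relu), Dense(250 => h))

    # MLP of the second sub-layer of each GNN block: hidden layer of 1000 units, dropout 0.25.
    block_mlp = Chain(Dense(h => 1000, relu), Dropout(0.25), Dense(1000 => h))

    # Decoder: one hidden layer of 250 nodes; the GA output activation is softplus
    # (the MC decoder would end with a plain linear layer instead).
    decoder_ga = Chain(Dense(h => 250, relu), Dense(250 => 1, softplus))

    # Optimiser for MC: RAdam with learning rate 1e-4 and gradient-norm clipping at 5
    # (the quoted learning rate for GA is 1e-5).
    opt = Flux.Optimiser(ClipNorm(5), RADAM(1e-4))

    # The exponential decay (factor 0.9 every 100_000 steps, floored at 1e-10) can be
    # applied manually in the training loop, e.g. every 100_000 steps:
    #     opt.os[2].eta = max(0.9 * opt.os[2].eta, 1e-10)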