Neur2BiLO: Neural Bilevel Optimization
Authors: Justin Dumouchelle, Esther Julien, Jannis Kurtz, Elias Khalil
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through a series of experiments on (i) the bilevel knapsack interdiction problem, (ii) the critical node problem from network security, (iii) a donor-recipient healthcare problem, and (iv) the DNDP, we will show that NEUR2BILO is easy to train and produces, very quickly, heuristic solutions that are competitive with state-of-the-art methods. |
| Researcher Affiliation | Academia | Justin Dumouchelle (University of Toronto), Esther Julien (TU Delft), Jannis Kurtz (University of Amsterdam), Elias B. Khalil (University of Toronto) |
| Pseudocode | Yes | "Algorithm 1 NEUR2BILO Data Collection and Training" and "Algorithm 2 NEUR2BILO Optimization" are present in Appendix B. |
| Open Source Code | Yes | Our code and data are available at https://github.com/khalil-research/Neur2BiLO. |
| Open Datasets | Yes | For KIP, CNP, DRP, we sample 1,000 instances according to the procedures specified in Tang et al. [54], Dragotto et al. [18], and Ghatkar et al. [27], respectively. |
| Dataset Splits | No | However, if the validation mean absolute error does not improve in 200 iterations, we terminate early. |
| Hardware Specification | Yes | The experiments for the benchmarks were run on a computing cluster with an Intel Xeon CPU E5-2683 and Nvidia Tesla P100 GPU with 64GB of RAM (for training). |
| Software Dependencies | Yes | Pytorch 2.0.1 [44] was used for all neural network models and scikit-learn 1.4.0 was used for gradient-boosted trees in the DNDP [46]. Gurobi 11.0.1 [30] was used as the MILP solver and gurobi-machinelearning 1.4.0 was used to embed the learning models into MILPs. (See the embedding sketch after this table.) |
| Experiment Setup | Yes | The sub-networks Ψd, Ψs, Ψv are feed-forward networks with one hidden layer of dimension 128. The decision-independent feature embedding dimension (m) is 64, and the instance embedding dimension (k) is 32. We use a batch size of 32, a learning rate of 0.01, and Adam [31] as an optimizer. (A hedged training sketch follows this table.) |
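To make the quoted Experiment Setup row concrete, here is a minimal PyTorch sketch of how sub-networks with these dimensions could be wired and trained. The hidden width (128), embedding dimensions (m = 64, k = 32), batch size (32), learning rate (0.01), optimizer (Adam), and the mean-absolute-error criterion come from the quotes above; the input feature dimensions and the way Ψd, Ψs, Ψv are composed (sum-pooled item embeddings concatenated with a decision-independent embedding) are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class ValueFunctionNet(nn.Module):
    """Hypothetical wiring of the three sub-networks quoted above.

    Dimensions and optimizer settings follow the paper's quoted setup;
    the composition of the sub-networks is our assumption.
    """

    def __init__(self, n_dec_indep_feats=10, n_item_feats=5,
                 m=64, k=32, hidden=128):
        super().__init__()
        # Psi_d: embeds decision-independent instance features to dimension m
        self.psi_d = nn.Sequential(
            nn.Linear(n_dec_indep_feats, hidden), nn.ReLU(),
            nn.Linear(hidden, m),
        )
        # Psi_s: embeds per-item features; sum-pooled into a
        # size-invariant instance embedding of dimension k
        self.psi_s = nn.Sequential(
            nn.Linear(n_item_feats, hidden), nn.ReLU(),
            nn.Linear(hidden, k),
        )
        # Psi_v: maps the concatenated embeddings to a scalar value estimate
        self.psi_v = nn.Sequential(
            nn.Linear(m + k, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, dec_indep_feats, item_feats):
        # item_feats: (batch, n_items, n_item_feats)
        inst_emb = self.psi_s(item_feats).sum(dim=1)   # (batch, k)
        dec_emb = self.psi_d(dec_indep_feats)          # (batch, m)
        return self.psi_v(torch.cat([dec_emb, inst_emb], dim=-1)).squeeze(-1)

model = ValueFunctionNet()
opt = torch.optim.Adam(model.parameters(), lr=0.01)  # quoted settings
loss_fn = nn.L1Loss()  # the paper monitors validation mean absolute error

# One illustrative training step on random data (batch size 32, as quoted).
x_d = torch.randn(32, 10)      # decision-independent features
x_s = torch.randn(32, 20, 5)   # 20 items, 5 features each (placeholder sizes)
y = torch.randn(32)            # target lower-level objective values
opt.zero_grad()
loss = loss_fn(model(x_d, x_s), y)
loss.backward()
opt.step()
```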
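The Software Dependencies row notes that gurobi-machinelearning 1.4.0 was used to embed the learned models into MILPs. The sketch below shows the generic `add_predictor_constr` pattern that library provides for ReLU networks; the variable bounds, objective, and network shape are placeholders, not the paper's actual bilevel formulation.

```python
import gurobipy as gp
import torch.nn as nn
from gurobi_ml import add_predictor_constr

# A small ReLU network standing in for a trained Neur2BiLO sub-network;
# gurobi-machinelearning can embed Sequential Linear/ReLU models directly.
net = nn.Sequential(nn.Linear(5, 128), nn.ReLU(), nn.Linear(128, 1))

milp = gp.Model()
x = milp.addMVar(5, lb=0.0, ub=1.0, name="x")  # placeholder decision vars
y = milp.addMVar(1, lb=-gp.GRB.INFINITY, name="value_estimate")

# Encode y = net(x) as MILP constraints; the library handles the
# mixed-integer formulation of the ReLU activations internally.
add_predictor_constr(milp, net, x, y)

milp.setObjective(y.sum(), gp.GRB.MAXIMIZE)
milp.optimize()
```

This is the general mechanism by which a learned value-function surrogate becomes part of a single-level surrogate optimization problem; the surrounding bilevel constraints in the paper are omitted here.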