Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Lookback for Learning to Branch
Authors: Prateek Gupta, Elias Boutros Khalil, Didier Chételat, Maxime Gasse, Andrea Lodi, Yoshua Bengio, M. Pawan Kumar
TMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experimentation on standard benchmark instances, we show that our proposal leads to decreases of up to 22% in the size of the B&B tree and up to 15% in the solving times. |
| Researcher Affiliation | Academia | Prateek Gupta, University of Oxford, The Alan Turing Institute; Elias B. Khalil, University of Toronto; Didier Chételat, CERC, Polytechnique Montréal; Maxime Gasse, Mila, CERC, Polytechnique Montréal; Yoshua Bengio, Mila, Université de Montréal, CIFAR Fellow; Andrea Lodi, CERC, Polytechnique Montréal, and Cornell Tech and Technion IIT; M. Pawan Kumar, University of Oxford |
| Pseudocode | No | The paper describes methodologies in prose and mathematical formulations (e.g., Section 4 Methodology) but does not include any clearly labeled pseudocode blocks or algorithms. |
| Open Source Code | No | A well commented and functional code to generate these instances can be found on the authors' GitHub repository. This reference is only for instance generation, not for the core methodology (GNN training with lookback features) described in the paper. |
| Open Datasets | Yes | To evaluate the performance of our proposed methods, we consider four benchmark problem families: Combinatorial Auctions, Minimum Set Covering, Capacitated Facility Location, and Maximum Independent Set. These are the same problems that have been used extensively in the learning to branch literature (Gupta et al., 2020; Scavuzzo et al., 2022; Etheve et al., 2020; Sun et al., 2020) since introduced in Gasse et al. (2019). Specifically, we collect a dataset D for each of the problem families by solving the Small instances using SCIP (Gleixner et al., 2018) with the strong branching heuristic. |
| Dataset Splits | Yes | For each of the problem families, we generate 10,000 Small random instances to collect training data, 2,000 Small random instances to collect the validation data, and 20 Medium instances for model selection. We collected a total of 150,000 training observations and 30,000 validation observations. |
| Hardware Specification | Yes | The evaluation on Big instances is performed by using SCIP 6.0.1 installed on an Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz. The learned neural networks, as branching models, are run on NVIDIA-TITAN Xp GPU card with CUDA 10.1. |
| Software Dependencies | Yes | Our models are all implemented in PyTorch (Paszke et al., 2017). Following Gupta et al. (2020), we didn't change any of the training parameters; for example, we used the Adam (Kingma & Ba, 2014) optimizer with a learning rate of 1e-3... The evaluation on Big instances is performed by using SCIP 6.0.1... with CUDA 10.1. |
| Experiment Setup | Yes | Our models are all implemented in PyTorch (Paszke et al., 2017). Following Gupta et al. (2020), we didn't change any of the training parameters; for example, we used the Adam (Kingma & Ba, 2014) optimizer with a learning rate of 1e-3, a training batch size of 32, and a learning rate scheduler that reduces the learning rate by a factor of 0.2 in the absence of any improvement in the validation loss over the last 15 epochs. Moreover, we use an early stopping criterion to stop the training if the validation loss doesn't improve over 30 epochs. We consider a grid search over the following hyperparameters: (a) loss-target ∈ {y, z}, where z refers to the modified loss function proposed in Section 4.1 and y to the typical loss function that focuses only on the top strong branching variable; (b) λl2 ∈ {0.0, 0.01, 0.1, 1.0}, the l2-regularization penalty; and (c) λPAT ∈ {0.01, 0.1, 0.2, 0.3}, which defines the strength of the PAT lookback loss term proposed in Section 4.2. |
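The learning-rate schedule and early-stopping rule quoted in the Experiment Setup row can be sketched in plain Python. This is an illustrative reconstruction of the described logic only: the function name, argument names, and loop structure are assumptions, not the authors' code (which used PyTorch's optimizer and scheduler utilities).

```python
def run_schedule(val_losses, lr=1e-3, factor=0.2,
                 lr_patience=15, stop_patience=30):
    """Simulate the described schedule over a sequence of per-epoch
    validation losses; return (final_lr, epochs_run).

    Hypothetical helper mirroring the quoted setup: Adam starts at
    lr=1e-3; the learning rate is multiplied by 0.2 after every 15
    consecutive epochs without validation improvement; training stops
    after 30 consecutive epochs without improvement.
    """
    best = float("inf")
    stale = 0   # epochs since the last validation improvement
    epochs = 0
    for loss in val_losses:
        epochs += 1
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
        if stale > 0 and stale % lr_patience == 0:
            lr *= factor        # ReduceLROnPlateau-style decay
        if stale >= stop_patience:
            break               # early stopping
    return lr, epochs
```

In PyTorch itself, the decay step corresponds to `torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.2, patience=15)` driven by the validation loss, with early stopping implemented as a manual counter as above.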