Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..
Reinforcement Learning for Branch-and-Bound Optimisation Using Retrospective Trajectories
Authors: Christopher W. F. Parsonson, Alexandre Laterre, Thomas D. Barrett
AAAI 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments on four combinatorial tasks, our approach enables learning-to-branch without any expert guidance or pre-training. We outperform the current stateof-the-art RL branching algorithm by 3-5 and come within 20% of the best IL method s performance on MILPs with 500 constraints and 1000 variables, with ablations verifying that our retrospectively constructed trajectories are essential to achieving these results. |
| Researcher Affiliation | Collaboration | Christopher W. F. Parsonson1*, Alexandre Laterre2, Thomas D. Barrett2 1UCL 2Insta Deep |
| Pseudocode | No | The paper mentions 'corresponding algorithms and hyperparameters used' in Appendix A.1, but the appendix content, which might contain pseudocode or algorithm blocks, is not provided in the main paper text. |
| Open Source Code | Yes | All code for reproducing the experiments and links to the generated data sets are provided at https://github.com/cwfparsonson/retro_branching. |
| Open Datasets | Yes | MILP Problem classes. In total, we considered four NP-hard problem benchmarks: set covering (Balas, Ho, and Center 2018), combinatorial auction (Leyton-Brown, Pearson, and Shoham 2000), capacitated facility location (Litvinchev and Ozuna Espinosa 2012), and maximum independent set (Bergman et al. 2016). |
| Dataset Splits | Yes | To evaluate the quality of the agents branching decisions, we used 100 validation instances (see Appendix C for an analysis of this data set size) which were unseen during training, reporting the total number of tree nodes and LP iterations as key metrics to be minimised. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running experiments were mentioned in the paper. |
| Software Dependencies | Yes | We used the open-source Ecole (Prouvost et al. 2020) and Py SCIPOpt (Maher et al. 2016) libraries with SCIP 7.0.1 (SCIP 2022) as the backend solver to do instance generation and testing. |
| Experiment Setup | No | The paper states 'see Appendix A.1 for a detailed description of our RL approach and the corresponding algorithms and hyperparameters used,' indicating that hyperparameter details are not present in the main text. |