Reinforcement Learning for Integer Programming: Learning to Cut

Authors: Yunhao Tang, Shipra Agrawal, Yuri Faenza

ICML 2020

Reproducibility assessment. For each variable below, the extracted result is followed by the LLM's supporting response:

Research Type: Experimental
"Empirical evaluation. We evaluate our approach over a range of classes of IP problems (namely, packing, binary packing, planning, and maximum cut). Our experiments demonstrate significant improvements in solution accuracy as compared to popular human-designed heuristics for adding Gomory's cuts."

Researcher Affiliation: Academia
"Yunhao Tang¹, Shipra Agrawal¹, Yuri Faenza¹. ¹Columbia University, New York, USA. Correspondence to: Yunhao Tang <yt2541@columbia.edu>."

Pseudocode: Yes
"Algorithm 1: Rollout of the Policy"

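Algorithm 1 alternates between solving the current LP relaxation and letting the policy choose one Gomory cut to append. The sketch below illustrates that loop under loud assumptions: `solve_lp_and_cuts` is a hypothetical, stubbed stand-in for the tableau-based cut generator, and the linear scorer `theta` replaces the paper's learned network.

```python
import numpy as np

def solve_lp_and_cuts(A, b, c):
    """Hypothetical helper, stubbed so the sketch runs: solve the current LP
    relaxation and return its optimal value plus candidate Gomory cuts. In
    the paper, candidates are read off fractional rows of the optimal
    simplex tableau; here they are random rows [coefficients | rhs]."""
    rng = np.random.default_rng()
    x = rng.random(len(c))
    cuts = rng.normal(size=(3, len(c) + 1))
    return float(c @ x), cuts

def rollout(A, b, c, theta, T=50):
    """One episode of the cut-adding MDP: at each step the policy scores the
    candidate cuts, samples one, appends it to the formulation, and receives
    the change in LP bound as reward. `theta` is a toy linear scorer (one
    weight per cut coefficient plus the right-hand side); the paper uses an
    attention-style network instead."""
    trajectory = []
    bound, cuts = solve_lp_and_cuts(A, b, c)
    for _ in range(T):
        logits = cuts @ theta
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        i = np.random.choice(len(cuts), p=probs)   # sample a cut to add
        A = np.vstack([A, cuts[i, :-1]])           # new row:  a^T x <= beta
        b = np.append(b, cuts[i, -1])
        new_bound, cuts = solve_lp_and_cuts(A, b, c)
        trajectory.append((i, bound - new_bound))  # reward = bound tightened
        bound = new_bound
    return trajectory

# Toy usage: uniform policy (zero weights) on a tiny random instance.
A0, b0, c0 = np.ones((2, 4)), np.array([4.0, 4.0]), np.arange(1.0, 5.0)
print(rollout(A0, b0, c0, theta=np.zeros(5), T=5))
```
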
Open Source Code: No
The paper does not provide an explicit statement or link indicating that the source code for the described methodology is publicly available.

Open Datasets: No
"We use randomly generated problem instances for training and testing the RL agent for each IP problem class. Importantly, note that we do not need 'solved' (aka labeled) instances for training. RL only requires repeated rollouts on training instances."

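For concreteness, here is a minimal sketch of what sampling random packing instances might look like; the sizes and coefficient ranges are our assumptions, not the paper's exact parameters (those are in its appendix):

```python
import numpy as np

def random_packing_instance(n_vars=30, n_cons=30, seed=None):
    """Sample a random packing IP:  max c^T x  s.t.  Ax <= b, x >= 0 integer.
    Nonnegative A, b, c keep every instance feasible (x = 0 always works).
    The ranges below are illustrative, not the paper's exact parameters."""
    rng = np.random.default_rng(seed)
    A = rng.integers(0, 6, size=(n_cons, n_vars))
    b = rng.integers(9 * n_vars, 10 * n_vars, size=n_cons)
    c = rng.integers(1, 11, size=n_vars)
    return A, b, c

# Training needs no labeled optima: the RL agent simply rolls out on
# freshly sampled instances like these.
train_instances = [random_packing_instance(seed=s) for s in range(100)]
```
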
Dataset Splits: No
"We use randomly generated problem instances for training and testing the RL agent for each IP problem class." The paper describes training and testing instances but does not specify a separate validation split with percentages or counts.

Hardware Specification: No
The paper does not specify any particular hardware components, such as CPU models, GPU models, or memory, used for running the experiments.

Software Dependencies: Yes
"We implement the MDP simulation environment for RL using Gurobi (Gurobi Optimization, 2015) as the LP solver."

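The environment's core operation is re-solving the LP relaxation after each cut, for which the authors use Gurobi. A minimal sketch of that step with gurobipy follows; the function name and structure are ours, and gurobipy requires a Gurobi installation and license:

```python
import gurobipy as gp
from gurobipy import GRB

def solve_lp_relaxation(A, b, c):
    """Solve  max c^T x  s.t.  Ax <= b, x >= 0  (continuous relaxation)."""
    n, m = len(c), len(b)
    model = gp.Model("lp_relaxation")
    model.Params.OutputFlag = 0                 # silence solver logs
    x = model.addVars(n, lb=0.0, name="x")      # continuous by default
    model.setObjective(gp.quicksum(c[j] * x[j] for j in range(n)), GRB.MAXIMIZE)
    for i in range(m):
        model.addConstr(gp.quicksum(A[i][j] * x[j] for j in range(n)) <= b[i])
    model.optimize()
    return model.ObjVal, [x[j].X for j in range(n)]
```
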
Experiment Setup: Yes
"We add T = 50 cuts for the first set of instances, and T = 250 cuts for the second set of instances." "Our B&C procedure has two hyperparameters: number of child nodes (subproblems) to expand Nexp and number of cutting planes added to each node Ncuts." "Further implementation details, along with hyper-parameter settings for the RL method are provided in the appendix."

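To make the two B&C hyperparameters concrete, here is a skeleton, under our own assumptions, of how Nexp and Ncuts could enter the search loop; `add_policy_cuts` and `branch` are hypothetical stand-ins, stubbed so the sketch runs:

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Node:
    """Minimal stand-in for a B&C search node (an LP subproblem)."""
    bound: float
    depth: int = 0

def add_policy_cuts(node, n_cuts):
    """Hypothetical: the learned policy adds n_cuts Gomory cuts to the
    node's LP, tightening its bound. Stubbed for illustration."""
    return Node(node.bound - 0.1 * n_cuts, node.depth)

def branch(node):
    """Hypothetical: split on a fractional variable into two subproblems."""
    return [Node(node.bound, node.depth + 1), Node(node.bound, node.depth + 1)]

def branch_and_cut(root, n_exp, n_cuts):
    """Expand at most n_exp child nodes; at each expanded node, first add
    n_cuts cutting planes (the paper's two B&C hyperparameters)."""
    queue, expanded = deque([root]), 0
    while queue and expanded < n_exp:
        node = add_policy_cuts(queue.popleft(), n_cuts)
        if node.depth < 3:            # toy stopping rule for this sketch
            queue.extend(branch(node))
        expanded += 1
    return expanded

branch_and_cut(Node(bound=100.0), n_exp=20, n_cuts=10)  # illustrative values
```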