Learning Large Neighborhood Search Policy for Integer Programming

Authors: Yaoxin Wu, Wen Song, Zhiguang Cao, Jie Zhang

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform experiments in this section on four NP-hard benchmark problems: Set Covering (SC), Maximal Independent Set (MIS), Combinatorial Auction (CA) and Maximum Cut (MC), which are widely used in existing works.
Researcher Affiliation | Collaboration | Yaoxin Wu, SCALE@NTU Corp Lab, Nanyang Technological University, Singapore (wuyaoxin@ntu.edu.sg); Wen Song, Shandong University, Qingdao, China (wensong@email.sdu.edu.cn); Zhiguang Cao, Singapore Institute of Manufacturing Technology, A*STAR, Singapore (zhiguangcao@outlook.com); Jie Zhang, Nanyang Technological University, Singapore (zhangj@ntu.edu.sg)
Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper.
Open Source Code | Yes | Our code is available. [Footnote 3: https://github.com/WXY1427/Learn-LNS-policy]
Open Datasets | Yes | We generate SC instances with 1000 columns and 5000 rows following the procedure in [42]. MIS instances are generated following [43], where we use Erdős-Rényi random graphs with 1500 nodes and set the affinity number to 4. CA instances with 2000 items and 4000 bids are generated according to arbitrary relationships in [44]. MC instances are generated according to Barabási-Albert random graph models [45], with average degree 4, and we adopt graphs of 500 nodes. (See the instance-generation sketch after the table.)
Dataset Splits | Yes | For each problem type, we generate 100, 20, 50 instances for training, validation, and testing.
Hardware Specification | Yes | We run all experiments on an Intel(R) Xeon(R) E5-2698 v4 2.20GHz CPU.
Software Dependencies | Yes | We use the state-of-the-art open-source IP solver SCIP (v6.0.1) [46] as the repair operator... Here we evaluate its performance by leveraging Gurobi (v9.0.3) [47] as the repair operator. (See the repair-step sketch after the table.)
Experiment Setup | Yes | For each problem, we train 200 iterations, during each of which we randomly draw M = 10 instances. We set the training step limit T = 50, 50, 70, 100 for SC, MIS, CA and MC respectively. The time limit for repair at each step is 2 seconds, unless stated otherwise. We use ε = 0.2 for probability clipping. For the Q-actor-critic algorithm, we set the length of the experience replay to TM, the number of network updates U = 4, and the batch size B = TM/U. We set the discount factor γ = 0.99 for all problems, and use the Adam optimizer with learning rate 1×10⁻⁴. (See the hyperparameter sketch after the table.)
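
The instance-generation procedures quoted in the Open Datasets row are only summarized in the paper. The sketch below illustrates one of them: building an MIS integer program on an Erdős-Rényi random graph, assuming networkx and PySCIPOpt are available. The edge probability `p` and the helper name `make_mis_instance` are illustrative, and the "affinity number" parameter from [43] is not reproduced here.

```python
import networkx as nx
from pyscipopt import Model, quicksum

def make_mis_instance(n_nodes=1500, p=0.005, seed=0):
    """Sketch: MIS integer program on an Erdos-Renyi random graph.
    The edge probability p is a placeholder, not the paper's setting."""
    graph = nx.erdos_renyi_graph(n_nodes, p, seed=seed)
    model = Model("mis")
    # One binary variable per node: 1 if the node is in the independent set.
    x = {v: model.addVar(vtype="B", name=f"x_{v}") for v in graph.nodes}
    # At most one endpoint of every edge may be selected.
    for u, v in graph.edges:
        model.addCons(x[u] + x[v] <= 1)
    model.setObjective(quicksum(x.values()), sense="maximize")
    return model

model = make_mis_instance()
model.writeProblem("mis_instance.lp")  # export the instance for later solving
```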
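The paper uses SCIP (and, in one evaluation, Gurobi) as the repair operator with a 2-second time limit per step. Below is a minimal sketch of such a repair step, assuming SCIP is accessed through PySCIPOpt: variables outside the destroyed neighborhood are fixed at their incumbent values and the remaining sub-IP is re-optimized under the time limit. The function name `repair` and its arguments are hypothetical; the authors' actual implementation is in the released repository.

```python
from pyscipopt import Model

def repair(instance_path, incumbent, destroy_set, time_limit=2.0):
    """One LNS repair step (sketch). `incumbent` maps variable names to values;
    `destroy_set` holds the names of variables selected for re-optimization."""
    model = Model()
    model.hideOutput()
    model.readProblem(instance_path)           # reload the original IP
    model.setParam("limits/time", time_limit)  # 2-second repair limit as in the paper
    for var in model.getVars():
        if var.name not in destroy_set:
            # Keep non-destroyed variables at their incumbent values by
            # tightening both bounds to that value.
            model.chgVarLb(var, incumbent[var.name])
            model.chgVarUb(var, incumbent[var.name])
    model.optimize()
    sol = model.getBestSol()
    return {v.name: model.getSolVal(sol, v) for v in model.getVars()}
```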
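The hyperparameters quoted in the Experiment Setup row can be collected into a single configuration, as in the sketch below (Python/PyTorch assumed; the policy network is a placeholder, since the paper's Q-actor-critic architecture is not reproduced here). It also makes explicit that the experience-replay length is T·M and the batch size is B = T·M / U.

```python
import torch

# Hyperparameters as reported in the paper; per-problem step limits T shown as a dict.
config = {
    "iterations": 200,          # training iterations
    "instances_per_iter": 10,   # M: instances drawn per iteration
    "step_limit": {"SC": 50, "MIS": 50, "CA": 70, "MC": 100},  # T per problem
    "repair_time_limit": 2.0,   # seconds per repair step
    "clip_eps": 0.2,            # probability clipping epsilon
    "updates_per_iter": 4,      # U: network updates per iteration
    "gamma": 0.99,              # discount factor
    "lr": 1e-4,                 # Adam learning rate
}

# Replay length and batch size, e.g. for SC: 50 * 10 = 500 and 500 / 4 = 125.
replay_len = config["step_limit"]["SC"] * config["instances_per_iter"]
batch_size = replay_len // config["updates_per_iter"]

# Optimizer setup; the linear layer is a placeholder for the policy network.
policy = torch.nn.Linear(8, 1)
optimizer = torch.optim.Adam(policy.parameters(), lr=config["lr"])
```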