On Representing Mixed-Integer Linear Programs by Graph Neural Networks
Authors: Ziang Chen, Jialin Liu, Xinshang Wang, Jianfeng Lu, Wotao Yin
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We conducted small-scale numerical experiments to validate our theoretical findings." and "In this section, we experimentally validate our theories on some small-scale examples with m = 6 and n = 20. We first randomly generate two datasets D1 and D2." |
| Researcher Affiliation | Collaboration | Ziang Chen, Department of Mathematics, Duke University, Durham, NC 27708, ziang@math.duke.edu; Jialin Liu, Damo Academy, Alibaba US, Bellevue, WA 98004, jialin.liu@alibaba-inc.com; Xinshang Wang, Damo Academy, Alibaba US, and Antai College of Economics and Management, Shanghai Jiao Tong University, Bellevue, WA 98004, xinshang.w@alibaba-inc.com; Jianfeng Lu, Departments of Mathematics, Physics, and Chemistry, Duke University, Durham, NC 27708, jianfeng@math.duke.edu; Wotao Yin, Damo Academy, Alibaba US, Bellevue, WA 98004, wotao.yin@alibaba-inc.com |
| Pseudocode | Yes | Algorithm 1: WL test for MILP-Graphs (denoted by WL_MILP). Require: a graph instance (G, H) ∈ G_{m,n} × H_V^m × H_W^n and an iteration limit L > 0. 1: Initialize with C_i^{0,V} = HASH_{0,V}(h_i^V), C_j^{0,W} = HASH_{0,W}(h_j^W). 2: for l = 1, 2, ..., L do 3: C_i^{l,V} = HASH_{l,V}(C_i^{l-1,V}, Σ_{j=1}^n E_{i,j} HASH'_{l,W}(C_j^{l-1,W})). 4: C_j^{l,W} = HASH_{l,W}(C_j^{l-1,W}, Σ_{i=1}^m E_{i,j} HASH'_{l,V}(C_i^{l-1,V})). 5: end for. 6: return the multisets containing all colors {{C_i^{L,V}}}_{i=0}^m, {{C_j^{L,W}}}_{j=0}^n. (A runnable Python sketch of this algorithm follows the table.) |
| Open Source Code | Yes | Our codes are modified from Gasse et al. (2019) and released to https://github.com/liujl11git/GNN-MILP.git. |
| Open Datasets | No | Each instance in D1 has 20 variables, 6 constraints, and is generated as follows. For each variable, c_j ~ N(0, 0.01) and l_j, u_j ~ N(0, 10); if l_j > u_j, then switch l_j and u_j; the probability that x_j is an integer variable is 0.5. For each constraint, ◦_i ~ U({≤, =, ≥}) and b_i ~ N(0, 1). A has 60 nonzero elements, with each nonzero element distributed as N(0, 1). Each instance in D2 has 20 variables and 6 equality constraints, and the (2k−1)-th and 2k-th problems are constructed via the following approach (1 ≤ k ≤ 500): Sample J = {j_1, j_2, ..., j_6} as a random subset of {1, 2, ..., 20} with 6 elements. (A Python sketch of the D1 recipe follows the table.) |
| Dataset Splits | No | "All the results reported in this section are obtained on the training sets, not a separate testing set. Generalization tests and details of the numerical experiments can be found in the appendix." and "The size of the training set is chosen from {10, 100, 1000} and the size of the testing set is 1000." The paper mentions training and testing sets but does not specify a separate validation split or how it was used. |
| Hardware Specification | Yes | All the experiments are conducted on a Linux server with an Intel Xeon Platinum 8163 CPU and eight NVIDIA Tesla V100 GPUs. |
| Software Dependencies | No | "We use Adam (Kingma & Ba, 2014) as our training optimizer with learning rate of 0.0001." and "In our GNNs, we set the number of message-passing layers as L = 2 and parameterize all the learnable functions f^V_in, f^W_in, f_out, f^W_out, {f^V_l, f^W_l, g^V_l, g^W_l}_{l=0}^L as multilayer perceptrons (MLPs)." and "Our codes are modified from Gasse et al. (2019)..." and "We call SCIP (Bestuzheva et al., 2021a;b), a state-of-the-art non-commercial MILP solver." While software components are mentioned, specific version numbers for these software dependencies (e.g., PyTorch, TensorFlow, etc.) are not provided. |
| Experiment Setup | Yes | In our GNNs, we set the number of message-passing layers as L = 2 and parameterize all the learnable functions f^V_in, f^W_in, f_out, f^W_out, {f^V_l, f^W_l, g^V_l, g^W_l}_{l=0}^L as multilayer perceptrons (MLPs)... All the learnable functions... have two hidden layers. The embedding sizes d_0, ..., d_L are uniformly taken as d, which is chosen from {2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048}. All the activation functions are ReLU. We use Adam (Kingma & Ba, 2014) as our training optimizer with learning rate of 0.0001. The loss function is taken as mean squared error. (A PyTorch sketch of this configuration follows the table.) |
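
For the pseudocode row, the following is a minimal Python sketch of the quoted WL test (Algorithm 1). It is not the authors' implementation: the toy vertex features and the use of Python's built-in `hash` on tuples as a stand-in for the paper's HASH functions are illustrative assumptions.

```python
from collections import Counter

def wl_test_milp(E, h_V, h_W, L):
    """WL test for MILP graphs, following Algorithm 1 quoted above.

    E   : m x n nested list of edge weights (E[i][j] = A_ij, 0 if no edge)
    h_V : list of m hashable constraint-vertex features
    h_W : list of n hashable variable-vertex features
    L   : iteration limit
    """
    m, n = len(h_V), len(h_W)
    # Line 1: initial colors from the vertex features.
    C_V = [hash(("V0", h_V[i])) for i in range(m)]
    C_W = [hash(("W0", h_W[j])) for j in range(n)]
    # Lines 2-5: L rounds of color refinement on both vertex groups.
    for l in range(1, L + 1):
        new_V = [hash((l, "V", C_V[i],
                       sum(E[i][j] * hash((l, "W'", C_W[j])) for j in range(n))))
                 for i in range(m)]
        new_W = [hash((l, "W", C_W[j],
                       sum(E[i][j] * hash((l, "V'", C_V[i])) for i in range(m))))
                 for j in range(n)]
        C_V, C_W = new_V, new_W
    # Line 6: return the multisets of final colors.
    return Counter(C_V), Counter(C_W)


# Tiny usage example: two constraints, three variables, made-up features.
E = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 1.0]]
colors_V, colors_W = wl_test_milp(E,
                                  h_V=[("<=", 1.0), ("=", 0.0)],
                                  h_W=[(0.5, 0, 1), (0.1, 0, 1), (-0.3, 1, 1)],
                                  L=2)
print(colors_V, colors_W)
```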
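For the dataset row, here is a minimal NumPy sketch of the D1 generation recipe quoted above. It interprets the second argument of N(·, ·) as a variance and packages each instance as a plain dict; both are assumptions of this sketch, not details taken from the paper.

```python
import numpy as np

def generate_d1_instance(m=6, n=20, nnz=60, rng=None):
    """One random MILP instance following the D1 recipe quoted above."""
    rng = rng or np.random.default_rng()
    c = rng.normal(0.0, np.sqrt(0.01), size=n)        # c_j ~ N(0, 0.01)
    l = rng.normal(0.0, np.sqrt(10.0), size=n)        # l_j ~ N(0, 10)
    u = rng.normal(0.0, np.sqrt(10.0), size=n)        # u_j ~ N(0, 10)
    l, u = np.minimum(l, u), np.maximum(l, u)         # switch if l_j > u_j
    is_int = rng.random(n) < 0.5                      # x_j integer with probability 0.5
    senses = rng.choice(np.array(["<=", "=", ">="]), size=m)  # constraint sense ~ U({<=, =, >=})
    b = rng.normal(0.0, 1.0, size=m)                  # b_i ~ N(0, 1)
    A = np.zeros((m, n))
    nz = rng.choice(m * n, size=nnz, replace=False)   # 60 nonzero entries ...
    A.flat[nz] = rng.normal(0.0, 1.0, size=nnz)       # ... each ~ N(0, 1)
    return {"A": A, "b": b, "c": c, "l": l, "u": u,
            "senses": senses, "is_integer": is_int}

# Usage example: draw a handful of D1-style instances.
instances_d1 = [generate_d1_instance() for _ in range(10)]
```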
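For the experiment-setup row, the following is a minimal PyTorch sketch of the reported training configuration. Only the hyperparameters (two-hidden-layer MLPs, ReLU activations, L = 2, Adam with learning rate 0.0001, MSE loss) come from the quoted text; the helper name `make_mlp`, the placeholder input width, and the single embedding size shown are illustrative assumptions, and the full bipartite message-passing architecture is omitted.

```python
import torch
import torch.nn as nn

def make_mlp(in_dim, hidden_dim, out_dim):
    """Two-hidden-layer MLP with ReLU activations, the parameterization the
    paper uses for every learnable function (f^V_in, f^W_in, f_out, ...)."""
    return nn.Sequential(
        nn.Linear(in_dim, hidden_dim), nn.ReLU(),
        nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        nn.Linear(hidden_dim, out_dim),
    )

L = 2      # number of message-passing layers (from the table)
d = 64     # embedding size; the paper sweeps d over {2, 4, ..., 2048}

# Example: one input-embedding MLP; in_dim=9 is a placeholder feature width.
f_V_in = make_mlp(in_dim=9, hidden_dim=d, out_dim=d)

optimizer = torch.optim.Adam(f_V_in.parameters(), lr=1e-4)  # Adam, lr = 0.0001
loss_fn = nn.MSELoss()                                      # mean-squared-error loss
```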