Rethinking the Capacity of Graph Neural Networks for Branching Strategy
Authors: Ziang Chen, Jialin Liu, Xiaohan Chen, Xinshang Wang, Wotao Yin
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | A small-scale numerical experiment is conducted to directly validate our theoretical findings. We implement numerical experiments to validate our theoretical findings in Section 4. |
| Researcher Affiliation | Collaboration | Ziang Chen (Massachusetts Institute of Technology, ziang@mit.edu); Jialin Liu (University of Central Florida, jialin.liu@ucf.edu); Xiaohan Chen, Xinshang Wang, Wotao Yin (Alibaba US, DAMO Academy, {xiaohan.chen,xinshang.w,wotao.yin}@alibaba-inc.com) |
| Pseudocode | Yes | Algorithm 1: The WL test for MILP-Graphs; Algorithm 2: The 2-FWL test for MILP-Graphs. (An illustrative WL-test sketch follows the table.) |
| Open Source Code | Yes | We provide the source code and data to reproduce the main results, along with sufficient instructions to run the code, in the supplementary materials. |
| Open Datasets | No | We randomly generate 100 MILP instances, with 6 constraints and 20 variables... We train the MP-GNN and 2-FGNN to fit the SB scores of (4.1) and (4.2)... (An illustrative instance generator follows the table.) |
| Dataset Splits | No | The paper mentions training data and training loss ('average ℓ2 norm of errors across all training instances') but does not explicitly state a validation set or split for hyperparameter tuning. |
| Hardware Specification | Yes | The numerical experiments are conducted on a single NVIDIA Tesla V100 GPU |
| Software Dependencies | Yes | We implement MP-GNN and 2-FGNN with Python 3.6 and TensorFlow 1.15.0 [1]. |
| Experiment Setup | Yes | We train an MP-GNN and a 2-FGNN with L = 2... p^0, q^0 are parameterized as linear transformations followed by a non-linear activation function; {p^l, q^l, f^l, g^l}_{l=1}^{L} are parameterized as 3-layer multi-layer perceptrons (MLPs) with respective learnable parameters; and the output mapping r is parameterized as a 2-layer MLP. All layers map their input to a 1024-dimensional vector and use the ReLU activation function. ... minimizing ½ Σ_{G∈𝒢} ‖F_θ(G) - SB(G)‖² with respect to θ, using Adam [31]. ... with a learning rate of 10⁻⁵ for all networks. We decay the learning rate to 10⁻⁶ and 10⁻⁷ when the training error reaches 10⁻⁶ and 10⁻¹², respectively. (A hedged training-setup sketch follows the table.) |
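
The pseudocode referenced in the table (Algorithm 1) is not reproduced here. As a rough illustration of what a WL test on a bipartite variable-constraint MILP graph looks like, the sketch below runs standard color refinement over variable and constraint nodes. The helper names (`wl_test_milp_graph`, `_hash`), the fixed round count, and the exact choice of node and edge features are assumptions for illustration, not the paper's Algorithm 1, which iterates until the color partition stabilizes.

```python
import hashlib
from collections import defaultdict


def _hash(obj):
    """Deterministic color hash of a feature or neighborhood description (illustrative helper)."""
    return hashlib.sha256(repr(obj).encode()).hexdigest()


def wl_test_milp_graph(var_feats, con_feats, edges, num_rounds=10):
    """Color refinement (WL test) on a bipartite MILP graph.

    var_feats: list of feature tuples, one per variable node
               (e.g. objective coefficient, bounds, integrality flag).
    con_feats: list of feature tuples, one per constraint node
               (e.g. right-hand side, constraint sense).
    edges:     list of (i, j, a_ij) triples meaning variable i appears in
               constraint j with coefficient a_ij.
    Returns the sorted final colors; two MILP graphs that the WL test
    cannot distinguish produce the same color multisets.
    """
    var_colors = [_hash(f) for f in var_feats]
    con_colors = [_hash(f) for f in con_feats]

    # adjacency lists that keep the edge weights
    var_nbrs = defaultdict(list)
    con_nbrs = defaultdict(list)
    for i, j, a in edges:
        var_nbrs[i].append((j, a))
        con_nbrs[j].append((i, a))

    for _ in range(num_rounds):
        # refine constraint colors from the multiset of (weight, variable color) pairs
        new_con = [
            _hash((con_colors[j],
                   tuple(sorted((a, var_colors[i]) for i, a in con_nbrs[j]))))
            for j in range(len(con_colors))
        ]
        # refine variable colors from the multiset of (weight, constraint color) pairs
        new_var = [
            _hash((var_colors[i],
                   tuple(sorted((a, new_con[j]) for j, a in var_nbrs[i]))))
            for i in range(len(var_colors))
        ]
        con_colors, var_colors = new_con, new_var

    return sorted(var_colors), sorted(con_colors)
```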
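
The table only states that 100 instances with 6 constraints and 20 variables are generated; the coefficient distributions and integrality pattern are not given here, and the SB targets of (4.1) and (4.2) are not reproduced. The snippet below is a purely illustrative generator under assumed standard-normal coefficients and all-integer variables.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen for this sketch, not from the paper


def random_milp(m=6, n=20):
    """One random MILP instance with m constraints and n variables.

    The distributions below are assumptions made purely for illustration;
    the paper's generation procedure is not specified in this table.
    """
    A = rng.standard_normal((m, n))        # constraint coefficient matrix
    b = rng.standard_normal(m)             # right-hand sides
    c = rng.standard_normal(n)             # objective coefficients
    integer_mask = np.ones(n, dtype=bool)  # which variables are integer
    return A, b, c, integer_mask


dataset = [random_milp() for _ in range(100)]  # 100 instances, as in the paper
```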
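
The setup row pins down the layer widths, activations, loss, optimizer, and learning-rate schedule, but not the full model code. The following is a minimal sketch of an MP-GNN trained under those hyperparameters; the `MPGNN` class, its per-variable readout, and the dense bipartite message-passing updates are assumptions for illustration, not the authors' TensorFlow 1.15 implementation (the sketch uses the TF 2 Keras API).

```python
import tensorflow as tf

HIDDEN = 1024  # every layer maps to a 1024-dimensional vector with ReLU
L = 2          # number of message-passing rounds, as in the experiment


def mlp(layers, out_dim=HIDDEN, out_act="relu"):
    """MLP factory: 3-layer MLPs for p^l, q^l, f^l, g^l; a 2-layer MLP for r."""
    seq = [tf.keras.layers.Dense(HIDDEN, activation="relu") for _ in range(layers - 1)]
    seq.append(tf.keras.layers.Dense(out_dim, activation=out_act))
    return tf.keras.Sequential(seq)


class MPGNN(tf.keras.Model):
    """Minimal message-passing GNN on a dense MILP bipartite graph (illustrative stand-in for F_theta)."""

    def __init__(self):
        super().__init__()
        self.p0, self.q0 = mlp(1), mlp(1)        # affine map + ReLU for initial embeddings
        self.p = [mlp(3) for _ in range(L)]      # variable-side updates
        self.q = [mlp(3) for _ in range(L)]      # constraint-side updates
        self.f = [mlp(3) for _ in range(L)]      # variable -> constraint messages
        self.g = [mlp(3) for _ in range(L)]      # constraint -> variable messages
        self.r = mlp(2, out_dim=1, out_act=None) # per-variable SB-score readout (assumed)

    def call(self, inputs):
        A, v_feats, c_feats = inputs             # A: (m, n) float32 coefficient matrix
        s = self.p0(v_feats)                     # (n, HIDDEN) variable embeddings
        t = self.q0(c_feats)                     # (m, HIDDEN) constraint embeddings
        for l in range(L):
            t = self.q[l](tf.concat([t, tf.matmul(A, self.f[l](s))], axis=-1))
            s = self.p[l](tf.concat([s, tf.matmul(A, self.g[l](t), transpose_a=True)], axis=-1))
        return tf.squeeze(self.r(s), axis=-1)    # one predicted SB score per variable


model = MPGNN()
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-5)


def train_step(graph, sb_scores):
    """One step of minimizing (1/2) * sum_G ||F_theta(G) - SB(G)||^2 with Adam."""
    with tf.GradientTape() as tape:
        loss = 0.5 * tf.reduce_sum(tf.square(model(graph) - sb_scores))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss


def maybe_decay(train_error):
    """Schedule from the table: 1e-6 once the error reaches 1e-6, 1e-7 once it reaches 1e-12."""
    if train_error < 1e-12:
        optimizer.learning_rate = 1e-7
    elif train_error < 1e-6:
        optimizer.learning_rate = 1e-6
```

Only the loss, optimizer, layer sizes, and learning-rate schedule above are taken directly from the table; the 2-FGNN trained in the same experiment would, roughly speaking, replace the bipartite message passing with updates over pairs of nodes, following the 2-FWL test.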