Rethinking the Capacity of Graph Neural Networks for Branching Strategy

Authors: Ziang Chen, Jialin Liu, Xiaohan Chen, Xinshang Wang, Wotao Yin

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | A small-scale numerical experiment is conducted to directly validate our theoretical findings. We implement numerical experiments to validate our theoretical findings in Section 4.
Researcher Affiliation | Collaboration | Ziang Chen, Massachusetts Institute of Technology, ziang@mit.edu; Jialin Liu, University of Central Florida, jialin.liu@ucf.edu; Xiaohan Chen, Xinshang Wang, and Wotao Yin, Alibaba US, DAMO Academy, {xiaohan.chen,xinshang.w,wotao.yin}@alibaba-inc.com
Pseudocode | Yes | Algorithm 1: The WL test for MILP-Graphs; Algorithm 2: The 2-FWL test for MILP-Graphs. (A colour-refinement sketch in this spirit appears after the table.)
Open Source Code | Yes | We provide the source code and data to reproduce the main results, along with sufficient instructions to run the code, in the supplementary materials.
Open Datasets | No | We randomly generate 100 MILP instances, with 6 constraints and 20 variables... We train the MP-GNN and 2-FGNN to fit the SB scores of (4.1) and (4.2)... (An instance-generation sketch appears after the table.)
Dataset Splits | No | The paper mentions training data and training loss ('average ℓ2 norm of errors across all training instances') but does not explicitly state a validation set or split for hyperparameter tuning.
Hardware Specification | Yes | The numerical experiments are conducted on a single NVIDIA Tesla V100 GPU.
Software Dependencies | Yes | We implement MP-GNN and 2-FGNN with Python 3.6 and TensorFlow 1.15.0 [1].
Experiment Setup | Yes | We train an MP-GNN and a 2-FGNN with L = 2... p^0, q^0 are parameterized as linear transformations followed by a non-linear activation function; {p^l, q^l, f^l, g^l} for l = 1, …, L are parameterized as 3-layer multi-layer perceptrons (MLPs) with respective learnable parameters; and the output mapping r is parameterized as a 2-layer MLP. All layers map their input to a 1024-dimensional vector and use the ReLU activation function. ... minimizing ½ ∑_{G∈𝒢} ‖F_θ(G) − SB(G)‖² with respect to θ, using Adam [31]. ... with a learning rate of 10⁻⁵ for all networks. We decay the learning rate to 10⁻⁶ and 10⁻⁷ when the training error reaches 10⁻⁶ and 10⁻¹², respectively. (An architecture and training sketch appears after the table.)
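
For readers unfamiliar with these tests, the following is a minimal sketch of a WL-style colour refinement on the constraint/variable bipartite graph of an MILP. It is an illustration in the spirit of the paper's Algorithm 1 rather than its exact pseudocode; the node features, hashing scheme, and names (wl_refine_milp, _colour) are assumptions.

```python
# Hedged sketch of WL-style colour refinement on a bipartite MILP graph
# (constraint nodes vs. variable nodes); not the authors' exact Algorithm 1.
import hashlib

def _colour(obj):
    """Deterministic short hash used as a node colour."""
    return hashlib.sha256(repr(obj).encode()).hexdigest()[:16]

def wl_refine_milp(con_feats, var_feats, edges, rounds=10):
    """con_feats: features of constraint nodes (e.g. right-hand side, sense).
    var_feats: features of variable nodes (e.g. objective coefficient, bounds, integrality).
    edges: dict {(i, j): A_ij} over nonzero constraint coefficients.
    Returns sorted colour multisets; two MILP graphs the WL test cannot
    distinguish end up with identical multisets."""
    con = [_colour(f) for f in con_feats]
    var = [_colour(f) for f in var_feats]
    for _ in range(rounds):
        # refine both sides simultaneously from the previous colours
        new_con = [
            _colour((con[i], tuple(sorted((var[j], a) for (ii, j), a in edges.items() if ii == i))))
            for i in range(len(con))
        ]
        new_var = [
            _colour((var[j], tuple(sorted((con[i], a) for (i, jj), a in edges.items() if jj == j))))
            for j in range(len(var))
        ]
        con, var = new_con, new_var
    return sorted(con), sorted(var)
```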
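
To make the quoted data setup concrete, here is a minimal sketch of a generator for random MILP instances of the stated size (6 constraints, 20 variables). The sampling distributions, sparsity, and integrality split are assumptions; the paper's supplementary materials contain the actual generator.

```python
# Hedged sketch: random MILP instances of the quoted size (6 x 20).
# Distributions and sparsity below are assumptions, not the paper's recipe.
import numpy as np

def random_milp(m=6, n=20, density=0.5, seed=None):
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, n)) * (rng.random((m, n)) < density)  # constraint matrix
    b = rng.standard_normal(m)                                        # right-hand sides
    c = rng.standard_normal(n)                                        # objective coefficients
    integer_mask = rng.random(n) < 0.5                                # which variables are integer
    return A, b, c, integer_mask

instances = [random_milp(seed=s) for s in range(100)]  # 100 instances, as in the paper
```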
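
The setup quoted above pins down the layer parameterization, width, activation, loss, and learning-rate schedule. The sketch below mirrors that description in PyTorch (the paper's own implementation uses Python 3.6 and TensorFlow 1.15.0); the bipartite message-passing form, aggregation through the coefficient matrix A, and the input feature sizes are assumptions, and only the MP-GNN, not the 2-FGNN, is sketched.

```python
# Hedged PyTorch sketch of the quoted MP-GNN setup (the paper uses TensorFlow 1.15).
# Width 1024, ReLU, L = 2, 3-layer MLP updates, 2-layer MLP output head, and the
# 1/2 * sum_G ||F_theta(G) - SB(G)||^2 loss with Adam at lr 1e-5 follow the text;
# the message-passing equations and feature sizes are assumptions.
import torch
import torch.nn as nn

def mlp(d_in, d_out, hidden=1024, depth=3):
    layers, d = [], d_in
    for _ in range(depth - 1):
        layers += [nn.Linear(d, hidden), nn.ReLU()]
        d = hidden
    layers.append(nn.Linear(d, d_out))
    return nn.Sequential(*layers)

class MPGNN(nn.Module):
    def __init__(self, d_con, d_var, d=1024, L=2):
        super().__init__()
        self.p0 = nn.Sequential(nn.Linear(d_con, d), nn.ReLU())        # initial constraint embedding p^0
        self.q0 = nn.Sequential(nn.Linear(d_var, d), nn.ReLU())        # initial variable embedding q^0
        self.p = nn.ModuleList([mlp(2 * d, d) for _ in range(L)])      # constraint-side updates p^l
        self.q = nn.ModuleList([mlp(2 * d, d) for _ in range(L)])      # variable-side updates q^l
        self.f = nn.ModuleList([mlp(d, d) for _ in range(L)])          # variable -> constraint messages f^l
        self.g = nn.ModuleList([mlp(d, d) for _ in range(L)])          # constraint -> variable messages g^l
        self.r = mlp(d, 1, depth=2)                                    # output head r (predicted SB score)

    def forward(self, A, con_feat, var_feat):
        s = self.p0(con_feat)                                          # (m, d) constraint embeddings
        t = self.q0(var_feat)                                          # (n, d) variable embeddings
        for l in range(len(self.p)):
            s = self.p[l](torch.cat([s, A @ self.f[l](t)], dim=1))     # aggregate variable messages
            t = self.q[l](torch.cat([t, A.t() @ self.g[l](s)], dim=1)) # aggregate constraint messages
        return self.r(t).squeeze(-1)                                   # one score per variable

# One optimization step on a random instance of the quoted size (6 x 20);
# the lr would later be decayed to 1e-6 / 1e-7 per the schedule above.
A = torch.randn(6, 20)
con_feat, var_feat = torch.randn(6, 2), torch.randn(20, 3)             # feature dimensions are assumptions
sb_target = torch.rand(20)                                             # stand-in for true SB scores
model = MPGNN(d_con=2, d_var=3)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
loss = 0.5 * ((model(A, con_feat, var_feat) - sb_target) ** 2).sum()
loss.backward()
optimizer.step()
```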