Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..

BIPNN: Learning to Solve Binary Integer Programming via Hypergraph Neural Networks

Authors: Sen Bai, Chunqi Yang, Xin Bai, Xin Zhang, Zhengang Jiang

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on synthetic and real-world datasets highlight the superiority of our approach. 6 Experimental Results In this section, we describe our empirical experiments on BIPNN and baseline optimization tools. Our source codes can be obtained on Git Hub6. Benchmarks. To evaluate BIPNN on BIP problems with diverse scales, the datasets are generated using DHG library7. To evaluate the quality of solutions and computational efficiency of BIPNN, datasets of varying scales are generated in three steps: Initially, DHG library is applied to generate hypergraph structures (where \|E\| = 2\|V \|). Subsequently, a random coefficient is assigned to each hyperedge (representing a polynomial term) to generate PUBO objective functions. Thereafter, several constraints (penalty terms) were randomly incorporated into the PUBO objectives. To demonstrate the effectiveness of BIPNN on real-world settings, we also conduct experiments on the hypergraph max-cut problem (refer to Appendix E), a well-known BIP problem benchmark. Moreover, we conduct experiments on publicly-available hypergraph datasets (refer to Appendix F.1). Figure 4: Comparison of BIPNN and BIP solvers such as Gurobi and SCIP. d is the degree of polynomial terms in BIP objective functions. (a)(b) show the solving time required for BIPNN, SCIP and Gurobi to obtain the same solution. (c)(d) show the ratio of the solutions of BIPNN to SCIP; (e)(f) illustrate the ratio of the solutions of BIPNN to Gurobi; Runtime is restricted to half an hour.
Researcher Affiliation	Collaboration	Sen Bai EMAIL Chunqi Yang EMAIL Xin Bai EMAIL Xin Zhang EMAIL Zhengang Jiang EMAIL School of Computer Science and Technology, Changchun University of Science and Technology. Huawei Technologies Co. Ltd. Corresponding author
Pseudocode	No	The paper describes methods and steps in prose (e.g., Section 3.2, "GPU-accelerated Training" and Section 4.2, "Polynomial Penalty") but does not include any clearly labeled algorithm or pseudocode blocks.
Open Source Code	Yes	Our source codes can be obtained on Git Hub6. 6https://github.com/Classpi/BIPNN-Learning-to-Solve-Binary-Integer-Programming-via-Hypergraph-Neural-Networks
Open Datasets	Yes	To evaluate BIPNN on BIP problems with diverse scales, the datasets are generated using DHG library7. ... Moreover, we conduct experiments on publicly-available hypergraph datasets (refer to Appendix F.1). 7https://deephypergraph.readthedocs.io/en/latest/index.html. Table 4: Summary statistics of six real-world graphs: the number of vertices \|V \|, the number of edges \|E\|. Four hypergraphs: the number of vertices \|V \|, the number of hyperedges \|E\|, the size of the hypergraph P e E \|e\|. Graphs \|V \| \|E\| Hypergraphs \|V \| \|E\| P BAT 131 1,003 Primary 242 12,704 30,729 EAT 399 5,993 High 327 7,818 18,192 UAT 1,190 13,599 Cora 1,330 1,503 4,599 DBLP 2,591 3,528 Pub Med 3,824 7,523 33,687 Cite Seer 3,279 4,552 Amz Photo 7,535 119,081
Dataset Splits	No	The paper describes generating synthetic datasets and using publicly available real-world graph/hypergraph datasets, but it does not specify explicit training, validation, or test splits (e.g., percentages, sample counts, or predefined partitions) for these datasets in the context of the experiments.
Hardware Specification	Yes	Experiments are conducted on an Intel Core i9-12900K CPU with 24 cores, and an NVIDIA Ge Force RTX 3090 GPU with 24 G of memory.
Software Dependencies	No	We adopt two-layer HGNN+ [17] as the Hyper GNN model for the experiments. It leverages Py Torch-supported broadcasting mechanisms to distribute the vector of decision variables across each monomial (hyperedge), enabling parallel computation. The paper also mentions Gurobi and SCIP as baseline methods. However, specific version numbers for these software dependencies are not provided in the paper's experimental setup.
Experiment Setup	Yes	We adopt two-layer HGNN+ [17] as the Hyper GNN model for the experiments. (Section 6.1) The degrees of vertices are set to 4 (Fig. 4a) and 6 (Fig. 4b) respectively. (Section 6.3) The number of variables ranges from 100 to 2000. The degrees of polynomial terms d are set to d = 4 and d = 6 respectively. We perform 10 tests each time and record the average value of the cut numbers. (Section 6.4) We train BIPNN for a fixed number of 1000 epochs. (Annealing Strategy) The penalty strength γ is set to 2.5 initially and its value is gradually increased during training. The value of γ reaches 0 after 500 epochs and continued to increase thereafter. (Appendix C) Specifically, we can use µl = µ0 + η ql where µ0 is the initial value of the penalty factor µl, η is the learning rate, and ql is the value of the penalty term. We can initialize µ0 with a small value (e.g., 2 5) to broaden the model s search space. The learning rate η should be experimentally tuned across different values (e.g., 0.001 1).