Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Principled Data Augmentation for Learning to Solve Quadratic Programming Problems

Authors: Chendi Qian, Christopher Morris

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments demonstrate that our approach improves generalization in supervised scenarios and facilitates effective transfer learning to related optimization problems.
Researcher Affiliation	Academia	Chendi Qian Christopher Morris Department of Computer Science RWTH Aachen University Aachen, Germany EMAIL
Pseudocode	Yes	Algorithm 1 Generate a sparse feasible LP instance. Algorithm 2 Generate a sparse feasible QP instance. Algorithm 3 Generate soft-margin SVM QP instance. Algorithm 4 Generate mean-variance portfolio QP instance. Algorithm 5 Generate LASSO QP instance.
Open Source Code	Yes	The repository of our source code can be accessed at https://github.com/chendiqian/Data-Augmentation-for-Learning-to-Optimize.
Open Datasets	Yes	We empirically evaluate our approach on synthetic and benchmark datasets, showing that pretraining with our augmentations improves generalization and transferability across problem classes. For LPs, we generate four types of relaxed LP instances derived from MILPs: Set Cover (SC), Maximum Independent Set (MIS), Combinatorial Auction (CA), and Capacitated Facility Location (CFL), following Gasse et al. [2019]. For QPs, we generate instances of soft-margin SVM, Markowitz portfolio optimization, and LASSO regression following Jung et al. [2022]. If possible, problem sizes and densities are kept similar to the pretraining datasets; see more details in Appendix G. Benchmarks such as MIPLIB [Gleixner et al., 2021] and QPLIB [Furini et al., 2019] contain too few samples and exhibit significant variability in problem size and structure, making them unsuitable for training deep models.
Dataset Splits	Yes	We generate 10 000 instances for each dataset, each with 100 variables and 100 inequality constraints, and split the data into training, validation, and test sets with 8 : 1 : 1 ratio.
Hardware Specification	Yes	All experiments are conducted on a single NVIDIA L40S GPU. Training takes around 30 seconds per epoch using 4 NVIDIA L40S GPUs.
Software Dependencies	No	The paper mentions 'SciPy make_sparse_spd_matrix' in Algorithm 2 for generating sparse positive semi-definite matrices, but does not provide specific version numbers for this or any other software dependencies.
Experiment Setup	Yes	Models are trained with a batch size of 32 for up to 2000 epochs, with early stopping after 200 epochs of patience. We evaluate performance under data scarcity by training models on subsets containing 10%, 20%, and 100% of the training data. As baselines, we apply the augmentations proposed by You et al. [2020]: node dropping, edge perturbation, and feature masking. ... Specifically, we use a 6-layer MPNN followed by a 3-layer MLP with 192 hidden dimensions and Graph Norm [Cai et al., 2021]. ... We pretrain for 800 epochs with a batch size of 128, and set τ = 0.1. To assess pretraining quality and pick the best set of hyperparameters, we use linear probing [Veličković et al., 2018], training only a linear regression layer on top of a frozen MPNN to efficiently evaluate feature quality. For finetuning, we follow Zeng and Xie [2021], attaching an MLP head and jointly training it with the MPNN using supervised regression loss.
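
Illustrative sketch (not taken from the paper or its repository): the Dataset Splits and Experiment Setup rows describe 10 000 instances per dataset split 8 : 1 : 1, with data-scarcity runs on 10%, 20%, and 100% of the training data. The Python below shows one way such a split could be reproduced over instance indices. The function names split_indices and subsample, the fixed seed, and the choice to make the smaller subsets prefixes of the shuffled training set are all assumptions made for illustration; the quoted text does not say how the subsets are drawn.

```python
import random


def split_indices(n, ratios=(8, 1, 1), seed=0):
    """Shuffle range(n) and cut it into train/val/test parts by integer ratio.

    `seed` is an assumed value; the quoted text does not specify one.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    total = sum(ratios)
    n_train = n * ratios[0] // total
    n_val = n * ratios[1] // total
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def subsample(train, percent):
    """Keep the first `percent` % of the already shuffled training indices."""
    return train[: len(train) * percent // 100]


# 10 000 instances per dataset, as stated in the Dataset Splits row.
train, val, test = split_indices(10_000)

# Data-scarcity subsets: 10%, 20%, and 100% of the training data.
subsets = {p: subsample(train, p) for p in (10, 20, 100)}
```

With these assumptions the training, validation, and test sets contain 8000, 1000, and 1000 instances, and the three training subsets contain 800, 1600, and 8000 instances. Taking prefixes of one shuffle makes the subsets nested, so the 10% subset is contained in the 20% subset; independent draws per fraction would be an equally plausible reading of the quoted setup.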