DOGE-Train: Discrete Optimization on GPU with End-to-End Training

Authors: Ahmed Abbas, Paul Swoboda

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | For training we use an unsupervised loss. We train on smaller problems and test on larger ones, showing strong generalization performance with a GNN comprising only around 10k parameters.
Researcher Affiliation | Academia | Ahmed Abbas¹, Paul Swoboda¹,²,³ — ¹Max Planck Institute for Informatics, Saarland Informatics Campus; ²University of Mannheim; ³Heinrich-Heine University Düsseldorf
Pseudocode | Yes | Algorithm 1: Generalized Min-Marginal Averaging; Algorithm 2: Block Update backpropagation; Algorithm 3: Parameter prediction by GNN
Open Source Code | Yes | Code available at https://github.com/LPMP/BDD
Open Datasets | Yes | Cell tracking (CT): instances of developing flywing tissue from the cell tracking challenge (Ulman et al. 2017), processed by Haller et al. (2020) and obtained from Swoboda et al. (2022). Graph matching (GM): instances of graph matching for matching nuclei in 3D microscopic images (Long et al. 2009), processed by Kainmueller et al. (2014) and made publicly available as ILPs through Swoboda et al. (2022). Independent set (IS): random instances of the independent set problem generated using Ecole (Prouvost et al. 2020). QAPLib: the benchmark dataset of quadratic assignment problems used in the combinatorial optimization community (Burkard, Karisch, and Rendl 1997).
Dataset Splits | No | The paper specifies train and test splits for each dataset (e.g., 'train on the 2 smaller instances and test on the largest one' for Cell Tracking), but it does not explicitly mention a separate validation split or how one was handled.
Hardware Specification | Yes | CPU solvers use an AMD EPYC 7702 CPU with 16 threads; GPU solvers use an RTX 8000 (48 GB) or an A100 (80 GB) GPU depending on problem size.
Software Dependencies | Yes | For training we use PyTorch and implement Alg. 1, 2 in CUDA. Gurobi: the dual simplex algorithm from the commercial solver (Gurobi Optimization, LLC 2021); CPLEX (IBM ILOG 2019); MOSEK (MOSEK ApS 2022).
Experiment Setup | Yes | Due to varying instance sizes we use a separate set of hyperparameters for each dataset, given in Table 5 of the Appendix. The size of the learned embeddings h computed by the GNN in Alg. 3 is set to 16 for nodes and 8 for edges. For computing attention weights in (3) we use only one attention head for efficiency. The predictor Φ in Alg. 3 contains 4 linear layers with ReLU activation. We train the networks using the Adam optimizer (Kingma and Ba 2014). To prevent gradient overflow we use gradient clipping on model parameters by an l2 norm of 50.
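
As a rough illustration of the Experiment Setup row, the following is a minimal PyTorch sketch, not the authors' implementation: the hidden width, output dimension, and the way node and edge embeddings are combined are assumptions for illustration; only the 16/8 embedding sizes, the 4-layer ReLU predictor, the Adam optimizer, and the l2 gradient-norm clip of 50 come from the reported setup.

    # Minimal sketch of the described setup (hypothetical names; not the authors' code).
    import torch
    import torch.nn as nn

    NODE_DIM, EDGE_DIM = 16, 8          # embedding sizes reported in the paper

    class Predictor(nn.Module):
        """Predictor Phi: 4 linear layers with ReLU activations (hidden width is an assumption)."""
        def __init__(self, in_dim=NODE_DIM + EDGE_DIM, hidden=32, out_dim=1):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(in_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, out_dim),
            )

        def forward(self, node_emb, edge_emb):
            # Concatenating node and edge embeddings is an assumption, for illustration only.
            return self.net(torch.cat([node_emb, edge_emb], dim=-1))

    model = Predictor()
    optimizer = torch.optim.Adam(model.parameters())

    def training_step(node_emb, edge_emb, unsupervised_loss_fn):
        optimizer.zero_grad()
        loss = unsupervised_loss_fn(model(node_emb, edge_emb))
        loss.backward()
        # Clip parameter gradients by an l2 norm of 50, as stated in the setup.
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=50.0)
        optimizer.step()
        return loss.item()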
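
The Software Dependencies row names Gurobi's dual simplex algorithm as the CPU reference solver. Below is a minimal gurobipy sketch of how such a baseline run could be set up; the instance path is a placeholder and this is not the paper's evaluation script.

    # Hedged sketch: solving the LP relaxation of an ILP instance with Gurobi's dual simplex.
    import gurobipy as gp

    model = gp.read("instance.lp")      # placeholder path to an ILP instance
    relaxed = model.relax()             # drop integrality constraints to obtain the LP relaxation
    relaxed.setParam("Method", 1)       # Method=1 selects the dual simplex algorithm
    relaxed.setParam("Threads", 16)     # 16 CPU threads, matching the reported hardware setup
    relaxed.optimize()
    print("LP relaxation objective:", relaxed.ObjVal)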