Adaptive Stabilization Based on Machine Learning for Column Generation

Authors: Yunzhuang Shen, Yuan Sun, Xiaodong Li, Zhiguang Cao, Andrew Eberhard, Guangquan Zhang

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically demonstrate the efficacy of ASCG-ML in the graph coloring problem (GCP) (Jensen & Toft, 2011). As illustrated in Figure 1, ASCG-ML allows the dual values to be significantly drawn towards the prediction of the optimal dual solution, compared to standard CG and a stabilization approach (Du Merle et al., 1999). This attraction is particularly strong when the dual values greatly differ from the optimal dual solution, as evidenced by a large step size. As CG progresses, our method consistently yields dual iterates that are in proximity to the optimal dual solutions, in contrast to the considerable fluctuations observed in the dual iterates produced by the compared methods.
Researcher Affiliation | Academia | 1 Australian Artificial Intelligence Institute, University of Technology Sydney, Australia; 2 La Trobe Business School, La Trobe University, Australia; 3 School of Computing Technologies, Royal Melbourne Institute of Technology, Australia; 4 School of Computing and Information Systems, Singapore Management University, Singapore. Correspondence to: Yunzhuang Shen <shenyunzhuang@outlook.com>.
Pseudocode | Yes | Algorithm 1 ASCG-ML. Require: an initial set of MISs S and an ML prediction ŷ; 1: Initialize ϵ; 2: repeat; 3: πϵ ← solve G-RDP(S, ŷ, ϵ); 4: c̄, s ← pricing w.r.t. πϵ; 5: S ← {s} ∪ S; 6: Update ϵ value; 7: until c̄ ≥ 0 and ϵ = 0. (A Python sketch of this loop appears after the table.)
Open Source Code | Yes | Our code is available at https://github.com/yunzhuangshen/ML-based-Adaptive-Stabilization.
Open Datasets | Yes | We consider two graph benchmarks: the Matilda library (Smith-Miles et al., 2014) and the Graph Coloring Benchmarks (GCB). We will use Matilda for training and use both benchmarks for testing.
Dataset Splits | Yes | Our training data are collected from 1258 Matilda graphs that cannot be solved by CG in 100 iterations. We randomly select 800 graphs for training, 200 graphs for validation, and the remaining graphs for testing. (A split sketch appears after the table.)
Hardware Specification | Yes | Our experiments are conducted on a cluster with 24 CPUs (Intel(R) Xeon(R) Gold 6238R CPU @ 2.20GHz).
Software Dependencies | No | G-RDP is solved by the LP solver in Gurobi (Gurobi Optimization, 2018) and the pricing problem is solved by a specialized exact solver, TSM (Jiang et al., 2018). We use the stochastic gradient descent method with Adam optimizer (Kingma & Ba, 2015) to train ML models. While the software packages and their citations are mentioned, specific version numbers for Gurobi or TSM (beyond the year of citation) are not provided, nor are specific versions for the ML libraries or the Adam optimizer. (A generic gurobipy dual-extraction sketch appears after the table.)
Experiment Setup | Yes | We employ commonly used training procedures and hyperparameters to train a 3-layer FFNN and a 20-layer GCN. In particular, we note that an early stopping criterion and an L2 regularization are used for training to improve the generalization of ML methods. For FFNN, we set the number of neurons for the hidden layers to 32, and set the layer number l to 3, selected from l ∈ {2, 3, 5}. We set the parameters for GCN, following related studies (Li et al., 2018; Shen et al., 2021). The dimensionality of the weight matrices Θ_l is set to 32, and the layer number l is set to 20, selected from l ∈ {5, 10, 20}. ... We use the stochastic gradient descent method with Adam optimizer (Kingma & Ba, 2015) to train ML models up to 1000 epochs. In an iteration, the model parameters are updated for each training problem instance with a learning rate of 0.0001. (A training-configuration sketch appears after the table.)
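
To make the reconstructed Pseudocode row concrete, the following is a minimal Python sketch of the ASCG-ML loop of Algorithm 1. The callables solve_grdp, price, and update_epsilon are hypothetical placeholders for the paper's G-RDP dual solve (done with Gurobi), the TSM pricing solver, and the adaptive update of ϵ; their signatures and the stopping tolerance are assumptions, not the authors' implementation.

```python
# Hedged sketch of Algorithm 1 (ASCG-ML). The three callables are placeholders
# supplied by the caller; the real method solves a generalized restricted dual
# problem (G-RDP) with Gurobi and prices columns with the exact TSM solver.
def ascg_ml(initial_columns, y_hat, solve_grdp, price, update_epsilon,
            eps0=1.0, tol=1e-6):
    """Column generation stabilized towards an ML dual prediction y_hat."""
    S = list(initial_columns)                # current set of MISs (columns)
    eps = eps0                               # stabilization parameter ϵ
    while True:
        pi = solve_grdp(S, y_hat, eps)       # dual iterate of G-RDP(S, ŷ, ϵ)
        reduced_cost, column = price(pi)     # best MIS and its reduced cost w.r.t. pi
        if column is not None:
            S.append(column)                 # S ← {s} ∪ S
        eps = update_epsilon(eps, reduced_cost)   # adapt ϵ as CG progresses
        if reduced_cost >= -tol and eps == 0:     # until c̄ ≥ 0 and ϵ = 0
            return S, pi
```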
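The Dataset Splits row reports that the 1258 hard Matilda graphs are divided into 800 training, 200 validation, and the remaining test graphs. Below is a minimal sketch of such a random split; the seed is an assumption, as the paper does not state one.

```python
import random

graph_ids = list(range(1258))   # Matilda graphs unsolved by CG within 100 iterations
rng = random.Random(0)          # assumed seed; not specified in the paper
rng.shuffle(graph_ids)

train_ids = graph_ids[:800]     # 800 graphs for training
val_ids = graph_ids[800:1000]   # 200 graphs for validation
test_ids = graph_ids[1000:]     # the remaining 258 graphs for testing
```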
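Since the Software Dependencies row states that G-RDP is solved with Gurobi's LP solver, here is a generic gurobipy illustration of solving a tiny covering LP and reading off the constraint duals that a CG method would hand to pricing. The model is a toy example, not the paper's G-RDP formulation.

```python
import gurobipy as gp
from gurobipy import GRB

# Toy LP relaxation of a set-covering RMP: two columns (independent sets)
# over three vertices; each vertex must be covered at least once.
columns = [{0, 1}, {1, 2}]
vertices = range(3)

m = gp.Model("rmp")
m.Params.OutputFlag = 0
x = m.addVars(len(columns), lb=0.0, name="x")
cover = m.addConstrs(
    (gp.quicksum(x[j] for j, col in enumerate(columns) if v in col) >= 1
     for v in vertices),
    name="cover",
)
m.setObjective(x.sum(), GRB.MINIMIZE)   # minimize the number of columns used
m.optimize()

# Dual values of the covering constraints: the π that pricing would receive.
duals = [cover[v].Pi for v in vertices]
print(duals)
```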
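Lastly, a hedged PyTorch sketch of the FFNN training configuration described in the Experiment Setup row: 3 layers with 32 hidden units, Adam at a learning rate of 0.0001, L2 regularization via weight decay, per-instance updates, and early stopping within 1000 epochs. The input/output dimensions, loss function, weight-decay coefficient, patience, and toy data are assumptions; the 20-layer GCN is omitted.

```python
import torch
import torch.nn as nn

# 3-layer FFNN with 32 hidden units per layer; in_dim and out_dim are assumed.
def make_ffnn(in_dim, out_dim, hidden=32):
    return nn.Sequential(
        nn.Linear(in_dim, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
        nn.Linear(hidden, out_dim),
    )

model = make_ffnn(in_dim=8, out_dim=1)
# Adam at lr 1e-4 as stated; weight_decay stands in for the L2 regularization
# (the actual coefficient is not reported).
opt = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-5)
loss_fn = nn.MSELoss()   # assumed loss for regressing dual values

# Toy stand-ins for the per-instance training/validation data (assumed shapes).
train_instances = [(torch.randn(16, 8), torch.randn(16, 1)) for _ in range(10)]
val_instances = [(torch.randn(16, 8), torch.randn(16, 1)) for _ in range(3)]

best_val, patience, bad_epochs = float("inf"), 50, 0   # patience is an assumption
for epoch in range(1000):                # up to 1000 epochs, per the paper
    for xb, yb in train_instances:       # one parameter update per training instance
        opt.zero_grad()
        loss_fn(model(xb), yb).backward()
        opt.step()
    with torch.no_grad():                # early stopping on validation loss
        val = sum(loss_fn(model(xb), yb).item() for xb, yb in val_instances)
    if val < best_val - 1e-6:
        best_val, bad_epochs = val, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break
```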