Revisiting Sampling for Combinatorial Optimization
Authors: Haoran Sun, Katayoon Goshvadi, Azade Nova, Dale Schuurmans, Hanjun Dai
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we revisit the idea of using sampling for combinatorial optimization, motivated by the significant recent advances of gradient-based discrete MCMC and new techniques for parallel neighborhood exploration on accelerators. Remarkably, we find that modern sampling strategies can leverage landscape information to provide general-purpose solvers that require no training and yet are competitive with state of the art combinatorial solvers. In particular, experiments on cover vertex selection, graph partition and routing demonstrate better speed-quality trade-offs over current learning based approaches, and sometimes even superior performance to commercial solvers and specialized algorithms. |
| Researcher Affiliation | Collaboration | Haoran Sun (Georgia Tech); Katayoon Goshvadi, Azade Nova, Dale Schuurmans, Hanjun Dai (Google DeepMind). |
| Pseudocode | Yes | Algorithm 1 Sampling for Combinatorial Optimization |
| Open Source Code | No | The paper does not provide an explicit statement or link for the open-source code of the methodology described. |
| Open Datasets | Yes | We use the MIS benchmark from the recent work (Qiu et al., 2022), which consists of graphs from SATLIB (Hoos & Stützle, 2000) and also Erdős–Rényi (ER) random graphs (Erdős et al., 1960) of different sizes. |
| Dataset Splits | No | The paper frequently mentions "test dataset" but does not specify the explicit training, validation, and test dataset splits with percentages, sample counts, or citations to predefined split methodologies. |
| Hardware Specification | Yes | Setup: By default we approximate the evaluation of the energy functions through the first-order Taylor expansion, use the variant of the PAFS sampler in Section 3.2 to simulate the Langevin dynamics, and report the runtime of iSCO on a machine with a single Nvidia 1080Ti GPU (unless otherwise noted), in alignment with existing data-driven approaches. We find the runtime can be easily improved with more powerful GPUs that have more cores, or simply by adding more GPUs and running multiple Markov chains in parallel. |
| Software Dependencies | No | The paper mentions ML compilers like XLA or frameworks like JAX but does not specify particular version numbers for any software dependencies. |
| Experiment Setup | Yes | We run iSCO with different initial temperatures, and use an exponential temperature decay schedule by default. Since more steps or more Markov chains for iSCO would always result in better solution quality, we halt the sampling procedure when the solution quality is sufficient or the gain plateaus. |
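
The Pseudocode and Experiment Setup rows above describe an annealed sampling loop (Algorithm 1) with an exponential temperature decay and a stopping rule based on solution quality. As a minimal sketch of that idea, assuming a penalized maximum-independent-set energy and a single-bit-flip Metropolis kernel, the snippet below is an illustration of the general scheme, not the paper's iSCO implementation; the graph, penalty weight, decay rate, and step budget are all placeholder choices.

```python
# Minimal sketch (not the paper's code): annealed Metropolis sampling over binary
# assignments of a penalized MIS energy E(x) = -sum(x) + lam * sum_{(i,j) in E} x_i * x_j.
import jax
import jax.numpy as jnp

def energy(x, edges, lam=2.0):
    # edges: (m, 2) int array of endpoints; x: (n,) 0/1 assignment.
    violations = jnp.sum(x[edges[:, 0]] * x[edges[:, 1]])
    return -jnp.sum(x) + lam * violations

def anneal(edges, n, steps=5000, t0=1.0, decay=0.999, seed=0):
    key = jax.random.PRNGKey(seed)
    x = jnp.zeros(n, dtype=jnp.int32)
    best_x, best_e = x, energy(x, edges)
    temp = t0
    for _ in range(steps):
        key, k_flip, k_acc = jax.random.split(key, 3)
        i = jax.random.randint(k_flip, (), 0, n)      # propose a single bit flip
        x_new = x.at[i].set(1 - x[i])
        delta = energy(x_new, edges) - energy(x, edges)
        accept = jax.random.uniform(k_acc) < jnp.exp(-delta / temp)
        x = jnp.where(accept, x_new, x)
        e = energy(x, edges)
        if e < best_e:
            best_x, best_e = x, e
        temp *= decay                                  # exponential temperature decay
    return best_x, best_e

# Toy usage: a 4-cycle graph; the lowest energy corresponds to an independent set of size 2.
edges = jnp.array([[0, 1], [1, 2], [2, 3], [3, 0]])
print(anneal(edges, n=4))
```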
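
The Hardware Specification row also quotes the paper's use of a first-order Taylor expansion of the energy, which lets one gradient evaluation estimate the energy change of every single-bit flip at once and is what makes parallel neighborhood exploration cheap on accelerators. The sketch below illustrates that estimate for a generic quadratic (QUBO-style) energy; the energy form and sizes are assumptions for illustration, and this is not the PAFS sampler described in the paper.

```python
# Minimal sketch: first-order Taylor estimate of the energy change of every
# single-bit flip from one gradient call, for an illustrative quadratic energy
# E(x) = x^T Q x + b^T x.
import jax
import jax.numpy as jnp

def energy(x, Q, b):
    return x @ Q @ x + b @ x

def taylor_flip_deltas(x, Q, b):
    # Flipping bit i changes x_i by d_i = 1 - 2*x_i, so to first order
    # Delta E_i ~= d_i * dE/dx_i, evaluated for all i with a single gradient.
    grad = jax.grad(energy)(x, Q, b)
    return (1.0 - 2.0 * x) * grad

n = 6
key = jax.random.PRNGKey(0)
Q = jax.random.normal(key, (n, n))
Q = 0.5 * (Q + Q.T)                      # symmetric quadratic term
b = jnp.ones(n)
x = jnp.zeros(n)                         # current binary state, stored as floats for grad

approx = taylor_flip_deltas(x, Q, b)     # one gradient call covers all n neighbors
exact = jnp.array([energy(x.at[i].set(1 - x[i]), Q, b) - energy(x, Q, b) for i in range(n)])
print(approx)
print(exact)
```

The gap between the two printed vectors comes from the diagonal (second-order) terms that the linearization drops, which is why such estimates are typically used to shape proposals rather than to replace exact acceptance tests.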