GCOMB: Learning Budget-constrained Combinatorial Algorithms over Billion-sized Graphs

Authors: Sahil Manchanda, Akash Mittal, Anuj Dhawan, Sourav Medya, Sayan Ranu, Ambuj Singh

NeurIPS 2020

Reproducibility variables, results, and supporting responses:
Research Type (Experimental): "We perform extensive experiments on real graphs to benchmark the efficiency and efficacy of GCOMB. Our results establish that GCOMB is 100 times faster and marginally better in quality than state-of-the-art algorithms for learning combinatorial algorithms. Additionally, a case study on the practical combinatorial problem of Influence Maximization (IM) shows GCOMB to be 150 times faster than the specialized IM algorithm IMM, with similar quality."
Researcher Affiliation (Academia): Sahil Manchanda, Akash Mittal, Anuj Dhawan (Indian Institute of Technology Delhi; {sahil.manchanda,cs1150208,Anuj.Dhawan.cs115}@cse.iitd.ac.in); Sourav Medya (Northwestern University; sourav.medya@kellogg.northwestern.edu); Sayan Ranu (Indian Institute of Technology Delhi; sayanranu@cse.iitd.ac.in); Ambuj Singh (University of California, Santa Barbara; ambuj@ucsb.edu)
Pseudocode (Yes): "Probabilistic greedy: To address the above issue, we sample from the solution space in a greedy manner and learn embeddings that reflect the marginal gain f(S ∪ {v}) − f(S) provided by a node v towards the solution set S (Alg. 2 in the Appendix). Specifically, for each good node v, we want to learn embeddings that can predict score(v) through a surrogate function score′(v). Towards that end, we draw multiple samples of training graphs and budgets, and the parameters are learned by minimizing the mean squared error loss. Detailed pseudocode (Alg. 3) is provided in the Supplementary (App. C)."
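As a rough illustration of the quoted training objective, the sketch below regresses a surrogate score′(v) onto the marginal-gain targets produced by probabilistic greedy, using a mean-squared-error loss. The SurrogateScorer module, its two-layer architecture, and the tensor names are assumptions for illustration, not the authors' exact network.

```python
import torch
import torch.nn as nn

# Hypothetical sketch: predict score'(v) from a node embedding and fit it to
# the marginal-gain targets score(v) collected by probabilistic greedy.
class SurrogateScorer(nn.Module):
    def __init__(self, emb_dim: int, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(emb_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, node_emb: torch.Tensor) -> torch.Tensor:
        # node_emb: (num_nodes, emb_dim) -> predicted score'(v): (num_nodes,)
        return self.net(node_emb).squeeze(-1)

def train_step(model, opt, node_emb, greedy_scores):
    """One MSE step on a sampled (training graph, budget) batch."""
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(node_emb), greedy_scores)
    loss.backward()
    opt.step()
    return loss.item()
```

In the paper's setting the samples come from multiple training graphs and budgets; here they are assumed to arrive as precomputed (node_emb, greedy_scores) pairs.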
Open Source Code (Yes): "The source code can be found at https://github.com/idea-iitd/GCOMB."
Open Datasets (Yes): "Table 1(a) lists the real datasets used for our experiments. Random Bipartite Graphs (BP): we also use the synthetic random bipartite graphs from S2V-DQN [7]. Datasets are from the SNAP repository [18]."
Dataset Splits (Yes): "Train-Validation-Test split: while testing on any synthetic BP graph, we train and validate on five BP-1k graphs each. For real graphs, we train and validate on Brightkite (BK) (50:50 split for train and validation) and test on the other real networks."
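A minimal sketch of how this split protocol could be encoded, assuming placeholder graph identifiers and a hypothetical load_brightkite() loader:

```python
import random

def choose_train_val(test_graph: str, seed: int = 0):
    """Return (train set, validation set) for a given test graph,
    following the quoted protocol. Names are illustrative placeholders."""
    if test_graph.startswith("BP"):
        # Synthetic bipartite test graphs: five BP-1k graphs each for
        # training and for validation.
        return ([f"BP-1k-train-{i}" for i in range(5)],
                [f"BP-1k-val-{i}" for i in range(5)])
    # Real test graphs: a 50:50 train/validation split of Brightkite (BK).
    bk_samples = load_brightkite()  # hypothetical loader
    random.Random(seed).shuffle(bk_samples)
    mid = len(bk_samples) // 2
    return bk_samples[:mid], bk_samples[mid:]
```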
Hardware Specification (Yes): "All experiments are performed on a machine running an Intel Xeon E5-2698v4 processor with 64 cores, one NVIDIA 1080 Ti GPU with 12 GB GPU memory, and 256 GB RAM, with Ubuntu 16.04."
Software Dependencies (Yes): Ubuntu 16.04 (from the hardware description above); IBM CPLEX 12.9, 2019 (cited as a dependency).
Experiment Setup (Yes): "Training: in all our experiments, for a fair comparison of GCOMB with S2V-DQN and GCN-TreeSearch, we train all models for 12 hours and the best-performing model on the validation set is used for inference. In IMM, we set ϵ = 0.5 as suggested by the authors. In OPIM, ϵ is recommended to be kept in the range [0.01, 0.1]; thus, we set ϵ = 0.05."
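The quoted protocol amounts to time-budgeted training with validation-based model selection. A hedged sketch, assuming a PyTorch-style model and hypothetical train_one_epoch / evaluate_on_validation callables supplied by the experiment harness:

```python
import copy
import time

IMM_EPS = 0.5    # as suggested by the IMM authors
OPIM_EPS = 0.05  # within OPIM's recommended [0.01, 0.1] range
TRAIN_BUDGET_SEC = 12 * 3600  # 12-hour training budget per model

def train_with_budget(model, train_one_epoch, evaluate_on_validation):
    """Train until the time budget runs out; keep the checkpoint that
    scores best on the validation set."""
    best_score, best_state = float("-inf"), None
    deadline = time.time() + TRAIN_BUDGET_SEC
    while time.time() < deadline:
        train_one_epoch(model)
        score = evaluate_on_validation(model)
        if score > best_score:
            best_score = score
            best_state = copy.deepcopy(model.state_dict())
    return best_state  # checkpoint used for inference
```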