Self-Guiding Exploration for Combinatorial Problems

Authors: Zangir Iklassov, Yali Du, Farkhad Akimov, Martin Takáč

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present our research as the first to apply LLMs to a broad range of CPs and demonstrate that SGE outperforms existing prompting strategies by over 27.84% in CP optimization performance. Additionally, SGE achieves 2.46% higher accuracy than the best existing results on other reasoning tasks (arithmetic, commonsense, and symbolic).
Researcher Affiliation | Academia | Zangir Iklassov (MBZUAI, zangir.iklassov@mbzuai.ac.ae); Yali Du (King's College London, yali.du@kcl.ac.uk); Farkhad Akimov (MBZUAI, farkhad.akimov@mbzuai.ac.ae); Martin Takáč (MBZUAI, martin.takac@mbzuai.ac.ae)
Pseudocode | Yes | Algorithm 1: Self-Guiding Exploration algorithm, SGE()
Open Source Code | Yes | Our implementation is available online: https://github.com/Zangir/LLM-for-CP
Open Datasets | No | To facilitate these experiments, a dataset was created comprising 100 randomly generated instances for each problem size. (No public access information for this created dataset is provided in the paper.)
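Since the generated CP dataset is not public, replicators would need to regenerate it themselves. A minimal hypothetical sketch of producing 100 random instances per problem size (here, random Euclidean TSP coordinates; the paper does not document its generation procedure, coordinate ranges, or seeds, so all of these choices are assumptions):

```python
import random

def generate_tsp_instances(problem_sizes, n_instances=100, seed=0):
    """Generate random Euclidean TSP instances: n_instances per size.

    Hypothetical reconstruction -- the paper does not specify how its
    100 instances per problem size were generated.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    dataset = {}
    for size in problem_sizes:
        # Each instance is a list of `size` random (x, y) city coordinates.
        dataset[size] = [
            [(rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0)) for _ in range(size)]
            for _ in range(n_instances)
        ]
    return dataset

instances = generate_tsp_instances([10, 20, 50])
```

A fixed seed is used so that a regenerated dataset is at least internally reproducible, even if it cannot match the paper's original instances.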
Dataset Splits | No | The paper mentions 'train and test splits' for reasoning tasks, but does not give percentages, counts, or other details of how data was partitioned into training, validation, and test sets for its own experiments, especially for the custom-generated CP datasets.
Hardware Specification | Yes | The experiments utilized an NVIDIA A100 SXM 40GB GPU, paired with two AMD EPYC 7742 CPUs (8 cores each) and 256GB RAM.
Software Dependencies | No | The paper names the LLMs used (GPT-4, GPT-3.5, Gemini-1.5, Llama-2 series) and the Google OR-Tools solver, but does not provide version numbers for any of these software dependencies.
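One way a replication could mitigate the missing version information is to log the versions of whatever packages are actually installed. A small hypothetical helper (package names such as 'ortools' are examples, not versions taken from the paper):

```python
from importlib.metadata import PackageNotFoundError, version

def report_versions(packages):
    """Return a mapping of package name -> installed version string.

    Hypothetical helper for recording dependency versions (e.g. 'ortools',
    'openai') that the paper leaves unspecified. Packages that are not
    installed are reported as such rather than raising.
    """
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
    return report
```

Attaching such a report to experiment logs lets later readers recover the exact environment even when the write-up omits it.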
Experiment Setup | No | The paper describes problem instances, baselines, and evaluation metrics, but does not specify concrete hyperparameters (e.g., learning rate, batch size, optimizer settings) or the detailed configurations required to replicate the experiments.