Self-Guiding Exploration for Combinatorial Problems

Authors: Zangir Iklassov, Yali Du, Farkhad Akimov, Martin Takáč

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present our research as the first to apply LLMs to a broad range of CPs and demonstrate that SGE outperforms existing prompting strategies by over 27.84% in CP optimization performance. Additionally, SGE achieves 2.46% higher accuracy than the best existing results on other reasoning tasks (arithmetic, commonsense, and symbolic).
Researcher Affiliation | Academia | Zangir Iklassov (MBZUAI, zangir.iklassov@mbzuai.ac.ae); Yali Du (King's College London, yali.du@kcl.ac.uk); Farkhad Akimov (MBZUAI, farkhad.akimov@mbzuai.ac.ae); Martin Takáč (MBZUAI, martin.takac@mbzuai.ac.ae)
Pseudocode | Yes | Algorithm 1: Self-Guiding Exploration algorithm, SGE()
Open Source Code | Yes | Our implementation is available online: https://github.com/Zangir/LLM-for-CP
Open Datasets | No | To facilitate these experiments, a dataset was created comprising 100 randomly generated instances for each problem size. (No public access information for this created dataset is provided in the paper.)
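Since the generated CP dataset is not public, replicators would need to regenerate it themselves. A minimal hypothetical sketch of producing 100 random instances per problem size (here, random Euclidean TSP coordinates; the paper does not document its generation procedure, coordinate ranges, or seeds, so all of these choices are assumptions):

```python
import random

def generate_tsp_instances(problem_sizes, n_instances=100, seed=0):
    """Generate random Euclidean TSP instances: n_instances per size.

    Hypothetical reconstruction -- the paper does not specify how its
    100 instances per problem size were generated.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    dataset = {}
    for size in problem_sizes:
        # Each instance is a list of `size` random (x, y) city coordinates.
        dataset[size] = [
            [(rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0)) for _ in range(size)]
            for _ in range(n_instances)
        ]
    return dataset

instances = generate_tsp_instances([10, 20, 50])
```

A fixed seed is used so that a regenerated dataset is at least internally reproducible, even if it cannot match the paper's original instances.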
Dataset Splits | No | The paper mentions 'train and test splits' for reasoning tasks, but does not give percentages, counts, or other details of how data was partitioned into training, validation, and test sets for its own experiments, especially for the custom-generated CP datasets.
Hardware Specification | Yes | The experiments utilized an NVIDIA A100 SXM 40GB GPU, paired with two AMD EPYC 7742 CPUs (8 cores each) and 256GB RAM.
Software Dependencies | No | The paper names the LLMs used (GPT-4, GPT-3.5, Gemini-1.5, Llama-2 series) and the Google OR-Tools solver, but does not provide version numbers for any of these software dependencies.
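One way a replication could mitigate the missing version information is to log the versions of whatever packages are actually installed. A small hypothetical helper (package names such as 'ortools' are examples, not versions taken from the paper):

```python
from importlib.metadata import PackageNotFoundError, version

def report_versions(packages):
    """Return a mapping of package name -> installed version string.

    Hypothetical helper for recording dependency versions (e.g. 'ortools',
    'openai') that the paper leaves unspecified. Packages that are not
    installed are reported as such rather than raising.
    """
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
    return report
```

Attaching such a report to experiment logs lets later readers recover the exact environment even when the write-up omits it.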
Experiment Setup | No | The paper describes problem instances, baselines, and evaluation metrics, but does not specify concrete hyperparameters (e.g., learning rate, batch size, optimizer settings) or the detailed configurations required to replicate the experiments.