reproducibilityindex.ai

Chain-of-Experts: When LLMs Meet Complex Operations Research Problems

Authors: Ziyang Xiao, Dongxiang Zhang, Yangjun Wu, Lilin Xu, Yuan Jessica Wang, Xiongwei Han, Xiaojin Fu, Tao Zhong, Jia Zeng, Mingli Song, Gang Chen

ICLR 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental results show that CoE significantly outperforms the state-of-the-art LLM-based approaches both on LPWP and ComplexOR.
Researcher Affiliation	Collaboration	1 Zhejiang University; 2 Huawei Noah's Ark Lab
Pseudocode	Yes	Algorithm 1 provides the implementation pseudo-code of the Chain-of-Expert framework, which consists of four main stages:
Open Source Code	Yes	The experimental code is at https://github.com/xzymustbexzy/Chain-of-Experts.
Open Datasets	Yes	LPWP. The LPWP dataset (Ramamonjison et al., 2022b) is collected from the NL4Opt competition in NeurIPS 2022... A benchmark dataset was curated by Ramamonjison et al. (2022a) (footnote 1: https://github.com/nl4opt/nl4opt-competition). ComplexOR. With the assistance from three specialists with expertise in operations research, we constructed and released the first dataset for complex OR problems.
Dataset Splits	Yes	The dataset is partitioned into 713 training samples, 99 validation samples, and 289 test samples for performance evaluation.
Hardware Specification	No	The paper does not specify the hardware (e.g., CPU/GPU models, memory) used for running the experiments. It only mentions the LLMs used (GPT-3.5-turbo, GPT-4, Claude2).
Software Dependencies	No	The paper mentions using GPT-3.5-turbo, GPT-4, Claude2, Gurobi, NumPy, SciPy, and PuLP, but gives no version numbers for any of them; reproducing the results requires those versions.
Experiment Setup	Yes	We set the parameter temperature to a value of 0.7 and conduct five runs to average the metrics. The number of iterations is set to 3, with each iteration consisting of 5 forward steps by default.
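
The Experiment Setup row describes a simple protocol: fix the sampling temperature, repeat the evaluation five times, and average the metrics. The sketch below is one way that protocol might be written down. It is not taken from the authors' repository: `RunConfig`, `average_metrics`, `forward_step_budget`, and the `run_once` callback are hypothetical names introduced here for illustration, and only the four numeric values come from the row above.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict


@dataclass(frozen=True)
class RunConfig:
    """Values quoted in the Experiment Setup row; field names are hypothetical."""
    temperature: float = 0.7   # LLM sampling temperature
    num_runs: int = 5          # independent runs whose metrics are averaged
    num_iterations: int = 3    # iterations per problem
    forward_steps: int = 5     # forward steps per iteration (default)


def average_metrics(
    run_once: Callable[[RunConfig, int], Dict[str, float]],
    config: RunConfig,
) -> Dict[str, float]:
    """Call run_once for each run index and return the per-metric mean.

    run_once is an assumed callback that performs one full evaluation
    and returns a mapping from metric name to value.
    """
    results = [run_once(config, run_index) for run_index in range(config.num_runs)]
    return {name: mean(result[name] for result in results) for name in results[0]}


def forward_step_budget(config: RunConfig) -> int:
    """Upper bound on forward steps per problem: iterations times steps."""
    return config.num_iterations * config.forward_steps
```

Under the quoted defaults, `forward_step_budget(RunConfig())` is 3 × 5 = 15. Because the temperature is nonzero, individual runs can differ, which is presumably why the metrics are averaged over five runs rather than reported from one.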