OptiMUS: Scalable Optimization Modeling with (MI)LP Solvers and Large Language Models

Authors: Ali Ahmaditeshnizi, Wenzhi Gao, Madeleine Udell

ICML 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Experiments demonstrate that OptiMUS outperforms existing state-of-the-art methods on easy datasets by more than 20% and on hard datasets (including a new dataset, NLP4LP, released with this paper that features long and complex problems) by more than 30%." |
| Researcher Affiliation | Academia | "1 Department of Management Science and Engineering, Stanford University, CA, USA; 2 Institute for Computational and Mathematical Engineering, Stanford University, CA, USA." |
| Pseudocode | Yes | "Algorithm 1: Workflow of OptiMUS" |
| Open Source Code | Yes | "The implementation and the datasets are available at https://github.com/teshnizi/OptiMUS." |
| Open Datasets | Yes | "The implementation and the datasets are available at https://github.com/teshnizi/OptiMUS." The paper introduces NLP4LP, an open-source dataset of 67 complex optimization problems. NL4OPT is a collection of 1101 easy linear programming problems proposed as part of the NL4OPT competition (Ramamonjison et al., 2023). ComplexOR is a collection of 37 complex operations research problems spanning a variety of application domains (Xiao et al., 2024). |
| Dataset Splits | No | The paper describes the datasets used (NL4OPT, ComplexOR, NLP4LP) as collections of problems or benchmarks, but does not specify explicit training, validation, and test splits for its experiments; it evaluates on "instances" from these datasets. |
| Hardware Specification | No | The paper does not describe the hardware (e.g., specific CPU or GPU models, or memory) used to run the experiments. It names the LLMs used (GPT-4, GPT-3.5, Mixtral-8x7B) and the solver (Gurobi), but not the underlying hardware. |
| Software Dependencies | No | The paper states that "the programmer uses Python as the programming language and Gurobi as the solver," but does not provide version numbers for Python, Gurobi, or any other software dependencies. |
| Experiment Setup | No | The paper describes the system's modular architecture and agent interactions, but does not report concrete experimental settings such as LLM API hyperparameters (e.g., temperature, top_p); its sensitivity analysis covers only the maximum number of agent calls. |
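The last two gaps above (unpinned software versions and unspecified LLM sampling parameters) are the kind a replication can close by logging them explicitly at run time. A minimal sketch in Python, the paper's own implementation language; the package list and the `LLM_PARAMS` values below are illustrative assumptions, not settings reported by the paper:

```python
import json
import sys
from importlib import metadata

# Hypothetical sampling parameters: a replication would need to fix and
# report these explicitly, since the paper does not state them.
LLM_PARAMS = {"model": "gpt-4", "temperature": 0.0, "top_p": 1.0}


def environment_report(packages=("gurobipy", "openai")):
    """Record the interpreter version, installed package versions, and LLM
    settings, so an experiment log pins what the paper leaves unspecified."""
    report = {"python": sys.version.split()[0], "llm_params": LLM_PARAMS}
    for pkg in packages:
        try:
            report[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            report[pkg] = "not installed"
    return report


if __name__ == "__main__":
    print(json.dumps(environment_report(), indent=2))
```

Emitting such a report alongside each benchmark run would make the Software Dependencies and Experiment Setup entries reproducible even when the surrounding environment changes.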