reproducibilityindex.ai

Reinforcement Learning for Route Optimization with Robustness Guarantees

Authors: Tobias Jacobs, Francesco Alesiani, Gulcin Ermis

IJCAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We validate our approach by solving min-max versions of standard benchmarks for the Capacitated Vehicle Routing and the Traveling Salesperson Problem, where our agents obtain near-optimal solutions and improve upon the baselines.
Researcher Affiliation	Industry	Tobias Jacobs , Francesco Alesiani , Gülcin Ermis NEC Laboratories Europe Gmb H, Kurfürsten-Anlage 36, 69115 Heidelberg, Germany tobias.jacobs@neclab.eu , francesco.alesiani@neclab.eu , gulcin.ermis@neclab.eu
Pseudocode	Yes	Algorithm 1 contains a pseudo-code description of the environment behavior during training, which is determined by the set of training instances or initial states, the state transition function σ, the objective f, and the baseline policy π0.
Open Source Code	No	The paper mentions using existing open-source frameworks like Dopamine and OpenAI Gym, but does not state that the authors' own implementation code for their methodology is made publicly available.
Open Datasets	Yes	Our generation method corresponds to one of the generators used in the 8th DIMACS Implementation Challenge [Johnson and Mc Geoch, 2007]. Furthermore, for the CVRP problem we test the trained agents on benchmark instance sets provided by CVRPlib; see [Uchoa et al., 2017] for more information on the instances and their origin.
Dataset Splits	No	The paper mentions independent training and test sets, but does not explicitly specify a validation set or detailed dataset split percentages/counts for reproduction.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments, only mentioning 'one hour of CPU runtime' for a baseline comparison.
Software Dependencies	No	The paper mentions software components like 'Dopamine framework', 'open AI gym', and a 'MIP solver', but does not provide specific version numbers for any of them.
Experiment Setup	Yes	The structure of QΘ follows the structure2vec architecture [Dai et al., 2016]... For performance reasons, the degree of the input graph is restricted to the 10 nearest neighbors of each node... In one variant we are masking infeasible actions using Equation 5. In the second variant each convolution layer is parameterized by individual weights... All instances of CVRP and TSP have been extended to robust (min-max) problems, where we deﬁned uncertain distances by setting α = 0.3 and β = 2.