Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.

Distributionally Robust Optimization with Markovian Data

Authors: Mengmeng Li, Tobias Sutter, Daniel Kuhn

ICML 2021 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Numerical experiments indicate that our approach has better computational and statistical properties than the state-of-the-art methods. [...] We now validate the out-of-sample guarantees of the proposed Markov DRO approach experimentally when the observable data is indeed generated by ergodic Markov chains and compare our method against three state-of-the-art approaches to data-driven decision making from the literature. [...] Figures 2 and 4 show that the proposed Markov DRO method outperforms the SAA scheme and the DRO approach by Van Parys et al. (2021) tailored to i.i.d. data (denoted as i.i.d. DRO) in that its out-of-sample risk displays a smaller mean as well as a smaller variance.
Researcher Affiliation	Academia	Mengmeng Li 1 Tobias Sutter 1 Daniel Kuhn 1 [...] 1Risk Analytics and Optimization Chair, École Polytechnique Fédérale de Lausanne. Correspondence to: Mengmeng Li <mengmeng.li@epfl.ch>.
Pseudocode	Yes	Algorithm 1 Frank-Wolfe algorithm for solving (7) [...] Algorithm 2 Solution of the oracle subproblem [...] The exact algorithm that we use to solve (4b) is outlined in Algorithm 3 in Appendix E.
Open Source Code	Yes	The Matlab code for reproducing all results is available from https://github.com/mkvdro/DRO_Markov.
Open Datasets	Yes	The second experiment is based on a marketing dataset from Kaggle,2 which tracks the purchasing behavior of 2,000 customers with respect to d = 5 brands of chocolates. [...] 2https://www.kaggle.com/khalidnasereddin/retail-dataset-analysis
Dataset Splits	No	The paper does not explicitly provide details about training/validation/test splits with percentages, sample counts, or references to predefined splits for the datasets used in experiments. It refers to "training sample sizes T" but not how a larger dataset is partitioned.
Hardware Specification	Yes	The problem instances are modelled in MATLAB, and all experiments are run on an Intel i5-5257U CPU (2.7GHz) computer with 16GB RAM.
Software Dependencies	No	The paper states that "The problem instances are modelled in MATLAB" but does not specify the version number of MATLAB or any other software dependencies with version numbers.
Experiment Setup	Yes	In the first experiment we solve a random instance of the revenue maximization problem with n = 5 customer groups and d = 10 brands, where the weight vector w and the price vector a are sampled from the uniform distributions on n and {1, ..., 10}^d, respectively. [...] The second experiment is based on a marketing dataset from Kaggle [...] The customers are clustered into n = 5 groups based on their age, education level and income by using the K-means++ clustering algorithm by Vassilvitskii & Arthur (2006). [...] To this end, we solve 10 instances of the worst-case expectation problem (11) with rate parameter r = 1 for a fixed decision x sampled from the uniform distribution on {0, 1}^d. The 10 instances involve independent training sets of size T = 5,000, each of which is sampled from the same fixed Markov chain constructed as in the experiments with synthetic data.
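Illustration: The sampling steps quoted under Experiment Setup (a fixed decision x drawn uniformly from {0, 1}^d, a price vector drawn uniformly from {1, ..., 10}^d, and 10 independent training trajectories of length T = 5,000 from one fixed Markov chain) can be sketched roughly as follows. This is a minimal Python sketch, not the authors' MATLAB code. The excerpt does not say how the chain is constructed or which state a trajectory starts from, so `random_transition_matrix`, the starting state 0, and all function names are hypothetical choices made only for illustration.

```python
import random


def random_transition_matrix(d, rng):
    """Hypothetical chain construction (not from the paper): every row is a
    vector of strictly positive weights normalized to sum to 1, which makes
    the chain irreducible and aperiodic, hence ergodic."""
    rows = []
    for _ in range(d):
        weights = [rng.random() + 1e-3 for _ in range(d)]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows


def sample_trajectory(P, T, rng, start=0):
    """Sample T consecutive states of the chain with transition matrix P.
    The starting state is an assumption; the excerpt does not specify it."""
    d = len(P)
    states = [start]
    for _ in range(T - 1):
        current = states[-1]
        states.append(rng.choices(range(d), weights=P[current])[0])
    return states


def sample_setup(d=10, T=5000, num_instances=10, seed=0):
    """Draw one fixed chain, one fixed decision x, one price vector a,
    and num_instances training trajectories from the same chain."""
    rng = random.Random(seed)
    P = random_transition_matrix(d, rng)        # one chain shared by all instances
    x = [rng.randint(0, 1) for _ in range(d)]   # uniform on {0, 1}^d
    a = [rng.randint(1, 10) for _ in range(d)]  # uniform on {1, ..., 10}^d
    training_sets = [sample_trajectory(P, T, rng) for _ in range(num_instances)]
    return P, x, a, training_sets
```

Calling `sample_setup()` with its defaults mirrors the quoted sizes (d = 10, T = 5,000, 10 instances). Fixing `seed` makes the draws repeatable, which is the kind of detail the Dataset Splits and Software Dependencies rows flag as missing from the paper.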