Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Online Learning in the Repeated Mediated Newsvendor Problem

Authors: Nataša Bolić, Tom Cesari, Roberto Colomboni, Christian Paravalos

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we empirically validate our theoretical results by simulating a variety of supplier cost distributions and retailer utility functions. Our goal is to illustrate the practical effectiveness of our algorithm and the necessity of Assumptions 1.1–1.4.
Researcher Affiliation | Academia | Nataša Bolić (1), Tommaso Cesari (1), Roberto Colomboni (2, 3), Christian Paravalos (1). (1) EECS, University of Ottawa, Ottawa, Canada; (2) DEIB, Politecnico di Milano, Milano, Italy; (3) Department of CS, Università degli Studi di Milano, Milano, Italy
Pseudocode | Yes | We present an algorithm (Algorithm 1) for the repeated mediated newsvendor problem and prove that, under Assumptions 1.1–1.4, the algorithm achieves a $\tilde{O}(T^{2/3})$ regret (Theorem 3.3). ... Algorithm 1 input: Time horizon $T \in \mathbb{N}$, discretization parameter $K \in \mathbb{N}$, and confidence parameter $\delta \in (0, 1)$ init: For all $j \in \{0, \dots, K\}$, set $p_j := j/K$, $A_j(0) := B_j(0) := N_j(0) := 0$ 1: for each time $t = 1, 2, \dots$ do 2: if $t \le K^2$ then select $I_t := t - 1 \pmod{K}$ 3: if $t > K^2$ then ...
Open Source Code | Yes | Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: We provide the code and explain how to run it.
Open Datasets | No | To validate our theoretical upper bound, we evaluate the regret of Algorithm 1 across several supplier cost distributions designed to represent realistic mediated market conditions. Firstly, we use the Uniform(0, 1) distribution as a neutral baseline for our experiments. Then, to model markets where low-cost suppliers are prevalent but occasional high-cost suppliers exist, we consider two right-skewed distributions: a Beta(α = 2, β = 5) and a Log-Normal(µ = 0.5, σ = 1) truncated to [0, 1], with the latter assigning more probability to high-cost suppliers. Finally, to represent markets with two distinct tiers of suppliers, we include a bimodal mixture 0.75 · Beta(2, 5) + 0.25 · Beta(5, 2), which creates a majority group of low-cost suppliers and a smaller, higher-cost group. We also consider two realistic and prevalent families of retailer utility functions.
Dataset Splits | No | The paper describes distributions for generating synthetic data for simulations rather than using pre-existing datasets with defined splits.
Hardware Specification | Yes | For reference, we ran our experiments on a MacBook Pro with an M1 Pro chip (10-core CPU, 16-core GPU) and 16 GB of RAM.
Software Dependencies | No | The paper does not explicitly list any specific software dependencies with version numbers used for the experiments.
Experiment Setup | Yes | To validate our theoretical upper bound, we evaluate the regret of Algorithm 1 across several supplier cost distributions designed to represent realistic mediated market conditions. Firstly, we use the Uniform(0, 1) distribution as a neutral baseline for our experiments. Then, to model markets where low-cost suppliers are prevalent but occasional high-cost suppliers exist, we consider two right-skewed distributions: a Beta(α = 2, β = 5) and a Log-Normal(µ = 0.5, σ = 1) truncated to [0, 1], with the latter assigning more probability to high-cost suppliers. Finally, to represent markets with two distinct tiers of suppliers, we include a bimodal mixture 0.75 · Beta(2, 5) + 0.25 · Beta(5, 2), which creates a majority group of low-cost suppliers and a smaller, higher-cost group. We also consider two realistic and prevalent families of retailer utility functions. Firstly, we model satiable demand through the capped-linear utility function $U_{a,\bar{q}}(q) = \min\{a q, a \bar{q}\}$, where the retailer's utility grows at a constant marginal rate $a \in [0, 1]$ until the quantity purchased reaches the saturation threshold $\bar{q} \in [0, 1]$, after which purchasing additional goods generates no further benefit. For our experiments, we draw $a$ from a Beta(5, 2) distribution and $\bar{q}$ from a Beta(2, 2) distribution. Secondly, we model diminishing marginal returns using the exponential utility function $U_\lambda(q) = (1 - e^{-\lambda q})/\lambda$, where $\lambda > 0$ determines the overall valuation level. For our experiments, we draw $\lambda$ from a Log-Normal(0, 0.5) distribution.
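
The simulation setup quoted under Experiment Setup can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' released code: the function names, the string keys for the distributions, and the use of rejection sampling for the truncated Log-Normal are all assumptions made here. It also assumes that the second parameter of Log-Normal(0, 0.5) is the standard deviation of the underlying normal, which the quoted text does not state.

```python
import numpy as np


def sample_truncated_lognormal(rng, n, mu=0.5, sigma=1.0):
    """Draw n samples from Log-Normal(mu, sigma) restricted to [0, 1].

    Uses plain rejection sampling (an assumption; the paper does not say
    how the truncation is implemented).
    """
    out = np.empty(0)
    while out.size < n:
        draws = rng.lognormal(mean=mu, sigma=sigma, size=4 * n)
        out = np.concatenate([out, draws[draws <= 1.0]])
    return out[:n]


def sample_costs(rng, name, n):
    """Sample n supplier costs in [0, 1]; `name` is a hypothetical key."""
    if name == "uniform":
        return rng.uniform(0.0, 1.0, size=n)
    if name == "beta":
        return rng.beta(2.0, 5.0, size=n)
    if name == "lognormal":
        return sample_truncated_lognormal(rng, n)
    if name == "bimodal":
        # Mixture 0.75 * Beta(2, 5) + 0.25 * Beta(5, 2): pick a component
        # per sample, then take the draw from that component.
        low = rng.beta(2.0, 5.0, size=n)
        high = rng.beta(5.0, 2.0, size=n)
        pick_low = rng.random(n) < 0.75
        return np.where(pick_low, low, high)
    raise ValueError(f"unknown cost distribution: {name}")


def capped_linear_utility(q, a, q_bar):
    """U_{a, q_bar}(q) = min(a * q, a * q_bar)."""
    return np.minimum(a * q, a * q_bar)


def exponential_utility(q, lam):
    """U_lambda(q) = (1 - exp(-lambda * q)) / lambda, with lambda > 0."""
    return (1.0 - np.exp(-lam * q)) / lam


def sample_capped_linear_params(rng, n):
    """a ~ Beta(5, 2) and q_bar ~ Beta(2, 2), as in the quoted setup."""
    return rng.beta(5.0, 2.0, size=n), rng.beta(2.0, 2.0, size=n)


def sample_exponential_params(rng, n):
    """lambda ~ Log-Normal(0, 0.5); 0.5 is assumed to be the log-scale std."""
    return rng.lognormal(mean=0.0, sigma=0.5, size=n)
```

For example, `sample_costs(np.random.default_rng(0), "bimodal", 1000)` returns 1000 costs in [0, 1], and `capped_linear_utility(q, a, q_bar)` accepts scalars or arrays for any of its arguments. Rejection sampling is workable here because roughly 31% of Log-Normal(0.5, 1) draws fall at or below 1; an inverse-CDF sampler would be an equally valid choice.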