Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..

Learning-Augmented Algorithms for $k$-median via Online Learning

Authors: Anish Hebbar, Rong Ge, Amit Kumar, Debmalya Panigrahi

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Experimental Evaluation. We perform experiments to validate our theoretical results empirically. We first run our algorithms on synthetic data generated from simple distributions such as a uniform distribution on a square or on a set of distinct clusters. In these cases, we confirm that the algorithms quickly converge to the optimal solution, and the average performance is nearly optimal in long run, thereby significantly outperforming theoretical bounds. In our second set of experiments, we use a sequence of dynamically changing distributions with the goal of understanding the responsiveness of our algorithm to changing data. We observe that as the algorithm discovers new data sets corresponding to new regions of the metric space, it reacts quickly to the changing environment and changes the location of k centers, in some cases even outperforms the static best solution in hindsight.
Researcher Affiliation Academia Anish Hebbar Rong Ge Duke University Duke University EMAIL EMAIL Amit Kumar Debmalya Panigrahi IIT Delhi Duke University EMAIL EMAIL
Pseudocode Yes Algorithm 1: Algorithm for an instance of LEARN-BOUNDED-MEDIAN
Open Source Code Yes The code used for the experiments is available at https://github.com/neurips2025-colab/neurips2025, along with instructions.
Open Datasets No We first run our algorithms on synthetic data generated from simple distributions such as a uniform distribution on a square or on a set of distinct clusters. In these cases, we confirm that the algorithms quickly converge to the optimal solution, and the average performance is nearly optimal in long run, thereby significantly outperforming theoretical bounds. In our second set of experiments, we use a sequence of dynamically changing distributions with the goal of understanding the responsiveness of our algorithm to changing data.
Dataset Splits No In Uniform Square, each round has a set of points sampled uniformly from the unit square [0, 1] [0, 1]. 10 points arrive each round, chosen uniformly at random, and we run the experiment with a time horizon of T = 1000.
Hardware Specification Yes Unless mentioned otherwise, experiments were conducted using Google Colab (code available at https://github.com/neurips2025-colab/neurips2025), using four Intel Xeon CPUs (2.20 GHz) with 13 GB of RAM each, running in parallel over about 24 compute hours.
Software Dependencies No We compute optimal-in-hindsight solutions for the instances using Gurobi s ILP solver, accessed under an academic license.
Experiment Setup Yes In all our experiments, we use a heuristic to improve the performance of the rounding algorithms, where we do a binary search over the best threshold on the distance of centers to open (this value was (2k + 2)D(yt, i) for deterministic rounding and 4D(yt, i) for randomized rounding). We set this threshold to the smallest possible that still gives the desired number of centers k. We report the main experimental findings in this section, and give additional results and experiments in the appendix.