Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..
Learning-Augmented Algorithms for $k$-median via Online Learning
Authors: Anish Hebbar, Rong Ge, Amit Kumar, Debmalya Panigrahi
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental Evaluation. We perform experiments to validate our theoretical results empirically. We first run our algorithms on synthetic data generated from simple distributions such as a uniform distribution on a square or on a set of distinct clusters. In these cases, we confirm that the algorithms quickly converge to the optimal solution, and the average performance is nearly optimal in long run, thereby significantly outperforming theoretical bounds. In our second set of experiments, we use a sequence of dynamically changing distributions with the goal of understanding the responsiveness of our algorithm to changing data. We observe that as the algorithm discovers new data sets corresponding to new regions of the metric space, it reacts quickly to the changing environment and changes the location of k centers, in some cases even outperforms the static best solution in hindsight. |
| Researcher Affiliation | Academia | Anish Hebbar Rong Ge Duke University Duke University EMAIL EMAIL Amit Kumar Debmalya Panigrahi IIT Delhi Duke University EMAIL EMAIL |
| Pseudocode | Yes | Algorithm 1: Algorithm for an instance of LEARN-BOUNDED-MEDIAN |
| Open Source Code | Yes | The code used for the experiments is available at https://github.com/neurips2025-colab/neurips2025, along with instructions. |
| Open Datasets | No | We first run our algorithms on synthetic data generated from simple distributions such as a uniform distribution on a square or on a set of distinct clusters. In these cases, we confirm that the algorithms quickly converge to the optimal solution, and the average performance is nearly optimal in long run, thereby significantly outperforming theoretical bounds. In our second set of experiments, we use a sequence of dynamically changing distributions with the goal of understanding the responsiveness of our algorithm to changing data. |
| Dataset Splits | No | In Uniform Square, each round has a set of points sampled uniformly from the unit square [0, 1] [0, 1]. 10 points arrive each round, chosen uniformly at random, and we run the experiment with a time horizon of T = 1000. |
| Hardware Specification | Yes | Unless mentioned otherwise, experiments were conducted using Google Colab (code available at https://github.com/neurips2025-colab/neurips2025), using four Intel Xeon CPUs (2.20 GHz) with 13 GB of RAM each, running in parallel over about 24 compute hours. |
| Software Dependencies | No | We compute optimal-in-hindsight solutions for the instances using Gurobi s ILP solver, accessed under an academic license. |
| Experiment Setup | Yes | In all our experiments, we use a heuristic to improve the performance of the rounding algorithms, where we do a binary search over the best threshold on the distance of centers to open (this value was (2k + 2)D(yt, i) for deterministic rounding and 4D(yt, i) for randomized rounding). We set this threshold to the smallest possible that still gives the desired number of centers k. We report the main experimental findings in this section, and give additional results and experiments in the appendix. |