Fully Dynamic $k$-Clustering in $\tilde O(k)$ Update Time

Authors: Sayan Bhattacharya, Martín Costa, Silvio Lattanzi, Nikos Parotsidis

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We complement our theoretical analysis with the first in-depth experimental study for the dynamic k-median problem on general metrics, focusing on comparing our dynamic algorithm to the current state-of-the-art by Henzinger and Kale [20]. ... We supplement the above theorem with experiments that compare our algorithm with that of [20]. To the best of our knowledge, this is the first work in the dynamic clustering literature with a detailed experimental evaluation of dynamic k-median algorithms for general metrics.
Researcher Affiliation Collaboration Sayan Bhattacharya University of Warwick s.bhattacharya@warwick.ac.uk Martin Costa University of Warwick martin.costa@warwick.ac.uk Silvio Lattanzi Google Research silviol@google.com Nikos Parotsidis Google Research nikosp@google.com
Pseudocode Yes Algorithm 1 Static Algo(U), Algorithm 2 Almost Cover(U), Algorithm 3 Preprocess(U), Algorithm 4 Construct From Layer(i), Algorithm 5 Insert(x), Algorithm 6 Delete(x), Algorithm 7 Rebuild
Open Source Code Yes All of our code is written in Java and is available online.4 [footnote 4: https://github.com/martin-costa/NeurIPS23-dynamic-k-clustering]
Open Datasets Yes Datasets. We conduct the empirical evaluation of our algorithm on five datasets from the UCI repository [17]: (1) Census [23] with 2, 458, 285 points of dimension 68, (2) KDD-Cup [32] containing 311, 029 points of dimension 74, (3) Song [5] with 515, 345 points of dimension 90, (4) Drift [33, 30] with 13, 910 points of dimension 129, and (5) SIFT10M [17] with 11, 164, 866 points of dimension 128.
Dataset Splits No The paper describes a sliding window approach for generating dynamic instances but does not specify a traditional train/validation/test dataset split.
Hardware Specification Yes We used a machine with 8 cores, a 2.9Ghz processor, and 16 Gi B of main memory.
Software Dependencies No The paper states 'All of our code is written in Java' but does not specify the version of Java or any other software dependencies with their version numbers.
Experiment Setup Yes OURALG has a parameter ϕ which we set to be the number of points sampled during the construction of a layer in Algorithm 2. ... We fix the constants ϵ and β used by our algorithm (see Section 3) to 0.2 and 0.5 respectively. HK has a parameter ψ which we set to be the number of points sampled during the static coreset construction from [9]... For each of ϕ and ψ, we experimented with the values 250, 500, 1000. ... We compare OURALG(ϕ = 500) against HK(ψ = 1000).