Communication-Optimal Distributed Clustering

Authors: Jiecao Chen, He Sun, David Woodruff, Qin Zhang

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We implement our algorithms and demonstrate this phenomenon on real life datasets, showing that our algorithms are also very efficient in practice. Section 5 (Experiments): In this section we present experimental results for spectral graph clustering in the message passing and blackboard models.
Researcher Affiliation | Collaboration | Jiecao Chen, Indiana University, Bloomington, IN 47401, jiecchen@indiana.edu; He Sun, University of Bristol, Bristol, BS8 1UB, UK, h.sun@bristol.ac.uk; David P. Woodruff, IBM Research Almaden, San Jose, CA 95120, dpwoodru@us.ibm.com; Qin Zhang, Indiana University, Bloomington, IN 47401, qzhangcs@indiana.edu
Pseudocode | No | The paper describes algorithms in prose and mathematical expressions but does not include structured pseudocode blocks.
Open Source Code | No | The paper does not provide any explicit statements or links indicating that the source code for their described methodology is openly available.
Open Datasets | No | The paper describes the datasets (Twomoons, Gauss, Sculpture) in detail, but it does not provide specific links, DOIs, or citations with author/year information for public access to these datasets.
Dataset Splits | No | The paper describes the datasets used but does not specify training, validation, or test splits by percentage or absolute counts, nor does it refer to standard predefined splits.
Hardware Specification | Yes | Our experiments were conducted on an IBM NeXtScale nx360 M4 server, which is equipped with 2 Intel Xeon E5-2652 v2 8-core processors, 32GB RAM and 250GB local storage.
Software Dependencies | No | We implemented the algorithms using multiple languages, including Matlab, Python and C++. The paper lists programming languages but does not provide specific version numbers for any software dependencies, libraries, or solvers.
Experiment Setup | Yes | We implemented the algorithms using multiple languages, including Matlab, Python and C++. Our experiments were conducted on an IBM NeXtScale nx360 M4 server, which is equipped with 2 Intel Xeon E5-2652 v2 8-core processors, 32GB RAM and 250GB local storage. In the message passing model each site samples 5n edges; in the blackboard model all sites jointly sample 10n edges and the chain has length 18.
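
The two edge budgets in the Experiment Setup row can be compared with a short sketch. It is an illustration only: it assumes uniform sampling without replacement, which is not necessarily the sampling distribution used in the paper, and the function names `message_passing_sample` and `blackboard_sample` are hypothetical. Pooling all edges in the blackboard function stands in for the joint sampling and does not model the actual protocol.

```python
import random


def message_passing_sample(site_edges, n, per_site_factor=5, seed=0):
    """Each site independently keeps at most per_site_factor * n of its own edges.

    Assumption: uniform sampling without replacement (illustrative only).
    """
    rng = random.Random(seed)
    budget = per_site_factor * n
    return [rng.sample(edges, min(budget, len(edges))) for edges in site_edges]


def blackboard_sample(site_edges, n, joint_factor=10, seed=0):
    """All sites together keep at most joint_factor * n edges.

    Assumption: edges are pooled and sampled uniformly; a real blackboard
    protocol would coordinate through shared messages instead.
    """
    rng = random.Random(seed)
    budget = joint_factor * n
    pooled = [edge for edges in site_edges for edge in edges]
    return rng.sample(pooled, min(budget, len(pooled)))


if __name__ == "__main__":
    n = 4  # number of vertices (toy value)
    # Three sites, each holding 30 placeholder edges tagged by site index.
    sites = [[(s, i) for i in range(30)] for s in range(3)]
    mp = message_passing_sample(sites, n)
    bb = blackboard_sample(sites, n)
    print("message passing total:", sum(len(x) for x in mp))
    print("blackboard total:", len(bb))
```

Under these assumptions the message passing total grows with the number of sites (5n per site), while the blackboard total stays at 10n regardless of how many sites participate.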