A Practical Algorithm for Distributed Clustering and Outlier Detection
Authors: Jiecao Chen, Erfan Sadeqi Azer, Qin Zhang
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments on both real and synthetic data have demonstrated the clear superiority of our algorithm against all the baseline algorithms in almost all metrics. |
| Researcher Affiliation | Academia | Jiecao Chen, Indiana University Bloomington, Bloomington, IN, jiecchen@indiana.edu; Erfan Sadeqi Azer, Indiana University Bloomington, Bloomington, IN, esadeqia@indiana.edu; Qin Zhang, Indiana University Bloomington, Bloomington, IN, qzhangcs@indiana.edu |
| Pseudocode | Yes | Algorithm 1: Summary-Outliers(X, k, t) |
| Open Source Code | No | The paper does not provide a direct link or explicit statement about the public availability of the source code for the described methodology. |
| Open Datasets | Yes | kdd Full. This dataset is from 1999 kddcup competition and contains instances describing connections of sequences of tcp packets. |
| Dataset Splits | No | The paper mentions data is 'randomly partitioned among the sites' but does not provide specific percentages or counts for training, validation, or test splits. It does not mention a 'validation' set specifically. |
| Hardware Specification | Yes | All experiments are conducted in a Power Edge R730 server equipped with 2 x Intel Xeon E5-2667 v3 3.2GHz. This server has 8-core/16-thread per CPU, 192GB Memory and 1.6TB SSD. |
| Software Dependencies | No | The paper mentions 'C++ with Boost.MPI support' and 'Armadillo Sanderson (2010) as the numerical linear library' but does not specify version numbers for these software dependencies. |
| Experiment Setup | Yes | We fix α = 2 and β = 4.5 in the subroutine Algorithm 1. ... k = 3, t = 8752 for kdd Sp and t = 45747 for kdd Full |