reproducibilityindex.ai

A Practical Algorithm for Distributed Clustering and Outlier Detection

Authors: Jiecao Chen, Erfan Sadeqi Azer, Qin Zhang

NeurIPS 2018

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experiments on both real and synthetic data have demonstrated the clear superiority of our algorithm against all the baseline algorithms in almost all metrics.
Researcher Affiliation	Academia	Jiecao Chen, Indiana University Bloomington, Bloomington, IN, jiecchen@indiana.edu; Erfan Sadeqi Azer, Indiana University Bloomington, Bloomington, IN, esadeqia@indiana.edu; Qin Zhang, Indiana University Bloomington, Bloomington, IN, qzhangcs@indiana.edu
Pseudocode	Yes	Algorithm 1: Summary-Outliers(X, k, t)
Open Source Code	No	The paper does not provide a direct link or explicit statement about the public availability of the source code for the described methodology.
Open Datasets	Yes	kdd Full. This dataset is from 1999 kddcup competition and contains instances describing connections of sequences of tcp packets.
Dataset Splits	No	The paper mentions data is 'randomly partitioned among the sites' but does not provide specific percentages or counts for training, validation, or test splits. It does not mention a 'validation' set specifically.
Hardware Specification	Yes	All experiments are conducted in a Power Edge R730 server equipped with 2 x Intel Xeon E5-2667 v3 3.2GHz. This server has 8-core/16-thread per CPU, 192GB Memory and 1.6TB SSD.
Software Dependencies	No	The paper mentions 'C++ with Boost.MPI support' and 'Armadillo Sanderson (2010) as the numerical linear library' but does not specify version numbers for these software dependencies.
Experiment Setup	Yes	We fix α = 2 and β = 4.5 in the subroutine Algorithm 1. ... k = 3, t = 8752 for kdd Sp and t = 45747 for kdd Full
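
The Dataset Splits row notes that the data is 'randomly partitioned among the sites' rather than divided into training, validation, and test sets. A minimal sketch of what such a partition could look like, assuming each point is assigned to a site independently and uniformly at random. The function name `random_partition`, the `num_sites` parameter, and the seed handling are hypothetical and chosen for illustration. The paper's own code (C++ with Boost.MPI) is not public, so this is not the authors' implementation.

```python
import random


def random_partition(points, num_sites, seed=0):
    """Assign each point to one of `num_sites` sites uniformly at random.

    Hypothetical helper for illustration only; the assignment rule
    (independent, uniform) is an assumption, not taken from the paper.
    """
    if num_sites < 1:
        raise ValueError("num_sites must be at least 1")
    rng = random.Random(seed)  # fixed seed so the partition is repeatable
    sites = [[] for _ in range(num_sites)]
    for point in points:
        # Pick a site index in [0, num_sites) for this point.
        sites[rng.randrange(num_sites)].append(point)
    return sites


if __name__ == "__main__":
    # Toy 2-D points standing in for a real dataset.
    data = [(float(i), float(i % 7)) for i in range(1000)]
    parts = random_partition(data, num_sites=4, seed=42)
    print([len(p) for p in parts])
```

Every point lands on exactly one site, so the site sizes always sum to the dataset size. Individual sites may be unevenly loaded, or even empty on very small inputs.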