reproducibilityindex.ai

On Metric DBSCAN with Low Doubling Dimension

Authors: Hu Ding, Fan Yang, Mingyue Wang

IJCAI 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The experimental results show that our algorithms can significantly outperform the existing DBSCAN algorithms in terms of running time. Finally, we compare the experimental performances of our algorithms and several well-known baseline DBSCAN algorithms on both synthetic and real datasets.
Researcher Affiliation	Academia	The School of Computer Science and Technology, University of Science and Technology of China. huding@ustc.edu.cn, {yang208,mywang}@mail.ustc.edu.cn
Pseudocode	Yes	Algorithm 1 The Randomized Gonzalez’s algorithm; Algorithm 2 METRIC DBSCAN ALGORITHM
Open Source Code	No	The paper states that 'Our algorithms METRIC-1 and METRIC-2 are also implemented in C++', but does not provide any concrete access (link, explicit statement of release) to the source code for these implementations.
Open Datasets	Yes	NEURIPS [Perrone et al., 2017] contains n = 11463 word vectors of the full texts of the NeurIPS conference papers published in 1987-2015. USPSHW [Hull, 1994] contains n = 7291 16 × 16 pixel handwritten letter images. MNIST [LeCun et al., 1998] contains n = 10000 handwritten digit images from 0 to 9, where each image is represented by a 784-dimensional vector.
Dataset Splits	No	The paper does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, or testing.
Hardware Specification	Yes	All the experimental results were obtained on a Windows 10 workstation equipped with an Intel core i5-8400 processor and 8GB RAM.
Software Dependencies	No	The paper states that their algorithms are 'implemented in C++', but it does not provide specific version numbers for the C++ compiler or any libraries used.
Experiment Setup	Yes	We set z = 200 (i.e., 1%n) and vary the ratio r/ in 0-0.5. Further, we set the value MinPts = n/1000 and 2n/1000 for each dataset and show the running times in Figure 3.
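
The size-dependent settings quoted in the Experiment Setup row can be sketched as a small helper. This is only an illustration, not code from the paper: the function name `experiment_settings` is hypothetical, flooring to integers is an assumption (the excerpt does not say how fractional values are rounded), and n = 20000 in the usage line is inferred from the statement that z = 200 is 1% of n.

```python
def experiment_settings(n):
    """Derive the outlier count z and the two MinPts values from dataset size n.

    Assumptions (not stated in the quoted excerpt): values are floored to
    integers, and z = 1% of n applies wherever z is derived from n.
    """
    z = n // 100                                # z = 1% of n
    min_pts = [n // 1000, (2 * n) // 1000]      # MinPts = n/1000 and 2n/1000
    return {"z": z, "min_pts": min_pts}


if __name__ == "__main__":
    # n = 20000 is inferred from "z = 200 (i.e., 1%n)".
    print(experiment_settings(20000))
    # n = 10000 matches the MNIST subset size listed under Open Datasets.
    print(experiment_settings(10000))
```

For n = 20000 this gives z = 200 and MinPts values of 20 and 40; for the 10000-image MNIST subset the MinPts values are 10 and 20.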