On Metric DBSCAN with Low Doubling Dimension
Authors: Hu Ding, Fan Yang, Mingyue Wang
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results show that our algorithms can significantly outperform the existing DBSCAN algorithms in terms of running time. Finally, we compare the experimental performances of our algorithms and several well-known baseline DBSCAN algorithms on both synthetic and real datasets. |
| Researcher Affiliation | Academia | The School of Computer Science and Technology, University of Science and Technology of China; huding@ustc.edu.cn, {yang208,mywang}@mail.ustc.edu.cn |
| Pseudocode | Yes | Algorithm 1: The Randomized Gonzalez’s algorithm; Algorithm 2: METRIC DBSCAN ALGORITHM |
| Open Source Code | No | The paper states that 'Our algorithms METRIC-1 and METRIC-2 are also implemented in C++', but does not provide any concrete access (link, explicit statement of release) to the source code for these implementations. |
| Open Datasets | Yes | NEURIPS [Perrone et al., 2017] contains n = 11463 word vectors of the full texts of the NeurIPS conference papers published in 1987-2015. USPSHW [Hull, 1994] contains n = 7291 16 × 16 pixel handwritten letter images. MNIST [LeCun et al., 1998] contains n = 10000 handwritten digit images from 0 to 9, where each image is represented by a 784-dimensional vector. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, or testing. |
| Hardware Specification | Yes | All the experimental results were obtained on a Windows 10 workstation equipped with an Intel core i5-8400 processor and 8GB RAM. |
| Software Dependencies | No | The paper states that their algorithms are 'implemented in C++', but it does not provide specific version numbers for the C++ compiler or any libraries used. |
| Experiment Setup | Yes | We set z = 200 (i.e., 1%n) and vary the ratio r/ε in 0-0.5. Further, we set the value MinPts = n/1000 and 2n/1000 for each dataset and show the running times in Figure 3. |