Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Kernel Smoothing, Mean Shift, and Their Learning Theory with Directional Data
Authors: Yikun Zhang, Yen-Chi Chen
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To demonstrate the applicability of the algorithm, we evaluate it as a mode clustering method on both simulated and real-world data sets. [...] Simulation studies and applications to real-world data sets are unfolded in Section 6. |
| Researcher Affiliation | Academia | Yikun Zhang EMAIL Yen-Chi Chen EMAIL Department of Statistics University of Washington Seattle, WA 98195, USA |
| Pseudocode | Yes | Algorithm 1 Mean Shift Algorithm with Directional Data |
| Open Source Code | Yes | All the code for our experiments is available at https://github.com/zhangyk8/DirMS. |
| Open Datasets | Yes | Martian crater data are publicly available on the Gazetteer of Planetary Nomenclature database (https://planetarynames.wr.usgs.gov/AdvancedSearch) of the International Astronomical Union (IAU). [...] The earthquake data can be obtained from the Earthquake Catalog (https://earthquake.usgs.gov/earthquakes/search/) of the United States Geological Survey. |
| Dataset Splits | No | The paper describes generating '1000 data points' or using datasets with specified total counts (e.g., '1653 craters', '1666 earthquakes') for mode clustering, but does not provide specific training/test/validation splits. Mode clustering is generally applied to the entire dataset. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'Matplotlib Basemap Toolkit' but does not specify a version number. Other software dependencies or their versions are not provided. |
| Experiment Setup | Yes | Unless stated otherwise, we use the von Mises kernel $L(r) = e^{-r}$ in the directional KDE (2) to estimate the directional densities and their derivatives. [...] the default bandwidth parameter is selected via the rule of thumb in Proposition 2 in García-Portugués (2013) [...]. The estimated concentration parameter $\hat{\nu}$ is given by (4.4) in Banerjee et al. (2005) [...]. In addition, the tolerance level for terminating the algorithm is set to $\epsilon = 10^{-7}$. |
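
The experiment setup quoted above (von Mises kernel, tolerance $10^{-7}$) can be illustrated with a minimal sketch of a directional mean shift iteration on the unit sphere. This is not the authors' code from their repository. The function name `directional_mean_shift`, the fixed bandwidth `h = 0.3`, the `max_iter` cap, and the synthetic data are all assumptions made for illustration; the rule-of-thumb bandwidth selector mentioned in the table is not implemented here.

```python
import numpy as np


def directional_mean_shift(data, init, h, tol=1e-7, max_iter=1000):
    """Hypothetical sketch: fixed-point mean shift on the unit sphere.

    data : (n, d) array of unit vectors
    init : (d,) starting point (normalized internally)
    h    : bandwidth (assumed fixed here, not the paper's rule of thumb)
    tol  : stopping tolerance on the step size, 1e-7 as quoted in the table
    """
    x = init / np.linalg.norm(init)
    for _ in range(max_iter):
        # von Mises kernel L(r) = exp(-r) evaluated at r = (1 - x^T X_i) / h^2;
        # since -L'(r) = exp(-r), the same values serve as mean shift weights.
        weights = np.exp((data @ x - 1.0) / h**2)
        new_x = weights @ data              # weighted sum of the data points
        new_x /= np.linalg.norm(new_x)      # project back onto the sphere
        if np.linalg.norm(new_x - x) < tol:
            return new_x
        x = new_x
    return x


# Synthetic example (assumed data): points scattered around the north pole of S^2.
rng = np.random.default_rng(0)
points = np.array([0.0, 0.0, 1.0]) + 0.1 * rng.normal(size=(200, 3))
points /= np.linalg.norm(points, axis=1, keepdims=True)

mode = directional_mean_shift(points, np.array([1.0, 0.0, 1.0]), h=0.3)
print(mode)
```

For mode clustering, one would run this iteration from every data point and group points whose iterates converge to the same estimated mode; that grouping step is omitted from the sketch.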