Crystallization Learning with the Delaunay Triangulation

Authors: Jiaqi Gu, Guosheng Yin

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments on synthetic data under three different scenarios: (i) to illustrate the effectiveness of the crystallization learning in estimating the conditional expectation function µ(·); (ii) to evaluate the estimation accuracy of our approach in comparison with existing nonparametric regression methods, including the k-NN regression using the Euclidean distance, local linear regression using a Gaussian kernel, multivariate kernel regression using a Gaussian kernel (Hein, 2009), and Gaussian process models; and (iii) to validate the proposed data-driven procedure for selection of the hyperparameter L. We also apply our method to real data to investigate its empirical performance. (The baseline methods are sketched in the third example below the table.)
Researcher Affiliation | Academia | Department of Statistics and Actuarial Science, University of Hong Kong, Hong Kong SAR.
Pseudocode | Yes | Algorithm 1: DELAUNAYSPARSE (Chang et al., 2020). (A hedged neighborhood-growing sketch appears as the first example below the table.)
Open Source Code | No | The paper does not provide any explicit statement about releasing source code, nor a link to a code repository for the described methodology.
Open Datasets | Yes | We apply the crystallization learning to several real datasets from the UCI repository. The critical assessment of protein structure prediction (CASP) dataset... The Concrete dataset... Parkinson's telemonitoring dataset...
Dataset Splits | Yes | For each dataset, we take 100 bootstrap samples without replacement of size n (n = 200, 500, 1000 or 2000) for training and 100 bootstrap samples of size 100 for testing. Similar to many other machine learning methods, the statistical complexity and estimation performance of the crystallization learning are controlled by the hyperparameter L, the maximal topological distance from the generated neighbor Delaunay simplices to S(z). Because a small L leads to overfitting and a large L makes µ̂(·) overly smooth, we propose adopting leave-one-out cross-validation (LOO-CV) to select L with respect to the target point z. (A generic LOO-CV loop is sketched in the second example below the table.)
Hardware Specification | No | The paper reports runtime measurements in Table 1 but does not specify the hardware used for the experiments, such as CPU or GPU models or memory size.
Software Dependencies | No | The paper mentions various regression methods and the DELAUNAYSPARSE algorithm, but it does not name any software dependencies with version numbers (e.g., Python, PyTorch, or scikit-learn versions) that would be required for reproduction.
Experiment Setup | Yes | We implement the crystallization learning with L = 3 for d = 5, 10 and L = 2 for d = 20, 50, and obtain µ̂(z_1), ..., µ̂(z_100). We implement the k-NN regression with k = 5, 10, k*, where k* equals the size of V_{z,L}, and the local linear regression and kernel regression with bandwidth h = 1. (The baseline setup is sketched in the third example below the table.)
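
To make the Pseudocode row concrete: the paper locates the Delaunay simplex S(z) containing a query point z with DELAUNAYSPARSE (Chang et al., 2020) and then grows the neighborhood out to topological distance L. The sketch below is not that algorithm: it uses scipy's Qhull-based Delaunay, which builds the full triangulation (unlike DELAUNAYSPARSE), and it assumes face adjacency as the notion of topological distance, which may differ in detail from the paper's definition. The function name simplices_within_L is ours.

```python
import numpy as np
from scipy.spatial import Delaunay

def simplices_within_L(tri, z, L):
    # Index of the simplex S(z) containing the query point z;
    # find_simplex returns -1 when z falls outside the convex hull.
    s0 = int(tri.find_simplex(np.atleast_2d(z))[0])
    if s0 == -1:
        raise ValueError("query point lies outside the convex hull")
    frontier, visited = {s0}, {s0}
    for _ in range(L):  # grow the "crystal" one topological layer at a time
        nxt = set()
        for s in frontier:
            for nb in tri.neighbors[s]:  # face-adjacent simplices (-1 = none)
                if nb != -1 and nb not in visited:
                    nxt.add(int(nb))
        visited |= nxt
        frontier = nxt
    return sorted(visited)

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))         # covariates in R^2
tri = Delaunay(X)
ids = simplices_within_L(tri, np.array([0.5, 0.5]), L=2)
V_zL = np.unique(tri.simplices[ids])   # vertex set V_{z,L} of the neighborhood
```

The vertex set collected on the last line corresponds to the V_{z,L} referenced in the Experiment Setup row.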
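
For the Dataset Splits row, the quoted passage proposes LOO-CV to select L with respect to the target point z, but the exact criterion is not reproduced in the excerpt. The loop below is therefore only a generic LOO-CV grid search: fit_predict is a hypothetical stand-in for the crystallization estimator, and a faithful version would localize the left-out points around z rather than averaging over all of them.

```python
import numpy as np

def select_L_loocv(X, y, candidate_Ls, fit_predict):
    # fit_predict(X_tr, y_tr, x0, L) is a hypothetical stand-in for the
    # crystallization estimator muhat(x0) trained on (X_tr, y_tr) with
    # neighborhood size L.
    n = len(X)
    cv_err = []
    for L in candidate_Ls:
        sq = 0.0
        for i in range(n):                  # leave observation i out
            mask = np.arange(n) != i
            pred = fit_predict(X[mask], y[mask], X[i], L)
            sq += (pred - y[i]) ** 2
        cv_err.append(sq / n)               # mean squared LOO error for this L
    return candidate_Ls[int(np.argmin(cv_err))]
```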
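
Finally, the baselines named in the Research Type and Experiment Setup rows (k-NN with k = 5, 10, Gaussian process regression, and Gaussian-kernel regression with bandwidth h = 1) can be assembled roughly as follows. The data here are a made-up synthetic stand-in, since the paper's simulation designs are not reproduced in the excerpt; scikit-learn has no built-in Nadaraya-Watson estimator, so nadaraya_watson is our own small implementation, and the local linear baseline (a weighted least-squares fit per query point) is omitted.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.gaussian_process import GaussianProcessRegressor

# Hypothetical synthetic data standing in for the paper's designs.
rng = np.random.default_rng(0)
X_train = rng.uniform(size=(500, 5))
y_train = X_train.sum(axis=1) + rng.normal(scale=0.1, size=500)

# k-NN baselines with k = 5 and k = 10; k* = |V_{z,L}| varies per query
# point, so it would have to be recomputed for each z.
knn5 = KNeighborsRegressor(n_neighbors=5).fit(X_train, y_train)
knn10 = KNeighborsRegressor(n_neighbors=10).fit(X_train, y_train)
gp = GaussianProcessRegressor().fit(X_train, y_train)

def nadaraya_watson(X_tr, y_tr, Z, h=1.0):
    # Multivariate kernel regression with a Gaussian kernel and
    # bandwidth h = 1, matching the quoted setup: a kernel-weighted
    # mean of the training responses around each query point.
    d2 = ((Z[:, None, :] - X_tr[None, :, :]) ** 2).sum(axis=-1)
    w = np.exp(-0.5 * d2 / h ** 2)
    return (w @ y_tr) / w.sum(axis=1)

Z = rng.uniform(size=(10, 5))              # query points
preds = nadaraya_watson(X_train, y_train, Z, h=1.0)
```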