Crystallization Learning with the Delaunay Triangulation
Authors: Jiaqi Gu, Guosheng Yin
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on synthetic data under three scenarios: (i) to illustrate the effectiveness of the crystallization learning in estimating the conditional expectation function µ(·); (ii) to evaluate the estimation accuracy of our approach in comparison with existing nonparametric regression methods, including k-NN regression using the Euclidean distance, local linear regression using a Gaussian kernel, multivariate kernel regression using a Gaussian kernel (Hein, 2009), and Gaussian process models; and (iii) to validate the proposed data-driven procedure for selecting the hyperparameter L. We also apply our method to real data to investigate its empirical performance. |
| Researcher Affiliation | Academia | 1Department of Statistics and Actuarial Science, University of Hong Kong, Hong Kong SAR. |
| Pseudocode | Yes | Algorithm 1 DELAUNAYSPARSE (Chang et al., 2020); an illustrative simplex-location sketch follows the table. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository for the described methodology. |
| Open Datasets | Yes | We apply the crystallization learning to several real datasets from the UCI repository. The critical assessment of protein structure prediction (CASP) dataset... The Concrete dataset... Parkinson's telemonitoring dataset... |
| Dataset Splits | Yes | For each dataset, we take 100 bootstrap samples without replacement of size n (n = 200, 500, 1000 or 2000) for training and 100 bootstrap samples of size 100 for testing. Similar to many other machine learning methods, the statistical complexity and estimation performance of the crystallization learning are controlled by the hyperparameter L, the maximal topological distance from the generated neighbor Delaunay simplices to S(z). Because a small L leads to overfitting and a large L makes µ̂(·) overly smooth, we propose adopting leave-one-out cross-validation (LOO-CV) to select L with respect to the target point z; sketches of the resampling protocol and of a generic LOO-CV selection follow the table. |
| Hardware Specification | No | The paper provides runtime measurements in Table 1 but does not specify any particular hardware components like CPU or GPU models, or memory details used for the experiments. |
| Software Dependencies | No | The paper mentions various regression methods and the DELAUNAYSPARSE algorithm, but it does not specify any software names with version numbers (e.g., Python, PyTorch, scikit-learn versions) required for reproduction. |
| Experiment Setup | Yes | We implement the crystallization learning with L = 3 for d = 5, 10 and L = 2 for d = 20, 50, and obtain µ̂(z₁), …, µ̂(z₁₀₀). We implement the k-NN regression with k = 5, 10, k*, where k* equals the size of V_{z,L}, and the local linear regression and kernel regression with bandwidth h = 1; a baseline-fit sketch follows the table. |
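
The Pseudocode row cites DELAUNAYSPARSE, which locates the Delaunay simplex containing a query point without materializing the full triangulation. The sketch below is illustrative only: it uses SciPy's Qhull-backed `Delaunay` (which does build the full triangulation) as a stand-in to find the containing simplex S(z) and interpolate linearly over it; the data, dimensions, and interpolation step are assumptions, not the paper's code.

```python
# Illustrative only: locate the Delaunay simplex S(z) containing a query
# point z and interpolate linearly over its vertices. DELAUNAYSPARSE
# (Chang et al., 2020) finds S(z) without building the full triangulation;
# SciPy's Qhull-backed Delaunay, used here, builds the whole thing.
import numpy as np
from scipy.spatial import Delaunay

rng = np.random.default_rng(0)
d, n = 5, 200
X = rng.standard_normal((n, d))                   # toy covariates
y = X.sum(axis=1) + 0.1 * rng.standard_normal(n)  # toy responses
z = X.mean(axis=0)                                # query point inside the hull

tri = Delaunay(X)
idx = int(tri.find_simplex(z.reshape(1, -1))[0])  # -1 if z is outside the hull
if idx >= 0:
    vertex_ids = tri.simplices[idx]               # the d + 1 vertices of S(z)
    T_inv = tri.transform[idx, :d]                # inverse affine map of S(z)
    r = tri.transform[idx, d]                     # reference vertex of S(z)
    b = T_inv @ (z - r)
    bary = np.append(b, 1.0 - b.sum())            # barycentric coordinates of z
    mu_hat = bary @ y[vertex_ids]                 # linear interpolation over S(z)
```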
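
The Dataset Splits row describes drawing 100 training subsamples of size n without replacement and 100 test subsamples of size 100. Below is a minimal sketch of that resampling protocol; the quote does not say whether a test subsample excludes its paired training indices, so the two are drawn independently here as an assumption.

```python
# Sketch of the quoted resampling protocol: 100 training subsamples of size
# n drawn without replacement and 100 test subsamples of size 100. Drawing
# the two independently is an assumption; the quote does not say whether
# test indices exclude the paired training indices.
import numpy as np

def make_splits(n_total, n_train, n_reps=100, n_test=100, seed=0):
    rng = np.random.default_rng(seed)
    return [
        (rng.choice(n_total, size=n_train, replace=False),  # training indices
         rng.choice(n_total, size=n_test, replace=False))   # test indices
        for _ in range(n_reps)
    ]

splits = make_splits(n_total=5000, n_train=1000)  # e.g. n = 1000
```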
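
The same row proposes LOO-CV for choosing L. The sketch below is generic and rests on two assumptions: `predict` is a hypothetical stand-in for the crystallization estimator µ̂(·), and the leave-one-out error is averaged over all training points, whereas the paper's procedure is taken with respect to the target point z.

```python
# Generic LOO-CV over candidate values of L. `predict(X, y, x0, L)` is a
# hypothetical stand-in for the crystallization estimator; the paper's
# procedure is localized to the target point z, whereas this sketch
# averages the leave-one-out error over all training points.
import numpy as np

def select_L(X, y, predict, candidates=(1, 2, 3, 4)):
    n = len(y)
    cv_error = {}
    for L in candidates:
        sq_errors = []
        for i in range(n):
            keep = np.arange(n) != i                    # leave point i out
            y_hat = predict(X[keep], y[keep], X[i], L)  # refit, predict at x_i
            sq_errors.append((y[i] - y_hat) ** 2)
        cv_error[L] = float(np.mean(sq_errors))
    return min(cv_error, key=cv_error.get)              # smallest CV error wins
```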
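
The Experiment Setup row lists the comparator fits. A minimal sketch of two of them on assumed toy data: k-NN regression with k = 5 and 10 via scikit-learn, and Nadaraya-Watson regression with a Gaussian kernel of bandwidth h = 1. The local linear regression, Gaussian process, and k* = |V_{z,L}| variants are omitted.

```python
# Sketch of two quoted baselines on toy data: k-NN regression with k = 5
# and 10 (scikit-learn) and Nadaraya-Watson regression with a Gaussian
# kernel of bandwidth h = 1. Data and dimensions are illustrative.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X_train = rng.standard_normal((500, 5))
y_train = X_train.sum(axis=1) + 0.1 * rng.standard_normal(500)
z = rng.standard_normal((1, 5))                   # query point

knn5 = KNeighborsRegressor(n_neighbors=5).fit(X_train, y_train)
knn10 = KNeighborsRegressor(n_neighbors=10).fit(X_train, y_train)

def nw_predict(X, y, x0, h=1.0):
    """Nadaraya-Watson estimate at x0 with a Gaussian kernel of bandwidth h."""
    w = np.exp(-np.sum((X - x0) ** 2, axis=1) / (2.0 * h ** 2))
    return np.sum(w * y) / np.sum(w)

print(knn5.predict(z), knn10.predict(z), nw_predict(X_train, y_train, z[0]))
```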