Approximate Stein Classes for Truncated Density Estimation
Authors: Daniel James Williams, Song Liu
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To show the validity of our proposed method, we experiment on benchmark settings against Trunc SM and an adaptation of bd-KSD for truncated density estimation. We also provide additional empirical experiments in the appendices: empirical consistency (Appendix B.3), a demonstration on the Gaussian mixture distribution (Appendix B.4), an implementation for truncated regression (Appendix B.5) and an investigation into the effect of the distribution of the boundary points (Appendix B.6). |
| Researcher Affiliation | Academia | Daniel J. Williams 1 Song Liu 1 1School of Mathematics, University of Bristol, UK. Correspondence to: Daniel J. Williams <daniel.williams@bristol.ac.uk>. |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | Yes | All results in this paper can be reproduced using the GitHub repository located at https://github.com/dannyjameswilliams/tksd. |
| Open Datasets | Yes | We also experiment on a real-world dataset given by UCLA: Statistical Consulting Group (Example 1). This dataset contains student test scores in a school for which the acceptance threshold is 40 out of 100, and therefore the response variable (the test scores) is truncated below by 40 and above by 100. |
| Dataset Splits | No | The paper describes data simulation and truncation to acquire a certain number of data points for estimation, but it does not specify any training, validation, or test dataset splits or cross-validation setups. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions implementation details like using 'Python' and methods from 'Jitkrittum et al. (2017) for fast computation', but it does not provide specific version numbers for Python or any libraries/frameworks (e.g., PyTorch, TensorFlow, NumPy, SciPy, scikit-learn). |
| Experiment Setup | Yes | TKSD requires the selection of hyperparameters: the number of boundary points, m, the choice of kernel function, k, and the corresponding kernel hyperparameters. For this work, we focus on the Gaussian kernel, k(x, y) = exp{−(2σ²)⁻¹‖x − y‖²}, and the bandwidth parameter σ is chosen heuristically as the median of pairwise distances on the data matrix. In choosing m, there is a trade-off between accuracy and computational expense, since K in (24) is an m × m matrix which requires inversion. In experiments, we let m scale with d². |
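The experiment-setup excerpt above describes the Gaussian kernel together with the median heuristic for choosing the bandwidth σ. The following is a minimal NumPy sketch of that heuristic, not the authors' implementation (function names here are illustrative, not from the TKSD repository):

```python
import numpy as np

def median_heuristic_bandwidth(X):
    """Median of pairwise Euclidean distances between rows of X (n x d)."""
    # Squared distances via the expansion ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    sq_dists = np.maximum(sq_dists, 0.0)   # guard against tiny negative values
    iu = np.triu_indices(len(X), k=1)      # each unordered pair once
    return np.median(np.sqrt(sq_dists[iu]))

def gaussian_kernel(x, y, sigma):
    """Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))
```

With the bandwidth fixed this way, only m (the number of boundary points) remains to be chosen, which the paper scales with d².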