Approximate Stein Classes for Truncated Density Estimation

Authors: Daniel James Williams, Song Liu

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To show the validity of our proposed method, we experiment on benchmark settings against TruncSM and an adaptation of bd-KSD for truncated density estimation. We also provide additional empirical experiments in the appendices: empirical consistency (Appendix B.3), a demonstration on the Gaussian mixture distribution (Appendix B.4), an implementation for truncated regression (Appendix B.5) and an investigation into the effect of the distribution of the boundary points (Appendix B.6).
Researcher Affiliation | Academia | Daniel J. Williams and Song Liu, School of Mathematics, University of Bristol, UK. Correspondence to: Daniel J. Williams <daniel.williams@bristol.ac.uk>.
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | Yes | All results in this paper can be reproduced using the GitHub repository located at https://github.com/dannyjameswilliams/tksd.
Open Datasets | Yes | We also experiment on a real-world dataset given by UCLA: Statistical Consulting Group (Example 1). This dataset contains student test scores in a school for which the acceptance threshold is 40 out of 100, and therefore the response variable (the test scores) is truncated below by 40 and above by 100.
Dataset Splits | No | The paper describes data simulation and truncation to acquire a certain number of data points for estimation, but it does not specify any training, validation, or test dataset splits or cross-validation setups.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions implementation details like using 'Python' and methods from 'Jitkrittum et al. (2017) for fast computation', but it does not provide specific version numbers for Python or any libraries/frameworks (e.g., PyTorch, TensorFlow, NumPy, SciPy, scikit-learn).
Experiment Setup | Yes | TKSD requires the selection of hyperparameters: the number of boundary points, m, the choice of kernel function, k, and the corresponding kernel hyperparameters. For this work, we focus on the Gaussian kernel, k(x, y) = exp{−‖x − y‖² / (2σ²)}, and the bandwidth parameter σ is chosen heuristically as the median of pairwise distances on the data matrix. In choosing m, there is a trade-off between accuracy and computational expense, since K in (24) is an m × m matrix which requires inversion. In experiments, we let m scale with d².
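The Gaussian kernel and median-heuristic bandwidth quoted above can be sketched as follows. This is an illustrative NumPy sketch, not code from the authors' repository; the function names are our own.

```python
import numpy as np

def median_heuristic_bandwidth(X):
    """Median of pairwise Euclidean distances on the data matrix X (n, d)."""
    diffs = X[:, None, :] - X[None, :, :]        # (n, n, d) pairwise differences
    dists = np.sqrt((diffs ** 2).sum(axis=-1))   # (n, n) Euclidean distances
    # Take the median over distinct pairs (strict upper triangle).
    return np.median(dists[np.triu_indices_from(dists, k=1)])

def gaussian_kernel(X, Y, sigma):
    """Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

# Usage on simulated data: bandwidth from the median heuristic, then the Gram matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
sigma = median_heuristic_bandwidth(X)
K = gaussian_kernel(X, X, sigma)
```

The resulting Gram matrix is symmetric with ones on the diagonal, since k(x, x) = exp(0) = 1.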