Towards Practical Mean Bounds for Small Samples
Authors: My Phan, Philip Thomas, Erik Learned-Miller
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In simulations, we show that for many distributions, the gain over Anderson's bound is substantial. 5. Simulations: We perform simulations to compare our bounds to Hoeffding's inequality, Anderson's bound, Maurer and Pontil's, and Student-t's bound (Student, 1908), the latter being |
| Researcher Affiliation | Academia | My Phan¹, Philip S. Thomas¹, Erik Learned-Miller¹. ¹College of Information and Computer Sciences, University of Massachusetts, Amherst, MA, USA. Correspondence to: My Phan <myphan@cs.umass.edu>. |
| Pseudocode | Yes | Algorithm 1: Monte Carlo estimation of $m^{\alpha}_{D^+,T}(x)$ where $D^+ = [0, 1]$. This pseudocode uses 1-based array indexing. |
| Open Source Code | Yes | Code accompanying this paper is available at https://github.com/myphan9/small_sample_mean_bounds. |
| Open Datasets | No | We perform experiments on three distributions: beta(1, 5) (skewed right), uniform(0, 1) and beta(5, 1) (skewed left). Their PDFs are included in the supplementary material for reference. The paper uses synthetic data generated from these distributions, not pre-existing publicly available datasets that require a specific link or citation for access. |
| Dataset Splits | No | The paper conducts simulations by sampling from specified distributions (beta(1,5), uniform(0,1), beta(5,1)) for various sample sizes, but does not mention traditional train/validation/test dataset splits as it's not training a model on a fixed dataset. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, or memory specifications) used for running the simulations. |
| Software Dependencies | No | The paper describes algorithms and statistical methods, but does not specify any software dependencies or libraries with version numbers (e.g., Python, PyTorch, SciPy versions) used for implementation or simulation. |
| Experiment Setup | Yes | We use $\alpha = 0.05$, $D^+ = [0, 1]$ and $l = 10{,}000$ Monte Carlo samples. We consider two functions $T$: 1. Anderson: $T(x) = b^{\alpha,\text{Anderson}}_{\ell}(x)$, again with $\ell = u^{\text{And}}$. Because this $T$ is linear in $x$, it can be computed with the linear program in Eq. 42. 2. l2 norm: $T(x) = \left(\sum_{i=1}^{n} x_i^2\right)/n$. |
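
The setup in the table can be illustrated with a minimal Python sketch. It draws samples from the three test distributions, evaluates the l2-norm statistic $T(x)$, and computes the standard one-sided Hoeffding upper bound for samples in [0, 1], which is one of the baselines named above. This is not the authors' implementation (see the linked repository for that): the function names, the seed, the sample size of 10, and the clipping of the bound at 1 are all assumptions made for illustration, and the paper's own bound is not reproduced here.

```python
import math
import random

ALPHA = 0.05  # significance level from the Experiment Setup row


def l2_statistic(x):
    """T(x) = (sum_i x_i^2) / n, the 'l2 norm' choice of T."""
    return sum(v * v for v in x) / len(x)


def hoeffding_upper_bound(x, alpha=ALPHA):
    """One-sided (1 - alpha) Hoeffding upper bound on the mean, for x_i in [0, 1].

    Clipping at 1 is an illustrative choice: the mean of a distribution
    supported on [0, 1] can never exceed 1.
    """
    n = len(x)
    sample_mean = sum(x) / n
    return min(1.0, sample_mean + math.sqrt(math.log(1.0 / alpha) / (2.0 * n)))


# The three distributions used in the simulations (hypothetical helper table).
SAMPLERS = {
    "beta(1, 5)": lambda rng: rng.betavariate(1, 5),
    "uniform(0, 1)": lambda rng: rng.random(),
    "beta(5, 1)": lambda rng: rng.betavariate(5, 1),
}


def draw(name, n, seed=0):
    """Draw n i.i.d. samples from the named distribution (seed is an assumption)."""
    rng = random.Random(seed)
    return [SAMPLERS[name](rng) for _ in range(n)]


if __name__ == "__main__":
    for name in SAMPLERS:
        x = draw(name, n=10)
        print(name, "T(x) =", round(l2_statistic(x), 4),
              "Hoeffding upper bound =", round(hoeffding_upper_bound(x), 4))
```

With small n the Hoeffding term dominates (about 0.387 at n = 10 and α = 0.05), which is the regime where the paper reports gains from tighter bounds.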