Exponentiated Strongly Rayleigh Distributions

Authors: Zelda E. Mariet, Suvrit Sra, Stefanie Jegelka

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We illustrate some of the potential of ESRs, by applying them to a few machine learning problems; empirical results confirm that beyond their theoretical appeal, ESR-based models hold significant promise for these tasks. An empirical evaluation of ESR measures on various machine learning tasks, showing that ESR measures outperform standard SR models on several problems requiring a delicate balance of subset quality and diversity. We verified empirically that ESR measures and the algorithms we derive are valuable modeling tools for machine learning tasks, such as outlier detection and kernel reconstruction.
Researcher Affiliation | Academia | Zelda Mariet, Massachusetts Institute of Technology (zelda@csail.mit.edu); Suvrit Sra, Massachusetts Institute of Technology (suvrit@mit.edu); Stefanie Jegelka, Massachusetts Institute of Technology (stefje@csail.mit.edu)
Pseudocode | Yes | Algorithm 1: Proposal-based sampling. Algorithm 2: Swap-chain sampling. (A hedged sampler sketch follows the table.)
Open Source Code | No | The paper does not contain an explicit statement about releasing source code or provide a link to a code repository.
Open Datasets | Yes | We detect outliers on three public datasets: the UCI Breast Cancer Wisconsin dataset [46], modified as in [24, 28], as well as the Letter and Speech datasets from [39]. We apply Kernel Ridge Regression to 3 regression datasets: Ailerons, Bank32NH, and Machine CPU (http://www.dcc.fc.up.pt/~ltorgo/Regression/DataSets.html).
Dataset Splits | Yes | We subsample 4,000 points from each dataset (3,000 training and 1,000 test) and use an RBF kernel and choose the bandwidth β and regularization parameter λ for each dataset by 10-fold cross-validation.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments; it indicates that the authors ran experiments but gives no machine specifications.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used (e.g., Python, PyTorch, scikit-learn versions).
Experiment Setup | Yes | Choose the bandwidth β and regularization parameter λ for each dataset by 10-fold cross-validation. Results are averaged over 3 random subsets of data, using the swap-chain sampler initialized with k-means++ and run for 3000 iterations. (A hedged sketch of this setup follows the table.)
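The Pseudocode row refers to the paper's two samplers. As a rough illustration of how a swap-chain sampler for an exponentiated SR measure can look, the sketch below runs a Metropolis swap chain over size-k subsets, taking a DPP (μ(S) ∝ det(L_S)) as the base SR measure raised to an exponent p. The kernel, the uniform random initialization, and the acceptance rule are illustrative assumptions, not the paper's exact Algorithm 2.

```python
# Sketch of a swap-chain sampler for an exponentiated DPP (a DPP is one example
# of a Strongly Rayleigh measure). Assumes mu_p(S) is proportional to det(L_S)^p
# for a PSD kernel L and exponent p; the paper's Algorithm 2 may differ in its
# exact proposal and acceptance rule.
import numpy as np

def log_prob(L, S, p):
    """Unnormalized log-probability of subset S under det(L_S)^p."""
    sign, logdet = np.linalg.slogdet(L[np.ix_(S, S)])
    return p * logdet if sign > 0 else -np.inf

def swap_chain_sample(L, k, p, n_iters=3000, rng=None):
    """Run a Metropolis swap chain over size-k subsets of {0, ..., n-1}."""
    rng = np.random.default_rng(rng)
    n = L.shape[0]
    S = list(rng.choice(n, size=k, replace=False))  # random init (the paper's experiments use k-means++)
    cur = log_prob(L, S, p)
    for _ in range(n_iters):
        i = rng.integers(k)                            # position to swap out
        j = rng.choice(list(set(range(n)) - set(S)))   # element to swap in
        S_new = S.copy()
        S_new[i] = j
        new = log_prob(L, S_new, p)
        if np.log(rng.random()) < new - cur:           # Metropolis acceptance for the symmetric swap proposal
            S, cur = S_new, new
    return sorted(S)

# Example usage on a random PSD kernel:
X = np.random.default_rng(0).normal(size=(50, 5))
L = X @ X.T + 1e-6 * np.eye(50)
print(swap_chain_sample(L, k=5, p=0.5, n_iters=1000, rng=0))
```

Lazy variants that stay put with probability 1/2 are common in the SR sampling literature for mixing guarantees; note also that the Experiment Setup row initializes the chain with k-means++ rather than a uniformly random subset.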
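The Dataset Splits and Experiment Setup rows describe kernel ridge regression with an RBF kernel, with bandwidth β and regularization λ chosen by 10-fold cross-validation on a 3,000/1,000 train/test split. Below is a minimal sketch of that setup using scikit-learn and synthetic placeholder data; the parameter grids and feature dimensions are assumptions, since the excerpt does not give them.

```python
# Hedged sketch of the kernel ridge regression setup: RBF kernel, with the
# bandwidth (gamma, playing the role of beta) and regularization (alpha, playing
# the role of lambda) chosen by 10-fold cross-validation.
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(4000, 8))                      # placeholder for e.g. Ailerons features
y = X @ rng.normal(size=8) + 0.1 * rng.normal(size=4000)

# 3,000 training / 1,000 test points, as in the Dataset Splits row.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=3000, test_size=1000,
                                           random_state=0)

param_grid = {"alpha": np.logspace(-6, 0, 7),       # regularization (lambda), assumed grid
              "gamma": np.logspace(-3, 1, 5)}       # RBF bandwidth parameter, assumed grid
search = GridSearchCV(KernelRidge(kernel="rbf"), param_grid, cv=10,
                      scoring="neg_mean_squared_error")
search.fit(X_tr, y_tr)
print("selected:", search.best_params_)
print("test MSE:", np.mean((search.predict(X_te) - y_te) ** 2))
```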