Exponentiated Strongly Rayleigh Distributions

Authors: Zelda E. Mariet, Suvrit Sra, Stefanie Jegelka

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We illustrate some of the potential of ESRs, by applying them to a few machine learning problems; empirical results confirm that beyond their theoretical appeal, ESR-based models hold significant promise for these tasks. An empirical evaluation of ESR measures on various machine learning tasks, showing that ESR measures outperform standard SR models on several problems requiring a delicate balance of subset quality and diversity. We verified empirically that ESR measures and the algorithms we derive are valuable modeling tools for machine learning tasks, such as outlier detection and kernel reconstruction.
Researcher Affiliation | Academia | Zelda Mariet, Massachusetts Institute of Technology (zelda@csail.mit.edu); Suvrit Sra, Massachusetts Institute of Technology (suvrit@mit.edu); Stefanie Jegelka, Massachusetts Institute of Technology (stefje@csail.mit.edu)
Pseudocode | Yes | Algorithm 1: Proposal-based sampling. Algorithm 2: Swap-chain sampling. (A hedged sampler sketch follows the table.)
Open Source Code | No | The paper does not contain an explicit statement about releasing source code or provide a link to a code repository.
Open Datasets | Yes | We detect outliers on three public datasets: the UCI Breast Cancer Wisconsin dataset [46], modified as in [24, 28], as well as the Letter and Speech datasets from [39]. We apply Kernel Ridge Regression to 3 regression datasets: Ailerons, Bank32NH, and Machine CPU (http://www.dcc.fc.up.pt/~ltorgo/Regression/DataSets.html).
Dataset Splits | Yes | We subsample 4,000 points from each dataset (3,000 training and 1,000 test) and use an RBF kernel and choose the bandwidth β and regularization parameter λ for each dataset by 10-fold cross-validation.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments; it indicates that the authors ran experiments but gives no machine specifications.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used (e.g., Python, PyTorch, scikit-learn versions).
Experiment Setup | Yes | Choose the bandwidth β and regularization parameter λ for each dataset by 10-fold cross-validation. Results are averaged over 3 random subsets of data, using the swap-chain sampler initialized with k-means++ and run for 3000 iterations. (A hedged sketch of this setup follows the table.)
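The Pseudocode row refers to the paper's two samplers. As a rough illustration of how a swap-chain sampler for an exponentiated SR measure can look, the sketch below runs a Metropolis swap chain over size-k subsets, taking a DPP (μ(S) ∝ det(L_S)) as the base SR measure raised to an exponent p. The kernel, the uniform random initialization, and the acceptance rule are illustrative assumptions, not the paper's exact Algorithm 2.

```python
# Sketch of a swap-chain sampler for an exponentiated DPP (a DPP is one example
# of a Strongly Rayleigh measure). Assumes mu_p(S) is proportional to det(L_S)^p
# for a PSD kernel L and exponent p; the paper's Algorithm 2 may differ in its
# exact proposal and acceptance rule.
import numpy as np

def log_prob(L, S, p):
    """Unnormalized log-probability of subset S under det(L_S)^p."""
    sign, logdet = np.linalg.slogdet(L[np.ix_(S, S)])
    return p * logdet if sign > 0 else -np.inf

def swap_chain_sample(L, k, p, n_iters=3000, rng=None):
    """Run a Metropolis swap chain over size-k subsets of {0, ..., n-1}."""
    rng = np.random.default_rng(rng)
    n = L.shape[0]
    S = list(rng.choice(n, size=k, replace=False))  # random init (the paper's experiments use k-means++)
    cur = log_prob(L, S, p)
    for _ in range(n_iters):
        i = rng.integers(k)                            # position to swap out
        j = rng.choice(list(set(range(n)) - set(S)))   # element to swap in
        S_new = S.copy()
        S_new[i] = j
        new = log_prob(L, S_new, p)
        if np.log(rng.random()) < new - cur:           # Metropolis acceptance for the symmetric swap proposal
            S, cur = S_new, new
    return sorted(S)

# Example usage on a random PSD kernel:
X = np.random.default_rng(0).normal(size=(50, 5))
L = X @ X.T + 1e-6 * np.eye(50)
print(swap_chain_sample(L, k=5, p=0.5, n_iters=1000, rng=0))
```

Lazy variants that stay put with probability 1/2 are common in the SR sampling literature for mixing guarantees; note also that the Experiment Setup row initializes the chain with k-means++ rather than a uniformly random subset.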
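The Dataset Splits and Experiment Setup rows describe kernel ridge regression with an RBF kernel, with bandwidth β and regularization λ chosen by 10-fold cross-validation on a 3,000/1,000 train/test split. Below is a minimal sketch of that setup using scikit-learn and synthetic placeholder data; the parameter grids and feature dimensions are assumptions, since the excerpt does not give them.

```python
# Hedged sketch of the kernel ridge regression setup: RBF kernel, with the
# bandwidth (gamma, playing the role of beta) and regularization (alpha, playing
# the role of lambda) chosen by 10-fold cross-validation.
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(4000, 8))                      # placeholder for e.g. Ailerons features
y = X @ rng.normal(size=8) + 0.1 * rng.normal(size=4000)

# 3,000 training / 1,000 test points, as in the Dataset Splits row.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=3000, test_size=1000,
                                           random_state=0)

param_grid = {"alpha": np.logspace(-6, 0, 7),       # regularization (lambda), assumed grid
              "gamma": np.logspace(-3, 1, 5)}       # RBF bandwidth parameter, assumed grid
search = GridSearchCV(KernelRidge(kernel="rbf"), param_grid, cv=10,
                      scoring="neg_mean_squared_error")
search.fit(X_tr, y_tr)
print("selected:", search.best_params_)
print("test MSE:", np.mean((search.predict(X_te) - y_te) ** 2))
```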