Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026].

Reparameterizable Subset Sampling via Continuous Relaxations

Authors: Sang Michael Xie, Stefano Ermon

IJCAI 2019

Reproducibility Variable Result LLM Response
Research Type Experimental 4 Experiments 4.1 Synthetic Experiments We check that generating samples from Algorithm 2 results in samples approximately from p(S|w). We define a subset distribution using weights w = [0.1, 0.2, 0.3, 0.4] and take subset size k = 2. Using Algorithm 2 and the top-k relaxation from [Plötz and Roth, 2018], we sample 10000 relaxed k-hot samples for each temperature in {0.1, 1, 10} and take the top-k values in the relaxed k-hot vector as the chosen subset. We plot the empirical histogram of subset occurrences and compare with the true probabilities from the subset distribution, with 95% confidence intervals. The relaxation produces subset samples with empirical distribution within 0.016 in total variation distance of p(S|w) for all t. This agrees with Theorem 1, which states that even for higher temperatures, taking the top-k values in the relaxed k-hot vector should produce true samples from (2).
Researcher Affiliation Academia Sang Michael Xie and Stefano Ermon, Stanford University, EMAIL
Pseudocode Yes Algorithm 1 Weighted Reservoir Sampling (non-streaming) Input: Items x_1, ..., x_n, weights w = [w_1, ..., w_n], reservoir size k Output: S_wrs = [e_{i_1}, ..., e_{i_k}], a sample from p(S_wrs|w) 1: r ← [ ] 2: for i ← 1 to n do 3: u_i ← Uniform(0, 1) 4: r_i ← u_i^(1/w_i) # Sample random keys 5: r.append(r_i) 6: end for 7: [e_{i_1}, ..., e_{i_k}] ← TopK(r, k) 8: return [e_{i_1}, ..., e_{i_k}]
Open Source Code Yes Code available at https://github.com/ermongroup/subsets.
Open Datasets Yes We test our results on the Large Movie Review Dataset (IMDB) for sentiment classification [Maas et al., 2011]... We compare with parametric t-SNE [van der Maaten, 2009] on the MNIST [LeCun and Cortes, 2010] and a small version of the 20 Newsgroups dataset [Roweis, 2009].
Dataset Splits Yes We use cross validation to choose temperatures t ∈ {0.1, 0.5, 1, 2, 5} according to the validation loss. ...and search over temperatures t = {0.1, 1, 5, 16, 64} using the validation set...
Hardware Specification Yes Results were obtained from a Titan Xp GPU.
Software Dependencies No The paper does not provide specific software dependency versions (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup Yes Following L2X, we use k = 10 for IMDB-word and k = 1 sentences for IMDB-sent. ...We fix k = 9 nearest neighbors to choose from m candidates and search over temperatures t = {0.1, 1, 5, 16, 64} using the validation set... For all experiments, we set t = 0.1 and train for 200 epochs with a batch size of 1000...
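
As an illustration of the Algorithm 1 pseudocode quoted in the Pseudocode row, the following is a minimal Python sketch of non-streaming weighted reservoir sampling, paired with a synthetic check in the spirit of the one quoted in the Research Type row (w = [0.1, 0.2, 0.3, 0.4], k = 2, 10000 samples). It is not taken from the authors' repository: the function names are hypothetical, NumPy is an assumed dependency, and the reference distribution is computed on the assumption that p(S|w) is the distribution of successive weighted sampling without replacement, summed over orderings of each subset. The sketch draws discrete samples only; it does not implement Algorithm 2 or the top-k relaxation, so it should not be expected to reproduce the reported 0.016 figure.

```python
import itertools

import numpy as np


def weighted_reservoir_sample(weights, k, rng):
    """Sketch of Algorithm 1: indices of the k largest random keys."""
    weights = np.asarray(weights, dtype=float)
    u = rng.uniform(0.0, 1.0, size=weights.shape[0])  # u_i ~ Uniform(0, 1)
    keys = u ** (1.0 / weights)                        # r_i = u_i^(1/w_i)
    return np.argsort(-keys)[:k]                       # TopK(r, k)


def ordered_sample_prob(weights, ordered):
    """Probability of one ordered draw under successive sampling
    without replacement (assumed form of the ordered distribution)."""
    remaining = float(sum(weights))
    prob = 1.0
    for i in ordered:
        prob *= weights[i] / remaining
        remaining -= weights[i]
    return prob


def subset_distribution(weights, k):
    """Unordered subset probabilities: sum over orderings of each subset."""
    n = len(weights)
    return {
        s: sum(ordered_sample_prob(weights, p) for p in itertools.permutations(s))
        for s in itertools.combinations(range(n), k)
    }


def empirical_tv_distance(weights, k, num_samples, seed=0):
    """Total variation distance between sampled and reference subset frequencies."""
    rng = np.random.default_rng(seed)
    true = subset_distribution(weights, k)
    counts = dict.fromkeys(true, 0)
    for _ in range(num_samples):
        idx = weighted_reservoir_sample(weights, k, rng)
        counts[tuple(sorted(int(i) for i in idx))] += 1
    return 0.5 * sum(abs(counts[s] / num_samples - true[s]) for s in true)


if __name__ == "__main__":
    tv = empirical_tv_distance([0.1, 0.2, 0.3, 0.4], k=2, num_samples=10000)
    print(f"total variation distance: {tv:.4f}")
```

Running the script prints the empirical total variation distance for one seed; the value varies with the seed and the number of samples.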