Flexible Modeling of Diversity with Strongly Log-Concave Distributions

Authors: Joshua Robinson, Suvrit Sra, Stefanie Jegelka

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 6 Experiments: In this section we empirically evaluate the mixing time of Algorithm 1. We use the standard potential scale reduction factor metric to measure convergence to the stationary distribution [11]. The method involves running several chains in parallel and computing the average variance within each chain and between the chains. The PSRF score is the ratio of the between variance over the within variance and is usually above 1. When the PSRF score is close to 1 then the chains are considered to be mixed. In all of our experiments we run three chains in parallel and declare them to be mixed once the PSRF score drops below 1.05. Figure 1 considers the results of running the Metropolis-Hastings algorithm on a sequence of problems with different cardinality constraints $d$. In each case we considered the distribution $\det(L_S)\,\mathbf{1}\{|S| \le d\}$ where $L$ is a randomly generated $250 \times 250$ PSD matrix. Here $L_S$ denotes the $|S| \times |S|$ submatrix of $L$ whose indices belong to $S$. These simulations suggest that the mixing time grows linearly in $d$ for a fixed $n$.
Researcher Affiliation | Academia | Joshua Robinson, Massachusetts Institute of Technology, joshrob@mit.edu; Suvrit Sra, Massachusetts Institute of Technology, suvrit@mit.edu; Stefanie Jegelka, Massachusetts Institute of Technology, stefje@csail.mit.edu
Pseudocode | Yes | Algorithm 1: Metropolis-Hastings sampler for $\nu_{sh}$ with proposal $Q$; Algorithm 2: Distorted greedy weak submodular constrained maximization of $\nu = \eta - c$
Open Source Code | No | No explicit statement or link regarding the release of source code for the methodology described in this paper was found.
Open Datasets | No | In each case we considered the distribution $\det(L_S)\,\mathbf{1}\{|S| \le d\}$ where $L$ is a randomly generated $250 \times 250$ PSD matrix. Here $L_S$ denotes the $|S| \times |S|$ submatrix of $L$ whose indices belong to $S$. (This describes how the data is generated, not a publicly accessible dataset with concrete access info.)
Dataset Splits | No | No specific dataset split information (e.g., percentages, sample counts for train/validation/test) was found.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running experiments were mentioned.
Software Dependencies | No | No specific ancillary software details with version numbers were mentioned.
Experiment Setup | Yes | In all of our experiments we run three chains in parallel and declare them to be mixed once the PSRF score drops below 1.05. In each case we considered the distribution $\det(L_S)\,\mathbf{1}\{|S| \le d\}$ where $L$ is a randomly generated $250 \times 250$ PSD matrix. ... In each case we considered the distribution $\nu(S) \propto \sqrt{\det(L_S)}\,\mathbf{1}\{|S| \le 40\}$ where $L$ is a randomly generated PSD matrix of appropriate size $n$.
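
The convergence check quoted above (three parallel chains, declared mixed once the PSRF falls below 1.05) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' code: the function names `psrf`, `log_target`, and `mh_chain` are hypothetical, the PSRF uses the common Gelman-Rubin form for a scalar statistic, the random PSD matrix is built as A Aᵀ / n, and the sampler is a plain single-element toggle Metropolis-Hastings walk standing in for the paper's Algorithm 1, which is not reproduced here.

```python
import numpy as np


def psrf(chains):
    """Gelman-Rubin PSRF for a scalar statistic; chains has shape (m, n)."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    within = chains.var(axis=1, ddof=1).mean()       # W: mean within-chain variance
    between = n * chains.mean(axis=1).var(ddof=1)    # B: between-chain variance
    pooled = (n - 1) / n * within + between / n      # pooled variance estimate
    return np.sqrt(pooled / within)


def log_target(L, S, d):
    """Unnormalized log-density log det(L_S), restricted to |S| <= d."""
    if len(S) > d:
        return -np.inf
    if len(S) == 0:
        return 0.0  # determinant of the empty submatrix is 1 by convention
    idx = sorted(S)
    sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
    return logdet if sign > 0 else -np.inf


def mh_chain(L, d, steps, rng):
    """Toggle-one-element Metropolis-Hastings walk (illustrative stand-in only)."""
    n = L.shape[0]
    S, cur = set(), 0.0
    trace = np.empty(steps)
    for t in range(steps):
        i = int(rng.integers(n))
        proposal = S ^ {i}                 # add i if absent, remove it if present
        new = log_target(L, proposal, d)
        # The proposal is symmetric, so the acceptance ratio is just the density ratio.
        if np.log(rng.random()) < new - cur:
            S, cur = proposal, new
        trace[t] = cur                     # track the log-density as the scalar statistic
    return trace


# Small demo with assumed sizes (the source uses n = 250).
rng = np.random.default_rng(0)
n, d = 30, 5
A = rng.standard_normal((n, n))
L = A @ A.T / n                            # assumed construction of a random PSD matrix
traces = np.stack([mh_chain(L, d, 2000, rng) for _ in range(3)])
score = psrf(traces[:, 1000:])             # discard the first half as burn-in
print(f"PSRF = {score:.3f}; below 1.05 threshold: {score < 1.05}")
```

The tracked statistic (the log-density of the current set) is one arbitrary choice; the source does not say which statistic the PSRF was computed on.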