Why the Rich Get Richer? On the Balancedness of Random Partition Models

Authors: Changwoo J Lee, Huiyan Sang

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we demonstrate the effectiveness of balance-seeking random partition for the ER task using the Survey of Income and Program Participation data (U.S. Census Bureau, 2009).
Researcher Affiliation | Academia | Department of Statistics, Texas A&M University, Texas, USA.
Pseudocode | No | The paper describes algorithms and inference steps in text and mathematical formulas (e.g., in Appendix C), but it does not present them in a structured pseudocode or algorithm block format.
Open Source Code | Yes | The software for the described posterior inference algorithms will be available in R package microclustr (Steorts et al., 2020).
Open Datasets | Yes | We use the same dataset (SIPP1000) that Betancourt et al. (2020) used to benchmark the performance; the database with n = 4116 (number of records) and K+ = 1000 (number of entities) was collected from the five waves of the longitudinal survey performed between 2005-2006 (U.S. Census Bureau, 2009).
Dataset Splits | No | The paper conducts experiments and simulation studies, but does not specify training, validation, or testing splits for the datasets used. Evaluation metrics such as FNR and FDR are reported on the dataset directly, implying a full-data evaluation rather than a split methodology.
Hardware Specification | Yes | All computations were performed on an Intel E5-2690 v3 CPU with 128GB of memory.
Software Dependencies | No | The paper mentions that software will be available in 'R package microclustr', but it does not specify version numbers for R, the package, or any other software dependencies, which are necessary for reproducibility.
Experiment Setup | Yes | Hyperpriors and MCMC specification details are described in Appendix D. We collect 15000 posterior samples after 5000 burn-in iterations, where we update cluster indicators (z_i) for each individual within each one (global) MCMC iteration.
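
The Dataset Splits entry mentions FNR and FDR as evaluation metrics reported on the full dataset. As a rough illustration only, the sketch below computes both from a true and an estimated partition, assuming the common pairwise definitions used in entity resolution: FNR is the share of truly linked record pairs that the estimate misses, and FDR is the share of estimated links that are wrong. The function names `linked_pairs` and `pairwise_fnr_fdr` are hypothetical and are not taken from the paper or from the microclustr package, and the paper's exact metric definitions may differ.

```python
from itertools import combinations


def linked_pairs(labels):
    """Return the set of record-index pairs (i, j), i < j, sharing a cluster label."""
    return {
        (i, j)
        for i, j in combinations(range(len(labels)), 2)
        if labels[i] == labels[j]
    }


def pairwise_fnr_fdr(true_labels, est_labels):
    """Pairwise FNR and FDR of an estimated partition (assumed definitions).

    FNR = FN / (TP + FN): true links missing from the estimate.
    FDR = FP / (TP + FP): estimated links that are not true links.
    Either rate is reported as 0.0 when its denominator is zero.
    """
    if len(true_labels) != len(est_labels):
        raise ValueError("label vectors must have the same length")

    true_links = linked_pairs(true_labels)
    est_links = linked_pairs(est_labels)

    tp = len(true_links & est_links)   # links present in both partitions
    fn = len(true_links - est_links)   # true links the estimate misses
    fp = len(est_links - true_links)   # estimated links that are spurious

    fnr = fn / (tp + fn) if (tp + fn) else 0.0
    fdr = fp / (tp + fp) if (tp + fp) else 0.0
    return fnr, fdr


# Toy example: five records, three true entities.
true_z = [0, 0, 1, 1, 2]
est_z = [0, 0, 1, 2, 2]
fnr, fdr = pairwise_fnr_fdr(true_z, est_z)
```

Both rates depend only on which records are grouped together, so relabeling the clusters leaves them unchanged. Enumerating every pair is quadratic in the number of records, so for n = 4116 this means roughly 8.5 million pairs per partition; a count-based formulation would be preferable if the metrics were evaluated on every posterior sample.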