reproducibilityindex.ai

Why the Rich Get Richer? On the Balancedness of Random Partition Models

Authors: Changwoo J Lee, Huiyan Sang

ICML 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section, we demonstrate the effectiveness of balance-seeking random partition for the ER task using the Survey of Income and Program Participation data (U.S. Census Bureau, 2009).
Researcher Affiliation	Academia	Department of Statistics, Texas A&M University, Texas, USA.
Pseudocode	No	The paper describes algorithms and inference steps in text and mathematical formulas (e.g., in Appendix C), but it does not present them in a structured pseudocode or algorithm block format.
Open Source Code	Yes	The software for the described posterior inference algorithms will be available in R package microclustr (Steorts et al., 2020).
Open Datasets	Yes	We use the same dataset (SIPP1000) that Betancourt et al. (2020) used to benchmark the performance; the database with n = 4116 (number of records) and K+ = 1000 (number of entities) was collected from the five waves of the longitudinal survey performed between 2005-2006 (U.S. Census Bureau, 2009).
Dataset Splits	No	The paper conducts experiments and simulation studies, but does not specify training, validation, or testing splits for the datasets used. Evaluation metrics like FNR and FDR are reported on the dataset directly, implying a full-data evaluation rather than a split methodology.
Hardware Specification	Yes	All computations were performed on an Intel E5-2690 v3 CPU with 128GB of memory.
Software Dependencies	No	The paper mentions that software will be available in 'R package microclustr', but it does not specify version numbers for R, the package, or any other software dependencies, which are necessary for reproducibility.
Experiment Setup	Yes	Hyperpriors and MCMC specification details are described in Appendix D. We collect 15000 posterior samples after 5000 burn-in iterations, where we update cluster indicators (z_i) for each individual within each one (global) MCMC iteration.
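
The FNR and FDR figures mentioned under Dataset Splits are, in entity resolution work, commonly computed over pairs of records rather than over a held-out split. The following is a minimal Python sketch under that assumption: it uses the standard pairwise definitions, which may differ in detail from the ones used in the paper, and the function name `pairwise_error_rates` is hypothetical and not part of microclustr.

```python
from itertools import combinations


def pairwise_error_rates(true_labels, est_labels):
    """Return (FNR, FDR) computed over all unordered pairs of records.

    Assumed definitions (standard pairwise form, not quoted from the paper):
      FNR = FN / (TP + FN)  share of truly linked pairs that were missed
      FDR = FP / (TP + FP)  share of estimated links that are wrong
    """
    if len(true_labels) != len(est_labels):
        raise ValueError("label sequences must have the same length")

    tp = fp = fn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_true = true_labels[i] == true_labels[j]
        same_est = est_labels[i] == est_labels[j]
        if same_true and same_est:
            tp += 1          # correctly linked pair
        elif same_est:
            fp += 1          # linked by the estimate, but different entities
        elif same_true:
            fn += 1          # same entity, but the estimate missed the link

    # Guard against empty denominators (no true links or no estimated links).
    fnr = fn / (tp + fn) if (tp + fn) else 0.0
    fdr = fp / (tp + fp) if (tp + fp) else 0.0
    return fnr, fdr


# Toy example: five records, three true entities.
true_labels = [0, 0, 1, 1, 2]
est_labels = [0, 0, 1, 2, 2]
fnr, fdr = pairwise_error_rates(true_labels, est_labels)
print(fnr, fdr)
```

The double loop is quadratic in the number of records, which is workable for n = 4116 but would be replaced by a count-based formula for larger datasets. In an MCMC setting such as the one described under Experiment Setup, these rates would typically be evaluated on a point estimate of the partition or averaged over the retained posterior samples.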