reproducibilityindex.ai

Estimating Unknown Population Sizes Using the Hypergeometric Distribution

Authors: Liam Hodgson, Danilo Bzdok

ICML 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Empirical data simulation demonstrates that our method outperforms other likelihood functions used to model count data, both in terms of accuracy of population size estimate and learning an informative latent space.
Researcher Affiliation	Academia	1 McGill University, Montréal, Canada; 2 Mila - Québec Artificial Intelligence Institute.
Pseudocode	Yes	Algorithm 1 Dataset simulation
Open Source Code	No	The paper does not provide explicit statements or links indicating that the source code for the methodology is open-source or publicly available.
Open Datasets	Yes	We test this hypothesis using the CommonLit Ease of Readability (CLEAR) Corpus (Crossley et al., 2023), an open-source dataset consisting of almost 5000 text excerpts sourced from Grade 3-12 reading curricula.
Dataset Splits	No	The paper describes training models on datasets but does not provide specific details on train/validation/test splits, percentages, or explicit methodologies for splitting data for reproducibility of model evaluation.
Hardware Specification	No	The paper does not provide specific details regarding the hardware (e.g., GPU/CPU models, memory) used for running experiments.
Software Dependencies	No	The paper does not specify software dependencies with version numbers (e.g., specific library versions for Python, PyTorch, or other tools).
Experiment Setup	Yes	Model and training hyperparameters are given in Appendix B. Table 2 provides: Encoder layers 128, 128; Decoder layers 128, 128; Latent space dimension 10; Learning rate 0.01; Batch size 100; Violation penalty (min/max) 1 (for Simulated and CLEAR) or 1/100 (for SPIKE).
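
The hyperparameters listed in the Experiment Setup row could be collected into a single configuration object along the following lines. This is only a sketch, not the authors' code: the class name `ExperimentConfig`, the field names, and the helper `config_for` are hypothetical. It also assumes that "1/100" in the violation penalty entry means a minimum of 1 and a maximum of 100, and that a single value of 1 means both bounds equal 1.

```python
from dataclasses import dataclass, asdict
from typing import Tuple


@dataclass(frozen=True)
class ExperimentConfig:
    """Hypothetical container for the values reported in the paper's Table 2."""

    encoder_layers: Tuple[int, ...] = (128, 128)
    decoder_layers: Tuple[int, ...] = (128, 128)
    latent_dim: int = 10
    learning_rate: float = 0.01
    batch_size: int = 100
    # Assumption: a single reported value of 1 means min == max == 1.
    violation_penalty_min: float = 1.0
    violation_penalty_max: float = 1.0


def config_for(dataset: str) -> ExperimentConfig:
    """Return the settings for one of the three datasets named in the table."""
    name = dataset.strip().upper()
    if name in ("SIMULATED", "CLEAR"):
        return ExperimentConfig()
    if name == "SPIKE":
        # Assumption: "1/100" is read as min = 1, max = 100.
        return ExperimentConfig(violation_penalty_max=100.0)
    raise ValueError(f"Unknown dataset: {dataset!r}")


if __name__ == "__main__":
    for dataset in ("Simulated", "CLEAR", "SPIKE"):
        print(dataset, asdict(config_for(dataset)))
```

Because hardware, software versions, and dataset splits are not reported, a configuration like this pins down only part of what a reproduction would need.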