Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

Discovering Structure in High-Dimensional Data Through Correlation Explanation

Authors: Greg Ver Steeg, Aram Galstyan

NeurIPS 2014 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We demonstrate that Correlation Explanation (CorEx) automatically discovers meaningful structure for data from diverse sources including personality tests, DNA, and human language.
Researcher Affiliation	Academia	Greg Ver Steeg Information Sciences Institute University of Southern California Marina del Rey, CA 90292 EMAIL Aram Galstyan Information Sciences Institute University of Southern California Marina del Rey, CA 90292 EMAIL
Pseudocode	Yes	Algorithm 1: Pseudo-code implementing Correlation Explanation (CorEx)
Open Source Code	Yes	Open source code is available at http://github.com/gregversteeg/CorEx.
Open Datasets	Yes	Data and full list of questions are available at http://personality-testing.info/_rawdata/. Data, descriptions of SNPs, and detailed demographics of subjects is available at ftp://ftp.cephb.fr/hgdp_v3/. [14] K. Bache and M. Lichman. UCI machine learning repository, 2013.
Dataset Splits	No	The paper does not provide specific details on training, validation, or test dataset splits for its experiments. It mentions generating samples but not how they are partitioned for model training and evaluation.
Hardware Specification	No	The paper mentions GPUs in the context of future scalability for neural networks but does not specify the hardware used for its own experiments.
Software Dependencies	No	The paper mentions 'scikit-learn' in a reference but does not specify that it was used in their experiments with a version number, nor does it list other software dependencies with versions.
Experiment Setup	No	The paper describes general aspects of the optimization and mentions that parameters like \lambda and \gamma can be set through arguments described in Sec. B, but the actual concrete values for hyperparameters or other system-level training settings are not provided in the main text.