reproducibilityindex.ai

Union of Intersections (UoI) for Interpretable Data Driven Discovery and Prediction

Authors: Kristofer Bouchard, Alejandro Bujan, Fred Roosta, Shashanka Ubaru, Mr. Prabhat, Antoine Snijders, Jian-Hua Mao, Edward Chang, Michael W. Mahoney, Sharmodeep Bhattacharya

NeurIPS 2017

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We perform extensive numerical investigation to evaluate a UoI algorithm (UoI_Lasso) on synthetic and real data. All numerical results used 100 random sub-samplings with replacement of 80-10-10 cross-validation to estimate model parameters (80%), choose optimal meta-parameters (e.g., λ, 10%), and determine prediction quality (10%).
Researcher Affiliation	Academia	Biological Systems and Engineering Division, LBNL. Redwood Center, UC Berkeley. ICSI and Department of Statistics, UC Berkeley. Department of Computer Science and Engineering, University of Minnesota. NERSC, LBNL. Biological Systems and Engineering Division, LBNL. Department of Neurological Surgery, UC San Francisco. Department of Statistics, Oregon State University.
Pseudocode	Yes	Figure 1: The basic UoI framework. (c) A data-distributed version of the UoI_Lasso algorithm.
Open Source Code	No	The paper mentions "a distributed Python-MPI implementation" but does not provide a link or explicit statement that the source code for their methodology is publicly available.
Open Datasets	Yes	Neurobiology seeks to understand the brain across multiple spatio-temporal scales, from molecules-to-minds. We first tackled the problem of graph formation from multi-electrode (p = 86 electrodes) neural recordings taken directly from the surface of the human brain during speech production (n = 45 trials each). See [7] for details. [7] K. E. Bouchard, N. Mesgarani, K. Johnson, and E. F. Chang. Functional organization of human sensorimotor cortex for speech articulation. Nature, 495(7441):327–332, 2013. We analyzed data from n = 365 mice (173 female, 192 male) that are part of the genetically diverse Collaborative Cross cohort. See [14] for details. [14] J.-H. Mao, S. A. Langley, Y. Huang, M. Hang, K. E. Bouchard, S. E. Celniker, J. B. Brown, J. K. Jansson, G. H. Karpen, and A. M. Snijders. Identification of genetic factors that modify motor performance and body weight using collaborative cross mice. Scientific Reports, 5:16247, 2015.
Dataset Splits	Yes	All numerical results used 100 random sub-samplings with replacement of 80-10-10 cross-validation to estimate model parameters (80%), choose optimal meta-parameters (e.g., λ, 10%), and determine prediction quality (10%).
Hardware Specification	No	The paper discusses a "distributed Python-MPI implementation" and parallelization aspects, but it does not specify any hardware details such as CPU/GPU models, memory, or specific computing environments used for the experiments.
Software Dependencies	No	The paper mentions a "Python-MPI implementation" but does not provide specific version numbers for Python, MPI, or any other software libraries or dependencies used in their experiments.
Experiment Setup	Yes	All numerical results used 100 random sub-samplings with replacement of 80-10-10 cross-validation to estimate model parameters (80%), choose optimal meta-parameters (e.g., λ, 10%), and determine prediction quality (10%). For any regularized regression method like in (2), a decrease in the penalization parameter (λ) tends to increase the number of false positives, and an increase in λ tends to increase false negatives. The numbers of bootstrap resamples in the intersection step (B1) and in the union step (B2) are discussed as parameters controlling false positives, false negatives, and estimate stability.
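
The 80-10-10 resampling quoted in the Dataset Splits and Experiment Setup rows can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_80_10_10`, the seed, and the rounding rule are hypothetical, and it assumes each of the 100 repetitions is an independent random shuffle cut into disjoint 80%/10%/10% parts, so a sample may reappear across repetitions but never within one. The paper's exact meaning of "with replacement" is not stated in the quoted text.

```python
import random


def split_80_10_10(n_samples, n_repeats=100, seed=0):
    """Yield (train, select, test) index lists, one triple per repetition.

    Assumption: each repetition reshuffles all indices independently and
    cuts them into disjoint 80% / 10% / 10% parts.
    """
    rng = random.Random(seed)
    n_train = n_samples * 8 // 10   # 80%: estimate model parameters
    n_select = n_samples // 10      # 10%: choose meta-parameters (e.g. lambda)
    for _ in range(n_repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        train = idx[:n_train]
        select = idx[n_train:n_train + n_select]
        test = idx[n_train + n_select:]  # remainder: prediction quality
        yield train, select, test


# Example with the mouse cohort size quoted above (n = 365).
splits = list(split_80_10_10(n_samples=365))
```

Integer arithmetic is used for the part sizes so that rounding is deterministic; any leftover samples fall into the held-out test part.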