Union of Intersections (UoI) for Interpretable Data Driven Discovery and Prediction
Authors: Kristofer Bouchard, Alejandro Bujan, Fred Roosta, Shashanka Ubaru, Mr. Prabhat, Antoine Snijders, Jian-Hua Mao, Edward Chang, Michael W. Mahoney, Sharmodeep Bhattacharya
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform extensive numerical investigation to evaluate a UoI algorithm (UoI_Lasso) on synthetic and real data. All numerical results used 100 random sub-samplings with replacement of 80-10-10 cross-validation to estimate model parameters (80%), choose optimal meta-parameters (e.g., λ, 10%), and determine prediction quality (10%). |
| Researcher Affiliation | Academia | Biological Systems and Engineering Division, LBNL. Redwood Center, UC Berkeley. ICSI and Department of Statistics, UC Berkeley. Department of Computer Science and Engineering, University of Minnesota. NERSC, LBNL. Department of Neurological Surgery, UC San Francisco. Department of Statistics, Oregon State University. |
| Pseudocode | Yes | Figure 1: The basic UoI framework. (c) A data-distributed version of the UoI_Lasso algorithm. |
| Open Source Code | No | The paper mentions "a distributed Python-MPI implementation" but does not provide a link or explicit statement that the source code for their methodology is publicly available. |
| Open Datasets | Yes | Neurobiology seeks to understand the brain across multiple spatio-temporal scales, from molecules-to-minds. We first tackled the problem of graph formation from multi-electrode (p = 86 electrodes) neural recordings taken directly from the surface of the human brain during speech production (n = 45 trials each). See [7] for details. [7] K. E. Bouchard, N. Mesgarani, K. Johnson, and E. F. Chang. Functional organization of human sensorimotor cortex for speech articulation. Nature, 495(7441):327-332, 2013. We analyzed data from n = 365 mice (173 female, 192 male) that are part of the genetically diverse Collaborative Cross cohort. See [14] for details. [14] J.-H. Mao, S. A. Langley, Y. Huang, M. Hang, K. E. Bouchard, S. E. Celniker, J. B. Brown, J. K. Jansson, G. H. Karpen, and A. M. Snijders. Identification of genetic factors that modify motor performance and body weight using collaborative cross mice. Scientific Reports, 5:16247, 2015. |
| Dataset Splits | Yes | All numerical results used 100 random sub-samplings with replacement of 80-10-10 cross-validation to estimate model parameters (80%), choose optimal meta-parameters (e.g., λ, 10%), and determine prediction quality (10%). |
| Hardware Specification | No | The paper discusses a "distributed Python-MPI implementation" and parallelization aspects, but it does not specify any hardware details such as CPU/GPU models, memory, or specific computing environments used for the experiments. |
| Software Dependencies | No | The paper mentions a "Python-MPI implementation" but does not provide specific version numbers for Python, MPI, or any other software libraries or dependencies used in their experiments. |
| Experiment Setup | Yes | All numerical results used 100 random sub-samplings with replacement of 80-10-10 cross-validation to estimate model parameters (80%), choose optimal meta-parameters (e.g., λ, 10%), and determine prediction quality (10%). For any regularized regression method like in (2), a decrease in the penalization parameter (λ) tends to increase the number of false positives, and an increase in λ tends to increase false negatives. The numbers of bootstrap resamples in the intersection step (B1) and in the union step (B2) are discussed as parameters controlling false positives, false negatives, and estimate stability. |
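
The 80-10-10 evaluation protocol quoted in the table can be illustrated with a minimal sketch. It assumes one plausible reading of the quoted sentence: each of the 100 repetitions draws a fresh random partition of the samples into 80% for parameter estimation, 10% for meta-parameter (λ) selection, and 10% for measuring prediction quality. The function name `random_splits`, its parameters, and the seed are hypothetical and are not taken from the paper or its implementation.

```python
import random


def random_splits(n_samples, n_repeats=100, train_pct=80, select_pct=10, seed=0):
    """Yield (train, select, test) index lists for repeated random splits.

    Assumption: every repetition reshuffles all sample indices, so a sample
    may land in different subsets across repetitions.
    """
    rng = random.Random(seed)
    # Integer arithmetic avoids floating-point rounding in the subset sizes.
    n_train = n_samples * train_pct // 100
    n_select = n_samples * select_pct // 100
    for _ in range(n_repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        train = idx[:n_train]                       # estimate model parameters
        select = idx[n_train:n_train + n_select]    # choose lambda
        test = idx[n_train + n_select:]             # determine prediction quality
        yield train, select, test


# Example with n = 365, the number of mice reported in the table.
splits = list(random_splits(n_samples=365))
train, select, test = splits[0]
print(len(splits), len(train), len(select), len(test))
```

In this sketch the test subset absorbs the remainder when the sample count is not divisible by ten; whether the original experiments handled the remainder this way is not stated in the source.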