A Screening Rule for ℓ1-Regularized Ising Model Estimation
Authors: Zhaobin Kuang, Sinong Geng, David Page
NeurIPS 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on various datasets demonstrate the efficiency and insights gained from the introduction of the screening rule. Experiments are conducted on both synthetic data and real world data. |
| Researcher Affiliation | Academia | Zhaobin Kuang, Sinong Geng, David Page; University of Wisconsin; zkuang@wisc.edu, sgeng2@wisc.edu, page@biostat.wisc.edu |
| Pseudocode | Yes | Algorithm 1 Blockwise Minimization. 1: Input: dataset X, regularization parameter λ. 2: Output: θ̂. 3: ∀i, j ∈ V such that j > i, compute the second empirical moments E[X_i X_j]'s. 4: Identify the partition {C_1, C_2, ..., C_L} using the second empirical moments from the previous step and according to Witten et al. [2011], Mazumder and Hastie [2012]. 5: ∀l ∈ {1, ..., L}, perform blockwise optimization over C_l for θ̂_l. 6: Ensemble the θ̂_l's according to (3) for θ̂. 7: Return θ̂. (A hedged Python sketch of this procedure appears below the table.) |
| Open Source Code | No | The paper mentions using existing software such as 'glmnet' and the 'TCGA2STAT package' but does not provide access to the authors' own source code for the described methodology. |
| Open Datasets | Yes | Our real world data experiment applies NW with and without screening to a real world gene mutation dataset collected from 178 lung squamous cell carcinoma samples [Weinstein et al., 2013]. Mutation data are extracted via the TCGA2STAT package [Wan et al., 2015] in R |
| Dataset Splits | Yes | We leverage the Stability Approach to Regularization Selection (StARS, Liu et al. 2010) for this task. In a nutshell, StARS learns a set of various models, denoted as M, over Λ using many subsamples that are drawn randomly from the original dataset without replacement. In Figure 2, we summarize the experimental results of model selection, where 24 subsamples are used for pathwise optimization in parallel to construct M. (The subsampling step is sketched below the table.) |
| Hardware Specification | Yes | The experiments are conducted on a PowerEdge R720 server with two Intel(R) Xeon(R) E5-2620 CPUs and 128GB RAM. As many as 24 threads can be run in parallel. |
| Software Dependencies | No | The paper mentions using 'glmnet' in 'R' and the 'TCGA2STAT package' and 'Cytoscape', but it does not provide specific version numbers for any of these software components. |
| Experiment Setup | Yes | We start by choosing a λ1 that reflects the sparse blockwise structural assumption on the data... we choose λ1 such that the number of the absolute second empirical moments that are greater than λ1 is about p log p... We then choose λτ such that the number of absolute second empirical moments that are greater than λτ is about p. In our experiments, we use an evenly spaced Λ with τ = 25. For model selection... 24 subsamples are used for pathwise optimization in parallel... For real world data... we choose τ = 25. 384 trials are run in parallel using all 24 threads. We also choose λ1 such that about 2p log(p) absolute second empirical moments are greater than λ1. We choose λτ such that about 0.25p absolute second empirical moments are greater than λτ. (The construction of this λ grid is sketched below the table.) |
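
The Pseudocode row condenses the screening-plus-blockwise-optimization idea of Algorithm 1. Below is a minimal Python sketch of that procedure, assuming {-1, +1}-coded data and substituting scikit-learn's l1-penalized logistic regression for the glmnet node-wise (NW) fits used in the paper; the function name `blockwise_nw`, the `C = 1/(n*lam)` rescaling, and the symmetrization by averaging are illustrative assumptions, not the authors' implementation.

```python
# Sketch of Algorithm 1 (blockwise minimization via the screening rule).
# Assumes X has entries in {-1, +1}; solver choices differ from the paper's.
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components
from sklearn.linear_model import LogisticRegression

def blockwise_nw(X, lam):
    """X: (n, p) array with entries in {-1, +1}; lam: l1 regularization parameter."""
    n, p = X.shape
    # Step 3: second empirical moments E[X_i X_j].
    M = X.T @ X / n
    # Screening: keep only pairs with |E[X_i X_j]| > lam as candidate edges.
    A = (np.abs(M) > lam) & ~np.eye(p, dtype=bool)
    # Step 4: the blocks C_1, ..., C_L are the connected components of the screened graph.
    n_blocks, labels = connected_components(csr_matrix(A), directed=False)
    theta = np.zeros((p, p))
    # Step 5: l1-regularized node-wise logistic regressions within each block.
    for l in range(n_blocks):
        block = np.flatnonzero(labels == l)
        if block.size < 2:
            continue  # isolated node: no edge parameters to estimate
        for i in block:
            others = block[block != i]
            y = (X[:, i] > 0).astype(int)
            if y.min() == y.max():
                continue  # degenerate column: logistic fit is not identifiable
            clf = LogisticRegression(penalty="l1", solver="liblinear",
                                     C=1.0 / (n * lam))
            clf.fit(X[:, others], y)
            theta[i, others] = clf.coef_.ravel()
    # Step 6: combine node-wise estimates into a symmetric theta-hat
    # (simple averaging here; the paper's ensemble rule (3) may differ).
    return (theta + theta.T) / 2
```

Edge parameters between different blocks are never estimated, which is exactly the saving the screening rule delivers: each block's node-wise regressions only see the variables in that block.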
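
The Dataset Splits row describes StARS-style model selection over subsamples drawn without replacement. The helper below sketches only that subsampling step; the name `stars_subsamples` and the b(n) = ⌊10√n⌋ subsample size are assumptions (the latter is the rule suggested by Liu et al. [2010]), since the paper's quote does not state a subsample size.

```python
# Sketch of the StARS subsampling step: draw subsamples without replacement,
# one per parallel pathwise-optimization run (24 in the paper's experiments).
import numpy as np

def stars_subsamples(X, n_subsamples=24, subsample_size=None, seed=0):
    """Return a list of row-subsampled copies of X, drawn without replacement."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    if subsample_size is None:
        # b(n) = floor(10 * sqrt(n)) per Liu et al. [2010]; an assumption here.
        subsample_size = min(n, int(10 * np.sqrt(n)))
    return [X[rng.choice(n, size=subsample_size, replace=False)]
            for _ in range(n_subsamples)]
```

Each subsample would then be fed through the full regularization path, and StARS picks the λ at which the estimated graphs are sufficiently stable across subsamples.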
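
The Experiment Setup row defines the regularization path Λ implicitly, by how many absolute off-diagonal second empirical moments should exceed λ1 and λτ. A hedged sketch of that construction follows; `lambda_grid` and its default counts are illustrative, and the interpolation between the endpoints (beyond "evenly spaced with τ = 25") is assumed to be linear.

```python
# Sketch of choosing lambda_1 and lambda_tau from the empirical second moments,
# then building an evenly spaced grid of tau = 25 values between them.
import numpy as np

def lambda_grid(X, n_lambdas=25, n_top_first=None, n_top_last=None):
    """Grid whose endpoints are set by how many |E[X_i X_j]| values exceed them."""
    n, p = X.shape
    moments = np.abs(X.T @ X / n)[np.triu_indices(p, k=1)]  # off-diagonal only
    if n_top_first is None:
        n_top_first = int(p * np.log(p))   # ~p log p moments above lambda_1
    if n_top_last is None:
        n_top_last = p                     # ~p moments above lambda_tau
    sorted_m = np.sort(moments)[::-1]      # descending
    lam_first = sorted_m[min(n_top_first, moments.size) - 1]
    lam_last = sorted_m[min(n_top_last, moments.size) - 1]
    return np.linspace(lam_first, lam_last, n_lambdas)
```

For the real-world experiment the same construction would be reused with the quoted counts of roughly 2p log(p) and 0.25p in place of the defaults above.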