Learning Sparse Group Models Through Boolean Relaxation
Authors: Yijie Wang, Yuan Zhou, Xiaoqing Huang, Kun Huang, Jie Zhang, Jianzhu Ma
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the power of our equivalent condition by applying it to two ensembles of random problem instances that are challenging and popularly used in the literature, and prove that our method achieves exactness with overwhelming probability and nearly optimal sample complexity. Empirically, we use synthetic datasets to demonstrate that our proposed method significantly outperforms state-of-the-art group sparse learning models in terms of individual and group support recovery when the number of samples is small. Furthermore, we show that our method also outperforms these baselines in cancer drug response prediction. |
| Researcher Affiliation | Collaboration | (1) Computer Science Department, Indiana University Bloomington; (2) Yau Mathematical Sciences Center and Department of Mathematical Sciences, Tsinghua University; (3) Department of Biostatistics & Health Data Science, Indiana University; (4) Department of Medical and Molecular Genetics, Indiana University; (5) Institute for AI Industry Research, Tsinghua University |
| Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks. It describes the rounding method in text but not in a structured algorithm format. |
| Open Source Code | Yes | The codes for the proposed method can be found here: https://anonymous.4open.science/r/L0GL-F107/Readme |
| Open Datasets | Yes | We collect drug response data from the Cancer Therapeutics Response Portal (CTRP) v2 and the Genomics of Drug Sensitivity in Cancer (GDSC) database (Seashore-Ludlow et al., 2015; Yang et al., 2013). |
| Dataset Splits | Yes | For each drug (machine learning task), we hold out 20% of the samples as the test set and use the remaining samples as the training and validation set. (A minimal split sketch appears after this table.) |
| Hardware Specification | Yes | All experiments run on a computer with an 8-core 3.7 GHz Intel CPU and 32 GB of RAM. |
| Software Dependencies | Yes | We apply the projected Quasi-Newton method (Schmidt et al., 2009) to efficiently solve equation 10. ... The projection onto the relaxed constraint set Ω can be efficiently obtained by a commercial solver (we use Gurobi; Gurobi Optimization, LLC, 2022). (See the projection sketch after this table.) |
| Experiment Setup | Yes | We use both simulated datasets (non-overlapping groups) introduced in Sections 3.1 and 3.2 and a real-world application (overlapping groups) in cancer to evaluate the performance. ... For our method, because k and h are given, only the parameter ρ in (1) remains; we select ρ by 5-fold CV in terms of MSE. For the methods that cannot directly control k and h, we sweep their parameters so that they yield the desired k and h. For the real-world application, we select the parameters in terms of out-of-sample MSE. (A hedged CV sketch follows this table.) |
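
The 80/20 hold-out described in the Dataset Splits row can be reproduced with a standard split. The sketch below is a minimal illustration, assuming a hypothetical per-drug feature matrix `X` and response vector `y`; it is not the authors' preprocessing code.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical per-drug data: X (cell lines x features), y (drug responses).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
y = rng.normal(size=200)

# Hold out 20% of the samples as the test set; the remaining 80% is used
# for training and validation, as stated in the paper's split protocol.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
```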
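
The Software Dependencies row notes that the projection onto the relaxed constraint set Ω is computed with Gurobi. The sketch below shows one plausible way to pose such a Euclidean projection as a quadratic program in `gurobipy`; the specific form of Ω assumed here (box-relaxed feature indicators with a feature budget k, group indicators with a group budget h, and feature-implies-group constraints) is an illustrative assumption, not necessarily the exact set used in the paper.

```python
import numpy as np
import gurobipy as gp
from gurobipy import GRB

def project_onto_relaxed_set(v, groups, k, h):
    """Euclidean projection of v onto an assumed box-relaxed constraint set.

    groups: list of index lists, one per (non-overlapping) group.
    k, h:   feature and group budgets, as in the paper's notation.
    """
    p, n_groups = len(v), len(groups)
    m = gp.Model("projection")
    m.Params.OutputFlag = 0

    eta = m.addVars(p, lb=0.0, ub=1.0, name="eta")       # relaxed feature indicators
    z = m.addVars(n_groups, lb=0.0, ub=1.0, name="z")    # relaxed group indicators

    m.addConstr(gp.quicksum(eta[j] for j in range(p)) <= k)          # feature budget
    m.addConstr(gp.quicksum(z[g] for g in range(n_groups)) <= h)     # group budget
    for g, idx in enumerate(groups):
        for j in idx:
            m.addConstr(eta[j] <= z[g])  # a feature is active only if its group is

    # Minimize the squared distance to v.
    m.setObjective(
        gp.quicksum((eta[j] - v[j]) * (eta[j] - v[j]) for j in range(p)),
        GRB.MINIMIZE,
    )
    m.optimize()
    return np.array([eta[j].X for j in range(p)])

# Toy usage: project a random point with 3 groups of 4 features, k=4, h=2.
groups = [list(range(0, 4)), list(range(4, 8)), list(range(8, 12))]
print(project_onto_relaxed_set(np.random.default_rng(0).normal(size=12), groups, k=4, h=2))
```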
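
The Experiment Setup row states that ρ is the only remaining hyperparameter and is chosen by 5-fold cross-validation on MSE. A minimal sketch of that selection loop follows; the grid of ρ values is assumed (the paper does not list the swept values), and ridge regression is only a stand-in so the sketch runs, not the paper's Boolean-relaxation estimator.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import Ridge  # placeholder for the paper's estimator
from sklearn.metrics import mean_squared_error

# Hypothetical training/validation data (e.g., the 80% retained after the hold-out split).
rng = np.random.default_rng(0)
X, y = rng.normal(size=(160, 50)), rng.normal(size=160)

rho_grid = [0.01, 0.1, 1.0, 10.0]  # assumed grid of candidate rho values
kf = KFold(n_splits=5, shuffle=True, random_state=0)

cv_mse = {}
for rho in rho_grid:
    fold_errors = []
    for train_idx, val_idx in kf.split(X):
        # The paper would fit its sparse group model here with penalty rho.
        model = Ridge(alpha=rho).fit(X[train_idx], y[train_idx])
        fold_errors.append(mean_squared_error(y[val_idx], model.predict(X[val_idx])))
    cv_mse[rho] = np.mean(fold_errors)

best_rho = min(cv_mse, key=cv_mse.get)  # rho with the lowest 5-fold CV MSE
print(best_rho, cv_mse[best_rho])
```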