Learning Sparse Group Models Through Boolean Relaxation

Authors: Yijie Wang, Yuan Zhou, Xiaoqing Huang, Kun Huang, Jie Zhang, Jianzhu Ma

ICLR 2023

Reproducibility Variable Result LLM Response
Research Type Experimental We demonstrate the power of our equivalent condition by applying it to two ensembles of random problem instances that are challenging and widely used in the literature, and prove that our method achieves exactness with overwhelming probability and nearly optimal sample complexity. Empirically, we use synthetic datasets to demonstrate that our proposed method significantly outperforms state-of-the-art group sparse learning models in terms of individual and group support recovery when the number of samples is small. Furthermore, we show that our method outperforms the alternatives in cancer drug response prediction.
Researcher Affiliation Collaboration (1) Computer Science Department, Indiana University Bloomington; (2) Yau Mathematical Sciences Center and Department of Mathematical Sciences, Tsinghua University; (3) Department of Biostatistics & Health Data Science, Indiana University; (4) Department of Medical and Molecular Genetics, Indiana University; (5) Institute for AI Industry Research, Tsinghua University
Pseudocode No The paper does not contain any explicitly labeled pseudocode or algorithm blocks. It describes the rounding method in text but not in a structured algorithm format.
Open Source Code Yes The code for the proposed method can be found here: https://anonymous.4open.science/r/L0GL-F107/Readme
Open Datasets Yes We collect drug response data from the Cancer Therapeutics Response Portal (CTRP) v2 and the Genomics of Drug Sensitivity in Cancer (GDSC) database (Seashore-Ludlow et al., 2015; Yang et al., 2013).
Dataset Splits Yes For each drug (machine learning task), we hold out 20% of the samples as the test set and use the remaining samples as the training and validation set.
Hardware Specification Yes All experiments run on a computer with an 8-core 3.7 GHz Intel CPU and 32 GB of RAM.
Software Dependencies Yes We apply the projected Quasi-Newton method Schmidt et al. (2009) to efficiently solve equation 10. ... The projection on the relaxed constraint set Ω can be efficiently obtained by a commercial solver (we use Gurobi, Gurobi Optimization, LLC (2022)).
Experiment Setup Yes We use both simulated datasets (non-overlapping groups) introduced in Sections 3.1 and 3.2 and a real-world application (overlapping groups) in cancer to evaluate the performance. ... For our method, because k and h are given, only one parameter, ρ in (1), remains; we select ρ by 5-fold CV in terms of MSE. For the methods that cannot control k and h, we sweep their parameters so that they yield the desired k and h. For the real-world application, we select the parameters in terms of out-of-sample MSE.
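The evaluation protocol quoted above (a 20% held-out test set per task, with the single remaining parameter ρ chosen by 5-fold cross-validation on MSE) can be sketched as follows. This is a minimal illustration, not the authors' implementation: a ridge regressor stands in for the paper's L0-constrained group model, the synthetic data and the ρ grid are invented for the example, and ρ is mapped onto the stand-in model's regularization strength.

```python
import numpy as np
from sklearn.model_selection import train_test_split, KFold
from sklearn.linear_model import Ridge  # stand-in for the paper's sparse group model
from sklearn.metrics import mean_squared_error

# Synthetic regression task (invented for illustration only)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))
w = np.zeros(30)
w[:5] = 1.0  # a small true support
y = X @ w + 0.1 * rng.normal(size=200)

# Hold out 20% of the samples as the test set, as described per drug task
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# 5-fold CV over the single remaining parameter rho, selected by mean MSE
rhos = [0.01, 0.1, 1.0, 10.0]  # hypothetical grid
kf = KFold(n_splits=5, shuffle=True, random_state=0)
cv_mse = {}
for rho in rhos:
    fold_errs = []
    for tr_idx, va_idx in kf.split(X_tr):
        model = Ridge(alpha=rho).fit(X_tr[tr_idx], y_tr[tr_idx])
        fold_errs.append(mean_squared_error(y_tr[va_idx], model.predict(X_tr[va_idx])))
    cv_mse[rho] = float(np.mean(fold_errs))

# Refit on the full training set with the selected rho, then score on the test set
best_rho = min(cv_mse, key=cv_mse.get)
final = Ridge(alpha=best_rho).fit(X_tr, y_tr)
test_mse = mean_squared_error(y_te, final.predict(X_te))
print(best_rho, test_mse)
```

The same skeleton applies to the baselines that cannot fix k and h directly: instead of a one-dimensional grid over ρ, their parameters would be swept until the fitted solution reaches the desired individual and group sparsity levels.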