Learning Sample-Specific Models with Low-Rank Personalized Regression
Authors: Ben Lengerich, Bryon Aragam, Eric P. Xing
NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare personalized regression (hereafter, PR) to four baselines: 1) Population linear or logistic regression, 2) A mixture regression (MR) model, 3) Varying coefficients (VC), 4) Deep neural networks (DNN). First, we evaluate each method's ability to recover the true parameters from simulated data. Then we present three real data case studies, each progressively more challenging than the previous: 1) Stock prediction using financial data, 2) Cancer diagnosis from mass spectrometry data, and 3) Electoral prediction using historical election data. The results are summarized in Table 1 for easy reference. |
| Researcher Affiliation | Academia | Benjamin Lengerich (Carnegie Mellon University, blengeri@cs.cmu.edu); Bryon Aragam (University of Chicago, bryon@chicagobooth.edu); Eric P. Xing (Carnegie Mellon University, epxing@cs.cmu.edu) |
| Pseudocode | Yes | Algorithm 1 Personalized Estimation |
| Open Source Code | Yes | A Python implementation is available at http://www.github.com/blengerich/personalized_regression. |
| Open Datasets | Yes | Here, we investigate the capacity of PR to distinguish malignant from benign skin lesions using a dataset of desorption electrospray ionization mass spectrometry imaging (DESI-MSI) of a common skin cancer, basal cell carcinoma (BCC) [22] (details in supplement). |
| Dataset Splits | No | The paper refers to 'test sets' and 'out-of-sample prediction results', implying a split for evaluation, but does not specify explicit percentages, counts, or a standard citation for train/validation/test dataset splits. |
| Hardware Specification | Yes | With these performance improvements, we are able to fit models to datasets with over 10,000 samples and 1000s of predictors on a Macbook Pro with 16GB RAM in under an hour. |
| Software Dependencies | No | The paper mentions 'A Python implementation' but does not specify specific software dependencies with version numbers. |
| Experiment Setup | Yes | A discussion of hyperparameter selection is contained in Section B.3 of the supplement. and Each personalized estimator is endowed with a personalized learning rate $\gamma^{(i)}_t = \gamma_t / \lVert \hat{\beta}^{(i)}_t - \hat{\beta}^{(\mathrm{pop})} \rVert_1$, which scales the global learning rate $\gamma_t$ according to how far the estimator has traveled. and In our experiments, we use $k_n = 3$. |
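
The personalized learning rate quoted in the Experiment Setup row scales the global step size by the inverse L1 distance between a sample's current estimate and the population estimate, so estimators that have drifted far from the population coefficients take smaller steps. Below is a minimal Python sketch of that scaling; the function name, array arguments, and the `eps` guard against division by zero are illustrative assumptions, not part of the paper's released code.

```python
import numpy as np

def personalized_learning_rate(gamma_t, beta_i, beta_pop, eps=1e-8):
    """Sketch of the quoted rule: gamma_t^(i) = gamma_t / ||beta_t^(i) - beta^(pop)||_1.

    `eps` is an assumed guard for the case where the personalized estimate
    still equals the population estimate (e.g. at initialization); the paper
    does not specify how that case is handled.
    """
    distance = np.linalg.norm(beta_i - beta_pop, ord=1)  # L1 distance traveled
    return gamma_t / max(distance, eps)


# Hypothetical usage: an estimator that has moved away from the population
# coefficients receives a damped step size.
beta_pop = np.array([0.5, -1.2, 0.0])
beta_i = np.array([0.9, -0.7, 0.3])
print(personalized_learning_rate(gamma_t=0.01, beta_i=beta_i, beta_pop=beta_pop))
```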