The Well-Tempered Lasso
Authors: Yuanzhi Li, Yoram Singer
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We performed two sets of experiments. Our path-following implementation uses Python with Float128 for high-accuracy computations. In the first set of experiments, we start with the exponential-complexity construction for X_h ∈ R^{d×d} from (Mairal & Yu, 2012), whose Lasso path has (3^d + 1)/2 line segments. We artificially added i.i.d. Gaussian noise of mean zero and variance σ^2 to each entry of X_h. In this setting, the largest value of the entries of X_h is 1 and y is an all-one vector. We show the effect of the dimension d and the smoothing σ on N(P). We report the average over 100 random choices of smoothing per X_h. As can be seen from the figure below, even for a tiny amount of entry-wise noise of 10^-10, the number of linear segments shrinks dramatically. We also include a full table of results, where 1/SNR denotes log10(σ). For the next experiment we use the MNIST data set. |
| Researcher Affiliation | Collaboration | 1) Department of Computer Science, Princeton University; 2) Google Brain. |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described. |
| Open Datasets | Yes | For the next experiment we use the MNIST data set. We randomly selected n = 1000 images from the data set. We constructed the data matrix X ∈ R^{n×d^2} such that the i-th row of X is a randomly chosen patch of size d × d from the i-th image. We cast the center pixel of patch i as the target y_i and discard that pixel from X. |
| Dataset Splits | No | The paper mentions selecting random samples and patches but does not provide specific percentages, counts, or methods for training, validation, or test splits, nor does it specify cross-validation. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments. It only mentions that the implementation uses Python with Float128. |
| Software Dependencies | No | The paper mentions a Python implementation with Float128 for high-accuracy computations, but does not specify version numbers or provide a dependency list. |
| Experiment Setup | Yes | In the first set of experiments, we start with the exponential-complexity construction for X_h ∈ R^{d×d} from (Mairal & Yu, 2012), whose Lasso path has (3^d + 1)/2 line segments. We artificially added i.i.d. Gaussian noise of mean zero and variance σ^2 to each entry of X_h. In this setting, the largest value of the entries of X_h is 1 and y is an all-one vector. ... For the next experiment we use the MNIST data set. We randomly selected n = 1000 images from the data set. We constructed the data matrix X ∈ R^{n×d^2} such that the i-th row of X is a randomly chosen patch of size d × d from the i-th image. We cast the center pixel of patch i as the target y_i and discard that pixel from X. |
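The smoothing experiment described above can be sketched as follows. This is a minimal illustration rather than the authors' Float128 path-following implementation: it uses scikit-learn's `lars_path` (each returned knot is a breakpoint of the piecewise-linear Lasso path) and a normalized random matrix as a stand-in for the Mairal & Yu construction, which is not reproduced here.

```python
import numpy as np
from sklearn.linear_model import lars_path

rng = np.random.default_rng(0)
d = 8

# Stand-in design matrix; the paper instead uses the exponential-complexity
# construction of Mairal & Yu (2012). Entries scaled so the largest is 1,
# and y is the all-one vector, matching the paper's setting.
X = rng.standard_normal((d, d))
X /= np.abs(X).max()
y = np.ones(d)

def num_segments(X, y):
    """Count the linear segments of the Lasso regularization path."""
    alphas, _, _ = lars_path(X, y, method="lasso")
    return len(alphas) - 1  # number of knots minus one

# Entry-wise Gaussian smoothing with a tiny sigma, as in the experiment.
sigma = 1e-10
X_noisy = X + rng.normal(0.0, sigma, size=X.shape)

print(num_segments(X, y), num_segments(X_noisy, y))
```

In the paper the comparison is run over 100 random smoothings per matrix and averaged; the shrinkage in segment count is visible only for the adversarial construction, not for a generic random matrix like the one used here.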