Learning-to-Learn Stochastic Gradient Descent with Biased Regularization
Authors: Giulia Denevi, Carlo Ciliberto, Riccardo Grazzi, Massimiliano Pontil
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments demonstrate the effectiveness of our approach in practice. |
| Researcher Affiliation | Academia | Istituto Italiano di Tecnologia, Genoa, Italy; University of Genoa, Genoa, Italy; Imperial College London, London, United Kingdom; University College London, London, United Kingdom. |
| Pseudocode | Yes | Algorithm 1 (Within-Task Algorithm: SGD on the Biased Regularized True Risk) and Algorithm 2 (Meta-Algorithm: SGD on Ê with Subgradients). |
| Open Source Code | Yes | Code available at https://github.com/prolearner/online_LTL |
| Open Datasets | Yes | We run experiments on the computer survey data from (Lenk et al., 1996) |
| Dataset Splits | No | The paper mentions that parameters were "tuned by validation" and describes the generation of synthetic data (e.g., "inputs were uniformly sampled on the unit sphere", "labels were generated as y = ⟨x, w_μ⟩ + ε"), but it does not specify explicit train/validation/test split percentages or sample counts for either the synthetic or real datasets. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as CPU or GPU models, or memory specifications. |
| Software Dependencies | No | The paper does not specify the version numbers for any software dependencies or libraries used in the implementation of the experiments (e.g., Python version, specific machine learning framework versions). |
| Experiment Setup | Yes | In the regression case, the inputs were uniformly sampled on the unit sphere and the labels were generated as y = ⟨x, w_μ⟩ + ε, with ε sampled from a zero-mean Gaussian distribution, with standard deviation chosen to have signal-to-noise ratio equal to 10 for each task. In all experiments, the regularization parameter λ and the stepsize γ were tuned by validation. |
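The synthetic-regression setup quoted above (inputs uniform on the unit sphere, labels y = ⟨x, w_μ⟩ + ε with Gaussian noise scaled to a signal-to-noise ratio of 10) can be sketched as follows. This is a minimal reconstruction, not the authors' code: the sample count, input dimension, the distribution of the task vector w, and the SNR convention (here, a ratio of standard deviations) are all assumptions.

```python
import numpy as np

def sample_task(n=100, d=30, snr=10.0, rng=None):
    """Sketch of one synthetic regression task as described in the paper:
    inputs uniform on the unit sphere, labels y = <x, w> + eps, with the
    noise std chosen so the signal-to-noise ratio equals `snr`.
    n, d, the distribution of w, and the SNR convention are assumptions."""
    rng = np.random.default_rng(rng)
    # Uniform points on the unit sphere: normalize standard Gaussian draws.
    X = rng.standard_normal((n, d))
    X /= np.linalg.norm(X, axis=1, keepdims=True)
    # Task weight vector (its distribution is an assumption).
    w = rng.standard_normal(d)
    signal = X @ w
    # Scale the Gaussian noise so that std(signal) / sigma == snr.
    sigma = signal.std() / snr
    y = signal + rng.normal(0.0, sigma, size=n)
    return X, y, w
```

A meta-learning experiment would draw many such tasks, each with its own w, sharing a common mean w_μ across tasks.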
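The within-task method named in the Pseudocode row (Algorithm 1, SGD on the biased regularized risk) can be illustrated with a minimal sketch. This is in the spirit of the paper's objective, loss + (λ/2)‖w − h‖² with bias vector h, but the squared loss, constant step size, initialization at h, and plain iterate averaging are assumptions; the paper's exact step-size schedule and averaging may differ.

```python
import numpy as np

def biased_sgd(X, y, h, lam=0.1, gamma=0.05, rng=None):
    """Sketch of within-task SGD on a biased regularized objective:
    (1/n) * sum_i (x_i . w - y_i)^2 / 2  +  (lam/2) * ||w - h||^2.
    Squared loss, constant step size gamma, and iterate averaging
    are illustrative assumptions, not the paper's exact algorithm."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    w = h.copy()              # start from the bias vector h
    avg = np.zeros(d)
    for _ in range(n):
        i = rng.integers(n)
        # (Sub)gradient of the squared loss on one random sample.
        g = (X[i] @ w - y[i]) * X[i]
        # The regularizer's gradient pulls w back toward the bias h.
        w -= gamma * (g + lam * (w - h))
        avg += w
    return avg / n            # averaged iterate
```

The meta-algorithm (Algorithm 2) would then update h itself by stochastic (sub)gradient steps across tasks, which is the "learning-to-learn" layer of the paper.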