Meta-learning for Mixed Linear Regression
Authors: Weihao Kong, Raghav Somani, Zhao Song, Sham Kakade, Sewoong Oh
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments. We set d = 8k, p = 1_k/k, s = 1_k, and P_x and P_ϵ are standard Gaussian distributions. Given an estimated subspace with estimation error 0.1, the clustering step is performed with n_H = max{k^{3/2}, 256} tasks for various t_H. The minimum t_H such that the clustering accuracy is above 99% for at least a 1 − δ fraction of 10 random trials is denoted by t_min(1 − δ). Figure 3 and Table 3 illustrate the dependence of t_min(0.5) and t_min(0.9) on k. More experimental results are provided in Appendix E. Figure 2 caption: Bayes optimal estimator achieves smaller errors for an example. Here, k = 32, d = 256, W^T W = I_k, s = 1_k, p = 1_k/k, and P_x and P_ϵ are standard Gaussian distributions. The parameters were learnt using the Meta-learning part of Algorithm 1 as a continuation of the simulations discussed in Appendix E, where we provide extensive experiments confirming our analyses. |
| Researcher Affiliation | Academia | ¹University of Washington, Seattle, Washington, USA; ²Princeton University / Institute for Advanced Study. Correspondence to: Weihao Kong <kweihao@gmail.com>, Raghav Somani <raghavs@cs.washington.edu>, Zhao Song <zhaos@ias.edu>, Sham Kakade <sham@cs.washington.edu>, Sewoong Oh <sewoong@cs.washington.edu>. |
| Pseudocode | Yes | Algorithm 1 Meta-learning, Algorithm 2 Subspace estimation, Algorithm 3 Clustering and estimation, Algorithm 4 Classification and estimation (a hedged sketch of the moment-based subspace-estimation idea follows this table) |
| Open Source Code | No | The paper does not provide any explicit statements about making the source code available, nor does it include links to a code repository. |
| Open Datasets | No | The paper defines the task setup and data-generation process (e.g., 'tasks are linear regressions', 'x_{i,j} ~ P_x'), but does not specify or provide access information for any publicly available or open dataset used for training. |
| Dataset Splits | No | The paper does not explicitly provide specific dataset split information (e.g., exact percentages, sample counts, or citations to predefined splits) for training, validation, or testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers, used for the experiments. |
| Experiment Setup | Yes | Experiments. We set d = 8k, p = 1_k/k, s = 1_k, and P_x and P_ϵ are standard Gaussian distributions. Given an estimated subspace with estimation error 0.1, the clustering step is performed with n_H = max{k^{3/2}, 256} tasks for various t_H. Here, k = 32, d = 256, W^T W = I_k, s = 1_k, p = 1_k/k, and P_x and P_ϵ are standard Gaussian distributions. (A hedged simulation sketch of this data-generation setup follows the table.) |
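To make the quoted setup concrete, below is a minimal simulation sketch of the data-generation process described in the Experiment Setup row: k regression components mixed uniformly (p = 1_k/k), standard-Gaussian covariates and noise (P_x, P_ϵ), unit noise scales (s = 1_k), and d = 8k. The function name, the use of NumPy, and giving W orthonormal columns (so that W^T W = I_k, as in the Figure 2 caption) are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def generate_mlr_tasks(n_tasks, t, k, d, rng):
    """Sketch of the mixed linear regression task generator quoted above.

    Each task i draws one of k regression vectors uniformly at random
    (p = 1_k/k), then t examples with x_{i,j} ~ N(0, I_d) and unit
    Gaussian noise (s = 1_k). W gets orthonormal columns so that
    W^T W = I_k -- an assumption matching the Figure 2 caption.
    """
    W, _ = np.linalg.qr(rng.standard_normal((d, k)))  # d x k, W^T W = I_k
    labels = rng.integers(k, size=n_tasks)            # uniform mixture p = 1_k/k
    X = rng.standard_normal((n_tasks, t, d))          # x_{i,j} ~ P_x = N(0, I_d)
    eps = rng.standard_normal((n_tasks, t))           # noise ~ P_eps = N(0, 1)
    B = W[:, labels].T                                # per-task regression vector
    y = np.einsum("itd,id->it", X, B) + eps           # y_{i,j} = <w, x_{i,j}> + eps
    return X, y, W, labels

rng = np.random.default_rng(0)
k = 32
# n_H = max{k^{3/2}, 256} tasks, as in the quoted clustering experiment
X, y, W, labels = generate_mlr_tasks(max(int(k**1.5), 256), t=4, k=k, d=8 * k, rng=rng)
```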
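Algorithm 2 (subspace estimation) is only named in the Pseudocode row. Continuing from the generator sketch above, the snippet below illustrates the moment-based idea behind such estimators: average the symmetrized products y_{i,1} y_{i,2} x_{i,1} x_{i,2}^T across tasks, whose expectation is Σ_l p_l w_l w_l^T, and keep the top-k eigenvectors. This is a generic sketch of that technique, not a verbatim reimplementation of the paper's Algorithm 2, and the error metric at the end is one common choice for "subspace estimation error", not necessarily the paper's.

```python
def estimate_subspace(X, y, k):
    """Generic moment-based subspace estimator (not the paper's exact Algorithm 2).

    For a task with regression vector w, E[y_1 y_2 x_1 x_2^T] = w w^T, so the
    task average estimates sum_l p_l w_l w_l^T; its top-k eigenvectors
    approximately span the span of the k regression vectors.
    """
    n, t, d = X.shape
    assert t >= 2, "needs at least two examples per task"
    M = np.zeros((d, d))
    for xi, yi in zip(X, y):
        outer = yi[0] * yi[1] * np.outer(xi[0], xi[1])
        M += (outer + outer.T) / 2.0      # symmetrize so eigh applies
    M /= n
    _, eigvecs = np.linalg.eigh(M)        # eigenvalues in ascending order
    return eigvecs[:, -k:]                # top-k eigenvectors as a d x k basis

U = estimate_subspace(X, y, k)
# One common notion of subspace estimation error: how much of the true W
# lies outside span(U) (spectral norm of the residual).
err = np.linalg.norm(W - U @ (U.T @ W), ord=2)
```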