Meta-learning for Mixed Linear Regression

Authors: Weihao Kong, Raghav Somani, Zhao Song, Sham Kakade, Sewoong Oh

ICML 2020

Reproducibility Variable Result LLM Response
Research Type Experimental Experiments. We set d = 8k, p = 1_k/k, s = 1_k, and P_x and P_ε are standard Gaussian distributions. Given an estimated subspace with estimation error 0.1, the clustering step is performed with n_H = max{k^{3/2}, 256} tasks for various t_H. The minimum t_H such that the clustering accuracy is above 99% for at least a 1 − δ fraction of 10 random trials is denoted by t_min(1 − δ). Figure 3 and Table 3 illustrate the dependence of t_min(0.5) and t_min(0.9) on k. More experimental results are provided in Appendix E. Figure 2. Bayes optimal estimator achieves smaller errors for an example. Here, k = 32, d = 256, W⊤W = I_k, s = 1_k, p = 1_k/k, and P_x and P_ε are standard Gaussian distributions. The parameters were learnt using the Meta-learning part of Algorithm 1 as a continuation of simulations discussed in Appendix E, where we provide extensive experiments confirming our analyses.
Researcher Affiliation Academia 1) University of Washington, Seattle, Washington, USA; 2) Princeton University / Institute for Advanced Study. Correspondence to: Weihao Kong <kweihao@gmail.com>, Raghav Somani <raghavs@cs.washington.edu>, Zhao Song <zhaos@ias.edu>, Sham Kakade <sham@cs.washington.edu>, Sewoong Oh <sewoong@cs.washington.edu>.
Pseudocode Yes Algorithm 1 (Meta-learning), Algorithm 2 (Subspace estimation), Algorithm 3 (Clustering and estimation), Algorithm 4 (Classification and estimation)
Open Source Code No The paper does not provide any explicit statements about making the source code available, nor does it include links to a code repository.
Open Datasets No The paper defines the task setup and data generation process (e.g., 'tasks are linear regressions', 'x_{i,j} ∼ P_x'), but does not specify or provide access information for any publicly available or open dataset used for training.
Dataset Splits No The paper does not explicitly provide specific dataset split information (e.g., exact percentages, sample counts, or citations to predefined splits) for training, validation, or testing.
Hardware Specification No The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies No The paper does not provide specific ancillary software details, such as library names with version numbers, used for the experiments.
Experiment Setup Yes Experiments. We set d = 8k, p = 1_k/k, s = 1_k, and P_x and P_ε are standard Gaussian distributions. Given an estimated subspace with estimation error 0.1, the clustering step is performed with n_H = max{k^{3/2}, 256} tasks for various t_H. Here, k = 32, d = 256, W⊤W = I_k, s = 1_k, p = 1_k/k, and P_x and P_ε are standard Gaussian distributions.
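The quoted setup (tasks are linear regressions with x_{i,j} ∼ P_x and standard Gaussian noise, uniform mixing weights p = 1_k/k, d = 8k) and the subspace-estimation step of Algorithm 2 can be sketched as below. This is a minimal illustration, not the paper's implementation: the task generator, the pairwise-moment estimator (whose expectation is Σ_j p_j w_j w_jᵀ when each task contributes two independent examples), and all function names and parameter choices here are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_tasks(n_tasks, t, k, d, p, noise=1.0):
    """Mixed linear regression: each task draws its regressor from one of
    k clusters (with mixing weights p), then generates t examples."""
    W = rng.standard_normal((k, d))                 # cluster regressors (illustrative scale)
    labels = rng.choice(k, size=n_tasks, p=p)       # hidden cluster of each task
    X = rng.standard_normal((n_tasks, t, d))        # x_{i,j} ~ P_x = N(0, I_d)
    eps = noise * rng.standard_normal((n_tasks, t)) # standard Gaussian noise
    Y = np.einsum("itd,id->it", X, W[labels]) + eps # y_{i,j} = w^T x_{i,j} + eps
    return X, Y, W, labels

def subspace_estimate(X, Y, k):
    """Moment-based subspace estimate: average y_{i,1} y_{i,2} x_{i,1} x_{i,2}^T
    over tasks; its expectation is sum_j p_j w_j w_j^T, so the top-k
    eigenvectors approximate span{w_1, ..., w_k}."""
    n = X.shape[0]
    M = (X[:, 0] * (Y[:, 0] * Y[:, 1])[:, None]).T @ X[:, 1] / n
    M = (M + M.T) / 2                               # symmetrize before eigendecomposition
    _, vecs = np.linalg.eigh(M)                     # eigenvalues in ascending order
    return vecs[:, -k:]                             # orthonormal basis of estimated span

k = 4
d = 8 * k                                           # d = 8k, as in the quoted setup
p = np.ones(k) / k                                  # p = 1_k / k, uniform mixing weights
X, Y, W, _ = make_tasks(n_tasks=100_000, t=2, k=k, d=d, p=p)
U = subspace_estimate(X, Y, k)
# residual of the true regressors outside the estimated subspace
err = np.linalg.norm(W.T - U @ (U.T @ W.T)) / np.linalg.norm(W)
```

With enough tasks of only t = 2 examples each, the estimated subspace captures most of span{w_j}, matching the paper's premise that abundant small tasks suffice for the subspace-estimation stage.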