Dataset Meta-Learning from Kernel Ridge-Regression
Authors: Timothy Nguyen, Zhourong Chen, Jaehoon Lee
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform three sets of experiments to validate the efficacy of KIP and LS for dataset learning. We focus on MNIST (LeCun et al., 2010) and CIFAR-10 (Krizhevsky et al., 2009) datasets for comparison to previous methods. |
| Researcher Affiliation | Industry | Timothy Nguyen, Zhourong Chen, Jaehoon Lee, Google Research, {timothycnguyen, zrchen, jaehlee}@google.com |
| Pseudocode | Yes (see the KIP step sketch below) | Algorithm 1: Kernel Inducing Point (KIP) |
| Open Source Code | Yes | We provide an open source implementation of KIP and LS, available in an interactive Colab notebook: https://colab.research.google.com/github/google-research/google-research/blob/master/kip/KIP.ipynb |
| Open Datasets | Yes | We focus on MNIST (LeCun et al., 2010) and CIFAR-10 (Krizhevsky et al., 2009) datasets for comparison to previous methods. |
| Dataset Splits | No | We could have used a validation dataset for a stopping criterion, but that would have required reducing the target dataset from the entire training dataset. |
| Hardware Specification | Yes | using a single V100 GPU with 16GB of RAM |
| Software Dependencies | No | All our kernel-based experiments use the Neural Tangents library (Novak et al., 2020), built on top of JAX (Bradbury et al., 2018). |
| Experiment Setup | Yes (see the kernel sketch below) | In all KIP trainings, we used the Adam optimizer. All our labels are mean-centered 1-hot labels. We used learning rates 0.01 and 0.04 for the MNIST and CIFAR-10 datasets, respectively. When sampling target batches, we always do so in a class-balanced way. All datasets are preprocessed using channel-wise standardization (i.e. mean subtraction and division by standard-deviation). For neural (tangent) kernels, we always use weight and bias variance σ_w² = 2 and σ_b² = 10⁻⁴, respectively. For both neural kernels and neural networks, we always use ReLU activation. Convolutional layers all use a (3, 3) filter with stride 1 and same padding. We train KIP for 10-20k iterations and took 5 random subsets of images for initializations. |
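The Software Dependencies and Experiment Setup rows above pin down most of the kernel configuration. As a minimal sketch (not the authors' released code), such a kernel could be assembled with Neural Tangents on top of JAX roughly as follows. The (3, 3) convolutions with stride 1 and same padding, the ReLU activations, and the variances σ_w² = 2 and σ_b² = 10⁻⁴ come from the quoted excerpt; the depth of three convolutional blocks, the 128-channel width, and the 10-way readout are illustrative assumptions.

```python
# Sketch only: a convolutional NTK in Neural Tangents / JAX matching the quoted
# hyperparameters. Depth, channel width, and output dimension are assumptions.
from neural_tangents import stax

W_STD = 2.0 ** 0.5  # weight variance sigma_w^2 = 2    ->  W_std = sqrt(2)
B_STD = 1e-2        # bias variance   sigma_b^2 = 1e-4 ->  b_std = 1e-2

def conv_block(channels=128):
    # (3, 3) convolution, stride 1, SAME padding, followed by ReLU.
    return stax.serial(
        stax.Conv(channels, (3, 3), strides=(1, 1), padding='SAME',
                  W_std=W_STD, b_std=B_STD),
        stax.Relu(),
    )

# init_fn/apply_fn define the finite network; kernel_fn gives its NNGP/NTK.
init_fn, apply_fn, kernel_fn = stax.serial(
    conv_block(), conv_block(), conv_block(),
    stax.Flatten(),
    stax.Dense(10, W_std=W_STD, b_std=B_STD),
)
```

Since the kernel depends on depth and architecture, this should be read as a template for the quoted hyperparameters rather than the exact architectures evaluated in the paper.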
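The Pseudocode and Open Source Code rows point to Algorithm 1 (KIP) and the linked Colab notebook. The sketch below illustrates one KIP update consistent with that description: fit kernel ridge-regression on the learnable support set, measure squared error on a class-balanced target batch drawn from the training data, and step the support images with Adam. The ridge regularizer `reg`, the random-noise initialization, training images only (rather than jointly with labels), and the use of optax for Adam are assumptions made for illustration; the authors' Colab is the authoritative implementation.

```python
# Sketch only: one KIP training step, reusing kernel_fn from the sketch above.
import jax
import jax.numpy as jnp
import optax

def kip_loss(x_support, y_support, x_target, y_target, reg=1e-6):
    """Kernel ridge-regression loss of the support set evaluated on a target batch."""
    k_ss = kernel_fn(x_support, x_support, 'ntk')
    k_ts = kernel_fn(x_target, x_support, 'ntk')
    # Solve (K_ss + reg * I) alpha = y_support, then predict on the target batch.
    alpha = jnp.linalg.solve(k_ss + reg * jnp.eye(k_ss.shape[0]), y_support)
    preds = k_ts @ alpha
    return 0.5 * jnp.mean(jnp.sum((preds - y_target) ** 2, axis=-1))

# Learnable support set (shapes are illustrative; the paper initializes from
# random subsets of real training images rather than noise).
key = jax.random.PRNGKey(0)
x_support = jax.random.normal(key, (100, 32, 32, 3))
y_support = jnp.tile(jnp.eye(10), (10, 1)) - 0.1  # mean-centered one-hot labels

optimizer = optax.adam(learning_rate=4e-2)  # 0.04 is the quoted CIFAR-10 rate
opt_state = optimizer.init(x_support)

@jax.jit
def kip_step(x_support, opt_state, y_support, x_target, y_target):
    # Gradient of the ridge-regression loss with respect to the support images.
    loss, grads = jax.value_and_grad(kip_loss)(x_support, y_support,
                                               x_target, y_target)
    updates, opt_state = optimizer.update(grads, opt_state)
    return optax.apply_updates(x_support, updates), opt_state, loss
```

In a training loop, `kip_step` would be called for the quoted 10-20k iterations, resampling a class-balanced target batch from the full training set at each step.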