Knowledge-Adaptation Priors

Authors: Mohammad Emtiyaz Khan, Siddharth Swaroop

Venue: NeurIPS 2021

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Empirical results show that adaptation with K-priors achieves performance similar to full retraining, but only requires training on a handful of past examples." (Abstract) See the objective sketch below the table. |
| Researcher Affiliation | Academia | Mohammad Emtiyaz Khan, RIKEN Center for AI Project, Tokyo, Japan (emtiyaz.khan@riken.jp); Siddharth Swaroop, University of Cambridge, Cambridge, UK (ss2163@cam.ac.uk) |
| Pseudocode | No | The paper contains no pseudocode or algorithm blocks; methods are described in prose and mathematical equations. |
| Open Source Code | Yes | "Code is available at https://github.com/team-approx-bayes/kpriors." (Section 1) |
| Open Datasets | Yes | Binary classification on USPS digits (Figure 1 caption); logistic regression on the UCI Adult dataset (Section 5); logistic regression on the USPS odd-vs-even dataset (Section 5); "10-way classification on MNIST [32] with MLPs and 10-way classification on CIFAR-10 with CifarNet [62]" (Section 5). |
| Dataset Splits | Yes | "Validation acc (%)" (axis labels in Figures 1, 2, and 3); "For the Add Data task, the base model uses 9% of the data and we add 1% new data." (Section 5) See the split sketch below the table. |
| Hardware Specification | No | The paper does not specify the hardware used to run the experiments (e.g., GPU models, CPU types, memory). |
| Software Dependencies | No | The paper mentions software components such as the L-BFGS optimizer, the Adam optimizer, MLPs, and CifarNet, but gives no version numbers for these or for other relevant libraries (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | "For training, we use the L-BFGS optimizer for logistic regression with polynomial basis." (Section 5); "For the Change Regularizer task, we change the L2 regularizer from δ = 50 to 5" (Section 5); the Change Architecture task compresses a 2-hidden-layer MLP (100 units per layer) into a 1-hidden-layer MLP with 100 units (Section 5). See the training sketch below the table. |
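
To make the adaptation claim in the Research Type row concrete, here is a minimal sketch of a K-prior-style objective, assuming the functional-regularization form the abstract describes: train on the new data while a divergence term pulls the adapted model's predictions toward the old model's on a small memory of past inputs, plus a weight-space L2 term. The function names, the weighting `tau`, and the default `delta` are illustrative assumptions, not the authors' implementation; see the linked repository for the official code.

```python
# Hypothetical sketch of a K-prior-style adaptation loss. This is NOT the
# authors' code (see https://github.com/team-approx-bayes/kpriors); names,
# tau, and delta defaults are illustrative assumptions.
import torch
import torch.nn.functional as F

def kprior_loss(model, old_model, x_new, y_new, x_past, delta=5.0, tau=1.0):
    # 1) Standard cross-entropy on the newly added labelled data.
    loss_new = F.cross_entropy(model(x_new), y_new)

    # 2) Function-space term: match the *old* model's soft predictions on a
    #    small memory of past inputs (past labels are not required).
    with torch.no_grad():
        old_probs = F.softmax(old_model(x_past), dim=-1)
    log_probs = F.log_softmax(model(x_past), dim=-1)
    loss_past = -(old_probs * log_probs).sum(dim=-1).mean()

    # 3) Weight-space term playing the role of the L2 regularizer (delta).
    l2 = sum((p ** 2).sum() for p in model.parameters())

    return loss_new + tau * loss_past + 0.5 * delta * l2
```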
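The Dataset Splits row quotes the Add Data percentages (9% base, 1% added) but not how examples are selected; the random permutation below is therefore an assumption, and the helper name is hypothetical.

```python
# Hypothetical sketch of the Add Data split (9% base / 1% added, Section 5);
# selecting via a random permutation is an assumption.
import torch

def add_data_split(n_examples, seed=0):
    g = torch.Generator().manual_seed(seed)
    perm = torch.randperm(n_examples, generator=g)
    n_base = int(0.09 * n_examples)   # data seen by the base model
    n_add = int(0.01 * n_examples)    # data added during adaptation
    return perm[:n_base], perm[n_base:n_base + n_add]
```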
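Finally, the Experiment Setup row can be illustrated with a sketch of the Change Regularizer task: fit an L2-regularized logistic regression on polynomial features with L-BFGS at δ = 50, then refit at δ = 5. The exact polynomial basis and the helper names are assumptions; only the optimizer choice and the δ values come from the table above.

```python
# Hypothetical sketch of the Change Regularizer setup (Section 5): logistic
# regression on a polynomial basis, trained with L-BFGS, with the L2 strength
# changed from delta = 50 to delta = 5. Basis and helpers are assumptions.
import torch
import torch.nn.functional as F

def poly_features(x, degree=3):
    # 1-D polynomial basis [x, x^2, ..., x^degree] for x of shape (n, 1);
    # the paper's exact basis is not specified in the table above.
    return torch.cat([x ** d for d in range(1, degree + 1)], dim=-1)

def fit_logreg_lbfgs(phi, y, delta, max_iter=100):
    # phi: features of shape (n, d); y: float targets in {0, 1} of shape (n,).
    w = torch.zeros(phi.shape[1], requires_grad=True)
    opt = torch.optim.LBFGS([w], max_iter=max_iter)

    def closure():
        opt.zero_grad()
        loss = F.binary_cross_entropy_with_logits(phi @ w, y)
        loss = loss + 0.5 * delta * (w ** 2).sum()  # L2 regularizer (delta)
        loss.backward()
        return loss

    opt.step(closure)
    return w.detach()

# Base model at delta = 50, then the changed regularizer at delta = 5:
# w_base = fit_logreg_lbfgs(phi, y, delta=50.0)
# w_new  = fit_logreg_lbfgs(phi, y, delta=5.0)
```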