Knowledge-Adaptation Priors
Authors: Mohammad Emtiyaz Khan, Siddharth Swaroop
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results show that adaptation with K-priors achieves performance similar to full retraining, but only requires training on a handful of past examples. (Abstract) |
| Researcher Affiliation | Academia | Mohammad Emtiyaz Khan (RIKEN Center for AI Project, Tokyo, Japan; emtiyaz.khan@riken.jp); Siddharth Swaroop (University of Cambridge, Cambridge, UK; ss2163@cam.ac.uk) |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. Methods are described in prose and mathematical equations. |
| Open Source Code | Yes | Code is available at https://github.com/team-approx-bayes/kpriors. (Section 1) |
| Open Datasets | Yes | Binary classification on USPS digits (Figure 1 caption); logistic regression on the UCI Adult dataset (Section 5); logistic regression on the USPS odd-vs-even dataset (Section 5); neural networks on MNIST and CIFAR-10: "... for 10-way classification on MNIST [32] with MLPs and 10-way classification on CIFAR-10 with CifarNet [62]" (Section 5) |
| Dataset Splits | Yes | Validation acc (%) (axis labels in Figures 1, 2, and 3); For the Add Data task, the base model uses 9% of the data and we add 1% new data. (Section 5) |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions software components such as the L-BFGS and Adam optimizers and the MLP and CifarNet architectures, but it does not specify version numbers for these or for other relevant software libraries (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | For training, we use the L-BFGS optimizer for logistic regression with polynomial basis. (Section 5); For the Change Regularizer task, we change the L2 regularizer from δ = 50 to 5 (Section 5); the Change Architecture task compresses the architecture from a 2-hidden-layer MLP (100 units per layer) to a 1-hidden-layer MLP with 100 units. (Section 5) |