lpNTK: Better Generalisation with Less Data via Sample Interaction During Learning
Authors: Shangmin Guo, Yi Ren, Stefano V Albrecht, Kenny Smith
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We run our experiment using the ResNet-18 model (He et al., 2016) and a subset of the CIFAR-10 dataset (Krizhevsky et al., 2009): we select 4096 samples randomly from all the ten classes, giving a set of 40960 samples in total. On this dataset, we first track the learning difficulty of samples through a single training run of the model. Then, we randomly split it into 4096/X subsets where X ∈ {1, 4, 16, 256, 1024}, and train a model on each of these subsets. |
| Researcher Affiliation | Academia | University of Edinburgh, University of British Columbia |
| Pseudocode | Yes | Algorithm 1: Correlation between Learning Difficulty on Size N and Size X; Algorithm 2: Predict forgetting events with a variant of lpNTK κ; Algorithm 3: Farthest Point Clustering with lpNTK |
| Open Source Code | No | The paper does not contain any explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We run our experiment using the ResNet-18 model (He et al., 2016) and a subset of the CIFAR-10 dataset (Krizhevsky et al., 2009): we select 4096 samples randomly from all the ten classes, giving a set of 40960 samples in total. |
| Dataset Splits | No | The paper mentions using a 'validation set' to select the best parameters: '1) fit the model on a given benchmark, and select the parameters w which has the best performance on validation set'. However, it does not provide specific details on the size, proportion, or methodology of this validation split for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running experiments. |
| Software Dependencies | No | The paper mentions models like 'ResNet-18' and 'LeNet-5' and optimization algorithms like 'SGD' but does not specify version numbers for any software libraries or frameworks (e.g., PyTorch, TensorFlow, scikit-learn) used in the implementation. |
| Experiment Setup | Yes | In all runs, we use the same hyperparameters and train the network with the same batch size. ... this setting runs on MNIST with N = 4096, X ∈ {1, 4, 16, 64, 256, 1024}, learning rate is set to 0.1, and batch size is 128 ... this setting runs on MNIST with N = 4096, X ∈ {1, 4, 16, 64, 256, 1024}, learning rate is set to 0.001, and batch size is 256 |
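Since no source code is released, the pseudocode entries above are the only algorithmic record. As a reference point for Algorithm 3, a generic Gonzalez-style farthest point clustering over a kernel matrix can be sketched as follows; this is a minimal sketch, not the authors' implementation, and the `K` matrix standing in for pairwise lpNTK values (plus the induced distance `d(i,j)² = k(i,i) + k(j,j) − 2k(i,j)`) is an assumption.

```python
import numpy as np

def kernel_distance(K, i, j):
    # Distance induced by a PSD kernel: d(i,j)^2 = k(i,i) + k(j,j) - 2 k(i,j).
    # max(..., 0.0) guards against tiny negative values from round-off.
    return np.sqrt(max(K[i, i] + K[j, j] - 2.0 * K[i, j], 0.0))

def farthest_point_clustering(K, num_centres, first=0):
    """Gonzalez farthest-point traversal over a kernel matrix K.

    Greedily promotes the point farthest (in kernel-induced distance)
    from the current centres to be the next centre; returns the centre
    indices and each point's nearest-centre assignment.
    """
    n = K.shape[0]
    centres = [first]
    # Distance of every point to its nearest centre so far.
    d = np.array([kernel_distance(K, i, first) for i in range(n)])
    assign = np.zeros(n, dtype=int)
    for _ in range(num_centres - 1):
        nxt = int(np.argmax(d))  # farthest remaining point becomes a centre
        centres.append(nxt)
        for i in range(n):
            di = kernel_distance(K, i, nxt)
            if di < d[i]:        # new centre is closer: reassign point i
                d[i] = di
                assign[i] = len(centres) - 1
    return centres, assign

# Toy usage with a linear kernel K = X X^T (two well-separated pairs):
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
centres, assign = farthest_point_clustering(X @ X.T, num_centres=2)
```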