An Infinite-Width Analysis on the Jacobian-Regularised Training of a Neural Network

Authors: Taeyoung Kim, Hongseok Yang

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We experimentally show the relevance of our theoretical claims to wide finite networks, and empirically analyse the properties of kernel regression solution to obtain an insight into Jacobian regularisation."
Researcher Affiliation | Academia | "Taeyoung Kim¹, Hongseok Yang¹ ... ¹School of Computing, KAIST, Daejeon, South Korea. Correspondence to: Hongseok Yang <hongseok.yang@kaist.ac.kr>."
Pseudocode | No | No structured pseudocode or algorithm blocks with explicit labels like 'Algorithm' or 'Pseudocode' were found in the paper.
Open Source Code | No | The paper does not provide any explicit statement about releasing source code for the described methodology, nor does it include links to a code repository or mention code in supplementary materials.
Open Datasets | Yes | "For the third result regarding the evolution of a JNTK during robust training (Theorem 4.5), we used the Algerian forest fire dataset (Abid and Izeboudjen, 2020) from the UCI Machine Learning repository... We include two additional datasets in the appendix, the banknote authentication dataset (Lohweg, 2013)... and the connectionist bench dataset (Sejnowski and Gorman, 2017)."
Dataset Splits | No | The paper uses datasets such as a 'simple synthetic dataset in R^4 of size 256' and the 'Algerian forest fire dataset... which contains 224 data points', but it does not specify explicit training, validation, or test splits (e.g., percentages, sample counts, or predefined splits with citations).
Hardware Specification | Yes | "For our experiments, we utilised 3 NVIDIA RTX 6000 GPUs to validate the convergence of the GP over three days. Additionally, we employed 3 NVIDIA RTX A5000 GPUs to verify the constancy of finite JNTK during training for one day. All other experiments were performed on CPUs, each completed in under an hour."
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library names like PyTorch or TensorFlow with their respective versions) that would be needed to reproduce the experiments.
Experiment Setup | Yes | "In all of these tests, we set κ to 0.1... We set the values of λ to 0.01, κ to 0.1, and the learning rate to 1. These specific values were selected to ensure that the test accuracy approached 100% by the end of 2048 epochs."
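To make the Experiment Setup row concrete, here is a minimal JAX sketch of Jacobian-regularised (robust) training under stated assumptions; it is not the authors' code. We assume λ = 0.01 weights a squared input-Jacobian penalty, read κ = 0.1 as an output-scaling factor (our interpretation, not confirmed by the excerpt), and use full-batch gradient descent with learning rate 1 for 2048 epochs on a placeholder synthetic dataset in R^4 of size 256, echoing the Dataset Splits row. The network width, labels, and all names (init_params, forward, loss, step) are illustrative.

```python
import jax
import jax.numpy as jnp

def init_params(key, dims=(4, 512, 1)):
    """Placeholder MLP weights with standard Gaussian init (width 512 assumed)."""
    keys = jax.random.split(key, len(dims) - 1)
    return [jax.random.normal(k, (d_in, d_out))
            for k, d_in, d_out in zip(keys, dims[:-1], dims[1:])]

def forward(params, x, kappa=0.1):
    """ReLU MLP with 1/sqrt(width) scaling; kappa read as output scaling (assumption)."""
    h = x
    for W in params[:-1]:
        h = jax.nn.relu(h @ W / jnp.sqrt(W.shape[0]))
    out = h @ params[-1] / jnp.sqrt(params[-1].shape[0])
    return kappa * out.squeeze(-1)

def loss(params, X, y, lam=0.01):
    """Squared loss plus lam-weighted squared input-Jacobian norm."""
    preds = forward(params, X)
    mse = 0.5 * jnp.mean((preds - y) ** 2)
    per_x = lambda x: forward(params, x[None]).squeeze()
    jac = jax.vmap(jax.grad(per_x))(X)  # (n, 4) input Jacobian, one row per example
    return mse + 0.5 * lam * jnp.mean(jnp.sum(jac ** 2, axis=-1))

@jax.jit
def step(params, X, y, lr=1.0):
    """One full-batch gradient-descent step with learning rate 1, as in the quote."""
    grads = jax.grad(loss)(params, X, y)
    return [W - lr * g for W, g in zip(params, grads)]

kx, kp = jax.random.split(jax.random.PRNGKey(0))
X = jax.random.normal(kx, (256, 4))  # synthetic dataset in R^4 of size 256
y = jnp.sign(X[:, 0])                # placeholder binary labels in {-1, +1}
params = init_params(kp)
for _ in range(2048):                # 2048 epochs, as in the paper's setup
    params = step(params, X, y)
print("final training loss:", loss(params, X, y))
```

The 1/√width factors mimic the NTK-style parameterisation typical of infinite-width analyses; whether this matches the paper's exact parameterisation is an assumption here. Swapping in the UCI datasets from the Open Datasets row would change only the data-construction lines.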