The Recurrent Neural Tangent Kernel

Authors: Sina Alemohammad, Zichao Wang, Randall Balestriero, Richard Baraniuk

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on a synthetic task and 56 real-world data sets demonstrate that the RNTK offers significant performance gains over other kernels, including standard NTKs, across a wide array of data sets.
Researcher Affiliation | Academia | Sina Alemohammad, Zichao Wang, Randall Balestriero, Richard G. Baraniuk; Department of Electrical and Computer Engineering, Rice University; {sa86,zw16,rb42,richb}@rice.edu
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement or link confirming the release of open-source code for the described methodology.
Open Datasets | Yes | The UCR time series classification data repository (Dau et al., 2019). ... URL https://www.cs.ucr.edu/~eamonn/time_series_data_2018/ (a minimal loader for this archive is sketched after the table).
Dataset Splits | Yes | For training we used C-SVM in the LIBSVM library (Chang & Lin, 2011), and for hyperparameter selection we performed 10-fold validation, splitting the training data into a 90% training set and a 10% validation set (this selection protocol is sketched in code after the table).
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions using the LIBSVM library and the RMSProp algorithm but does not provide version numbers for these or other software dependencies.
Experiment Setup | Yes | For C-SVM we chose the cost function value C ∈ {0.01, 0.1, 1, 10, 100}, and for each kernel we used the following hyperparameter sets... RNTK: σw ∈ {1.34, 1.35, 1.36, 1.37, 1.38, 1.39, 1.40, 1.41, 1.42, 1.43, 1.44, 1.45, 1.46, 1.47, 2}, σu = 1, σb ∈ {0, 0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.7, 0.9, 1, 2}, σh ∈ {0, 0.01, 0.1, 0.5, 1}. Finite-width RNN settings: we used 3 different RNNs. ... All models are trained with the RMSProp algorithm for 200 epochs. Early stopping is implemented when the validation set accuracy does not improve for 5 consecutive epochs. ... number of layers, number of hidden units, and learning rate as L ∈ {1, 2}, n ∈ {50, 100, 200, 500}, η ∈ {0.01, 0.001, 0.0001, 0.00001} (the grid search and the RNN training loop are both sketched after the table).
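
To make the Open Datasets row concrete, below is a minimal loader for the UCR 2018 archive linked above. It assumes the archive's tab-separated format (each row is a class label followed by the series values); the data set name Coffee is an arbitrary example, not one singled out by the paper.

```python
import numpy as np

def load_ucr_split(path):
    """Load one split of a UCR 2018 archive data set.

    In the 2018 release each data set ships as tab-separated
    *_TRAIN.tsv / *_TEST.tsv files; each row is
    [label, x_1, ..., x_T].
    """
    data = np.loadtxt(path, delimiter="\t")
    return data[:, 1:], data[:, 0].astype(int)  # (series, labels)

# "Coffee" is a hypothetical example path; substitute any of the 56 sets used.
X_train, y_train = load_ucr_split("Coffee/Coffee_TRAIN.tsv")
X_test, y_test = load_ucr_split("Coffee/Coffee_TEST.tsv")
```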
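The Dataset Splits and Experiment Setup rows together describe a kernel-SVM selection protocol: a 90%/10% split of the training data and a grid search over C and the kernel hyperparameters. The sketch below re-creates that loop under stated assumptions: scikit-learn's SVC with a precomputed Gram matrix stands in for LIBSVM's C-SVM, a plain RBF kernel stands in for the RNTK recursion derived in the paper, the data is random filler, and the σw grid is thinned for brevity.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

def gram(X1, X2, sigma_w):
    """Stand-in Gram matrix (plain RBF). The real RNTK Gram matrix
    comes from the kernel recursion derived in the paper."""
    sq = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2.0 * X1 @ X2.T
    return np.exp(-sq / (2.0 * sigma_w**2))

rng = np.random.default_rng(0)
X, y = rng.normal(size=(60, 150)), rng.integers(0, 2, size=60)  # filler data

# 90% / 10% split of the training data for hyperparameter selection.
idx_tr, idx_val = train_test_split(np.arange(len(y)), test_size=0.1, random_state=0)

best, best_acc = None, -np.inf
for C in [0.01, 0.1, 1, 10, 100]:          # C grid quoted above
    for sigma_w in [1.34, 1.40, 1.47, 2]:  # thinned version of the sigma_w grid
        K_tr = gram(X[idx_tr], X[idx_tr], sigma_w)
        K_val = gram(X[idx_val], X[idx_tr], sigma_w)
        clf = SVC(C=C, kernel="precomputed").fit(K_tr, y[idx_tr])
        acc = clf.score(K_val, y[idx_val])
        if acc > best_acc:
            best, best_acc = (C, sigma_w), acc
print("selected (C, sigma_w):", best)
```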
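For the finite-width RNN baselines, the quoted settings (L ∈ {1, 2} layers, n hidden units, RMSProp, a 200-epoch cap, early stopping with patience 5 on validation accuracy) map naturally onto Keras. The sketch below is one such mapping, not the authors' code; the toy data, two-class output head, and cross-entropy loss are assumptions.

```python
import numpy as np
import tensorflow as tf

# Toy filler data: 100 univariate series of length 150, two classes.
X = np.random.randn(100, 150, 1).astype("float32")
y = np.random.randint(0, 2, size=100)

def make_rnn(num_layers, width, lr):
    """Elman-style RNN classifier: num_layers in {1, 2},
    width (hidden units) in {50, 100, 200, 500}."""
    layers = [tf.keras.layers.SimpleRNN(width, return_sequences=(i < num_layers - 1))
              for i in range(num_layers)]
    model = tf.keras.Sequential(layers + [tf.keras.layers.Dense(2)])
    model.compile(
        optimizer=tf.keras.optimizers.RMSprop(learning_rate=lr),  # eta grid: 1e-2 .. 1e-5
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"])
    return model

# 200-epoch cap; stop when validation accuracy stalls for 5 consecutive epochs.
stop = tf.keras.callbacks.EarlyStopping(monitor="val_accuracy", patience=5)
model = make_rnn(num_layers=2, width=100, lr=0.001)
model.fit(X, y, validation_split=0.1, epochs=200, callbacks=[stop], verbose=0)
```

In the paper's protocol this training call would sit inside the grid over (L, n, η) quoted above, with the best validation configuration retained.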