Implicit Kernel Attention

Authors: Kyungwoo Song, Yohan Jung, Dongjun Kim, Il-Chul Moon

AAAI 2021, pp. 9713-9721

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our generalized attention shows better performance on classification, translation, and regression tasks."
Researcher Affiliation | Academia | Kyungwoo Song (1), Yohan Jung (2), Dongjun Kim (2), Il-Chul Moon* (2); (1) Department of AI, University of Seoul; (2) Department of Industrial and Systems Engineering (ISysE), Korea Advanced Institute of Science and Technology (KAIST); kyungwoo.song@uos.ac.kr, {becre1776,dongjoun57,icmoon}@kaist.ac.kr
Pseudocode | No | The paper does not contain any sections or figures explicitly labeled 'Pseudocode' or 'Algorithm'.
Open Source Code | No | The paper does not provide any explicit statement about releasing source code for the methodology described, nor does it include a link to a code repository or mention code in supplementary materials.
Open Datasets | Yes | "We compare our models with the scaled dot-product attention in Transformer (Vaswani et al. 2017), and RBF-only (Tsai et al. 2019) on five popular datasets (Kim 2014; Wang et al. 2020)."
Dataset Splits | Yes | "We perform ten-fold cross validations by following the experimental settings (Wang et al. 2020)."
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory amounts used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library names like PyTorch or TensorFlow, along with their versions) needed to replicate the experiments.
Experiment Setup | No | The paper mentions following experimental settings from other papers (e.g., 'by following the experimental settings (Wang et al. 2020)') but does not explicitly provide concrete hyperparameter values, training configurations, or system-level settings within its main text.
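For context on the baselines named in the Open Datasets row above, the following is a minimal NumPy sketch contrasting scaled dot-product attention (Vaswani et al. 2017) with an RBF-kernel attention in the spirit of the RBF-only baseline (Tsai et al. 2019). It is not the paper's implicit kernel attention or released code; the single-head setting, array shapes, and the `lengthscale` parameter are illustrative assumptions only.

```python
# Hedged sketch: two attention variants compared in the paper's experiments.
# Assumes NumPy only; shapes and the single-head setting are illustrative.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Standard Transformer attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)             # (n_q, n_k) dot-product similarities
    return softmax(scores, axis=-1) @ V

def rbf_kernel_attention(Q, K, V, lengthscale=1.0):
    """Attention weights from an RBF kernel k(q, k) = exp(-||q - k||^2 / (2 l^2))."""
    sq_dists = ((Q[:, None, :] - K[None, :, :]) ** 2).sum(-1)   # (n_q, n_k)
    weights = np.exp(-sq_dists / (2.0 * lengthscale ** 2))
    weights = weights / weights.sum(axis=-1, keepdims=True)     # row-normalize
    return weights @ V

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Q, K, V = rng.normal(size=(4, 8)), rng.normal(size=(6, 8)), rng.normal(size=(6, 8))
    print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)
    print(rbf_kernel_attention(Q, K, V).shape)          # (4, 8)
```

The sketch only illustrates what is being compared: the dot-product variant scores queries against keys by an inner product, while the RBF variant replaces that similarity with a distance-based kernel; the paper's contribution generalizes this kernel view rather than fixing either choice.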