Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Kernel Multimodal Continuous Attention
Authors: Alexander Moreno, Zhenke Wu, Supriya Nagesh, Walter Dempsey, James M. Rehg
NeurIPS 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that kernel continuous attention often outperforms unimodal continuous attention, and the sparse variant tends to highlight time series peaks. |
| Researcher Affiliation | Collaboration | Alexander Moreno (Luminous Computing); Zhenke Wu (University of Michigan); Supriya Nagesh (Georgia Tech); Walter Dempsey (University of Michigan); James M. Rehg (Georgia Tech) |
| Pseudocode | Yes | Algorithm 1 Continuous Attention Mechanism via Kernel Deformed Exponential Families |
| Open Source Code | Yes | Code is in our repository³, where we discuss the flags used to control precision on recent GPUs and Pytorch versions. ³ https://github.com/onenoc/kernel-continuous-attention |
| Open Datasets | Yes | We analyze uWave [19]: accelerometer time series with eight gesture classes. We follow [16]'s split into 3,582 training observations and 896 test observations: sequences have length 945. ... We extend [23]'s code⁵ for IMDB sentiment classification [20]. This uses a document representation v from a convolutional network and an LSTM attention model. ... ⁵ [23]'s repository for this dataset is https://github.com/deep-spin/quati |
| Dataset Splits | No | For uWave, the paper states 'We follow [16]'s split into 3,582 training observations and 896 test observations'. It specifies training and test splits, but no explicit validation split is mentioned with specific counts or percentages. |
| Hardware Specification | Yes | Our UWave and ECG experiments were done on a Titan X GPU, IMDB on a 1080, and FordA on an A40. |
| Software Dependencies | Yes | As an example, Figure 8 was done on a Titan X with an older version of Pytorch, while Figure 7 was done with an A40 with Pytorch 1.12. |
| Experiment Setup | Yes | For uWave: 'All methods use 100 attention heads. Gaussian mixture uses 100 components (and thus 300 parameters per head), and kernel methods use 256 inducing points.' For IMDB: 'N: basis functions, I = 10 inducing points, bandwidth 0.01.' |
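
The figures quoted in the table can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the paper's repository: the helper names are hypothetical, and the count of three parameters per mixture component (mixing weight, mean, variance) is an assumption that happens to be consistent with the quoted "100 components (and thus 300 parameters per head)".

```python
# Figures quoted in the table above.
UWAVE_TRAIN = 3582
UWAVE_TEST = 896
GMM_COMPONENTS = 100

# Assumption (not stated in the table): each 1-D Gaussian mixture component
# carries a mixing weight, a mean, and a variance.
PARAMS_PER_COMPONENT = 3


def split_fractions(n_train: int, n_test: int) -> tuple[float, float]:
    """Return the (train, test) fractions of the combined observation count."""
    total = n_train + n_test
    return n_train / total, n_test / total


def mixture_params(n_components: int, per_component: int = PARAMS_PER_COMPONENT) -> int:
    """Return the number of parameters per head for a Gaussian mixture."""
    return n_components * per_component


if __name__ == "__main__":
    train_frac, test_frac = split_fractions(UWAVE_TRAIN, UWAVE_TEST)
    print(f"uWave train/test fractions: {train_frac:.3f} / {test_frac:.3f}")
    print(f"Mixture parameters per head: {mixture_params(GMM_COMPONENTS)}")
```

Under these assumptions the uWave split works out to roughly 80% training and 20% test, with no observations left over for a separately reported validation set, which matches the "No" result for Dataset Splits.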