A Regularized Framework for Sparse and Structured Neural Attention

Authors: Vlad Niculae, Mathieu Blondel

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "To showcase their potential as a drop-in replacement for existing ones, we evaluate our attention mechanisms on three large-scale tasks: textual entailment, machine translation, and sentence summarization. Our attention mechanisms improve interpretability without sacrificing performance; notably, on textual entailment and summarization, we outperform the standard attention mechanisms based on softmax and sparsemax."
Researcher Affiliation | Collaboration | Vlad Niculae (Cornell University, Ithaca, NY; vlad@cs.cornell.edu) and Mathieu Blondel (NTT Communication Science Laboratories, Kyoto, Japan; mathieu@mblondel.org)
Pseudocode | No | The paper describes algorithms and derivations but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper states "We build on OpenNMT-py [24], based on PyTorch [37]" and "We employ the CPU implementation provided in [31]", but it neither links to nor announces a release of the authors' own implementation of the proposed mechanisms.
Open Datasets | Yes | "We use the Stanford Natural Language Inference (SNLI) dataset [8]... we use the standard DUC 2004 dataset ... and a randomly held-out subset of Gigaword, released by [39]."
Dataset Splits | No | The paper uses standard datasets and follows the methodology of prior work ([31], [39]), which implies predefined splits, but it never states explicit percentages or sample counts for the training/validation/test splits; for Gigaword, for example, it mentions only a "randomly held-out subset" without giving its size.
Hardware Specification | No | The paper mentions running OpenNMT-py on "GPU" and certain operations on "CPU", but gives no specific hardware details such as GPU/CPU models or memory configurations.
Software Dependencies | No | The paper mentions building on "OpenNMT-py [24], based on PyTorch [37]" but does not give version numbers for these software dependencies.
Experiment Setup | Yes | "To mitigate this effect, we set the tolerance of the solver's stopping criterion to 10^-2. While tuning λ may improve performance, we observe that λ = 0.1 (fusedmax) and λ = 0.01 (oscarmax) are sensible defaults that work well across all tasks and report all our results using them."
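
For readers checking the setup above against their own reimplementation: the paper shows that fusedmax can be computed by composing the proximal operator of the 1-d total-variation (fused lasso) penalty with a Euclidean projection onto the simplex, with λ controlling the fusion strength. The sketch below is a minimal NumPy illustration of that composition under those stated assumptions, not the authors' code; `prox_tv1d` here uses CVXPY as a slow but exact stand-in for the fast TV solver the paper takes from [31], and the names `fusedmax` and `projection_simplex` are chosen for illustration only.

import numpy as np
import cvxpy as cp  # slow, exact reference solver for the TV prox (not what the paper uses)

def prox_tv1d(z, lam):
    """Reference 1-d total-variation prox:
    argmin_y 0.5 * ||y - z||^2 + lam * sum_i |y[i+1] - y[i]|."""
    y = cp.Variable(len(z))
    objective = 0.5 * cp.sum_squares(y - z) + lam * cp.norm1(cp.diff(y))
    cp.Problem(cp.Minimize(objective)).solve()
    return y.value

def projection_simplex(v):
    """Euclidean projection onto the probability simplex
    (applied on its own, this is exactly sparsemax)."""
    u = np.sort(v)[::-1]
    cssv = np.cumsum(u) - 1.0
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > cssv)[0][-1]
    theta = cssv[rho] / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def fusedmax(z, lam=0.1):
    """Fusedmax = TV prox followed by simplex projection; lam=0.1 is the
    default the paper reports using across all tasks."""
    return projection_simplex(prox_tv1d(np.asarray(z, dtype=float), lam))

# Example: nearby scores get fused into equal attention weights.
print(fusedmax(np.array([1.0, 1.1, 3.0, 0.2])))

Oscarmax follows the same composition pattern, with the OSCAR proximal operator (and the paper's default λ = 0.01) in place of the TV prox.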