Non-Linear Similarity Learning for Compositionality

Authors: Masashi Tsubaki, Kevin Duh, Masashi Shimbo, Yuji Matsumoto

AAAI 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our method in the SemEval 2014 semantic relatedness task, which uses the Sentences Involving Compositional Knowledge (SICK) dataset (Marelli et al. 2014). The task is to predict the relatedness of two sentences... Models are evaluated by computing Pearson's r, Spearman's ρ, and Mean Squared Error (MSE) between the gold similarity scores and the scores predicted by the models. Table 1 shows r, ρ, and MSE for different composition and kernel models.
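The three evaluation metrics named above (Pearson's r, Spearman's ρ, MSE) can be sketched in a few lines. The score values below are hypothetical illustrations, not results from the paper, and the rank function here does not handle ties:

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Spearman correlation: Pearson's r computed on ranks (no tie handling)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r
    return pearson_r(ranks(x), ranks(y))

def mse(x, y):
    """Mean squared error between gold and predicted scores."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

gold = [4.5, 3.2, 1.0, 2.8]   # hypothetical gold relatedness scores (1-5 scale)
pred = [4.1, 3.5, 1.3, 2.6]   # hypothetical model predictions
print(pearson_r(gold, pred), spearman_rho(gold, pred), mse(gold, pred))
```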
Researcher Affiliation | Academia | Masashi Tsubaki, Kevin Duh*, Masashi Shimbo, Yuji Matsumoto. Nara Institute of Science and Technology; *Johns Hopkins University. {masashi-t,shimbo,matsu}@is.naist.jp, *kevinduh@cs.jhu.edu
Pseudocode | No | The paper describes its methods through mathematical formulations and textual explanations but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We evaluate our method in the SemEval 2014 semantic relatedness task, which uses the Sentences Involving Compositional Knowledge (SICK) dataset (Marelli et al. 2014). The dataset consists of 9927 sentence pairs in a 4500/500/4927 train/dev/test split. (footnote 3: http://alt.qcri.org/semeval2014/task1/)
Dataset Splits | Yes | The dataset consists of 9927 sentence pairs in a 4500/500/4927 train/dev/test split.
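As a quick sanity check, the reported split sizes do sum to the reported dataset total:

```python
# SICK split sizes as quoted in the report: 4500 train / 500 dev / 4927 test.
train, dev, test = 4500, 500, 4927
total = train + dev + test
assert total == 9927  # matches the reported 9927 sentence pairs
```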
Hardware Specification | No | The paper does not provide details of the hardware used to run the experiments, such as CPU or GPU models.
Software Dependencies | No | The paper mentions using 'Enju' for predicate-argument structure analysis and 'AdaGrad' for optimization, but it does not specify version numbers for these or any other software dependencies, so the software environment is not fully reproducible.
Experiment Setup | Yes | We used 50-dimensional word representations, with several different initializations: random initialization within (−0.1, 0.1), LSA, NLM, and GloVe. ... We initialized all weight matrices as W = I + ϵ, where ϵ is a matrix of small Gaussian noise, and parameters in kernels as c = 1.0 in polynomial and σ = 1.0 in RBF. Our models were trained using the adaptive gradient method AdaGrad (Duchi, Hazan, and Singer 2011) with learning rates α = 0.5 for the word representations, β = 10^−2 for the weight matrix, and γ = 10^−3 for parameters in kernels, with the regularization parameter λ = 10^−6. These hyperparameters (α, β, γ, λ) for our models were tuned on the development set.
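The AdaGrad setup quoted above (per-group learning rates α = 0.5, β = 10⁻², γ = 10⁻³, with L2 regularization λ = 10⁻⁶; exponents reconstructed from the garbled extraction) can be sketched as a minimal per-coordinate update. This is an illustration of the optimizer on toy values, not the authors' implementation:

```python
import math

def adagrad_step(param, grad, accum, lr, lam=1e-6, eps=1e-8):
    """One AdaGrad update on a flat list of parameters.

    accum holds the running sum of squared gradients per coordinate;
    an L2 penalty lam * param is folded into the gradient.
    """
    out = []
    for i, (p, g) in enumerate(zip(param, grad)):
        g = g + lam * p                       # L2 regularization term
        accum[i] += g * g                     # accumulate squared gradient
        out.append(p - lr * g / (math.sqrt(accum[i]) + eps))
    return out

# Example: one update to a toy word-vector group with alpha = 0.5.
w = [0.1, -0.2]
acc = [0.0, 0.0]
w = adagrad_step(w, [0.3, -0.1], acc, lr=0.5)
```

On the very first step the denominator equals the gradient magnitude, so each coordinate moves by roughly lr in the negative gradient direction; later steps shrink as squared gradients accumulate.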