Non-Linear Similarity Learning for Compositionality

Authors: Masashi Tsubaki, Kevin Duh, Masashi Shimbo, Yuji Matsumoto

AAAI 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our method in the SemEval 2014 semantic relatedness task, which uses the Sentences Involving Compositional Knowledge (SICK) dataset (Marelli et al. 2014). The task is to predict the relatedness of two sentences... Models are evaluated by computing Pearson's r, Spearman's ρ, and Mean Squared Error (MSE) between the gold similarity scores and the scores predicted by the models. Table 1 shows r, ρ, and MSE for different composition and kernel models.
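The three evaluation metrics named above (Pearson's r, Spearman's ρ, MSE) can be sketched in a few lines. The score values below are hypothetical illustrations, not results from the paper, and the rank function here does not handle ties:

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Spearman correlation: Pearson's r computed on ranks (no tie handling)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r
    return pearson_r(ranks(x), ranks(y))

def mse(x, y):
    """Mean squared error between gold and predicted scores."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

gold = [4.5, 3.2, 1.0, 2.8]   # hypothetical gold relatedness scores (1-5 scale)
pred = [4.1, 3.5, 1.3, 2.6]   # hypothetical model predictions
print(pearson_r(gold, pred), spearman_rho(gold, pred), mse(gold, pred))
```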
Researcher Affiliation | Academia | Masashi Tsubaki, Kevin Duh*, Masashi Shimbo, Yuji Matsumoto. Nara Institute of Science and Technology; *Johns Hopkins University. {masashi-t,shimbo,matsu}@is.naist.jp, *kevinduh@cs.jhu.edu
Pseudocode | No | The paper describes its methods through mathematical formulations and textual explanations but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We evaluate our method in the SemEval 2014 semantic relatedness task, which uses the Sentences Involving Compositional Knowledge (SICK) dataset (Marelli et al. 2014). The dataset consists of 9927 sentence pairs in a 4500/500/4927 train/dev/test split. (footnote 3: http://alt.qcri.org/semeval2014/task1/)
Dataset Splits | Yes | The dataset consists of 9927 sentence pairs in a 4500/500/4927 train/dev/test split.
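As a quick sanity check, the reported split sizes do sum to the reported dataset total:

```python
# SICK split sizes as quoted in the report: 4500 train / 500 dev / 4927 test.
train, dev, test = 4500, 500, 4927
total = train + dev + test
assert total == 9927  # matches the reported 9927 sentence pairs
```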
Hardware Specification | No | The paper does not provide details of the hardware used to run the experiments, such as CPU or GPU models.
Software Dependencies | No | The paper mentions using 'Enju' for predicate-argument structure analysis and 'AdaGrad' for optimization, but it does not specify version numbers for these or any other software dependencies, so the software environment is not fully reproducible.
Experiment Setup | Yes | We used 50-dimensional word representations, with several different initializations: random initialization within (−0.1, 0.1), LSA, NLM, and GloVe. ... We initialized all weight matrices as W = I + ϵ, where ϵ is a matrix of small Gaussian noise, and parameters in kernels as c = 1.0 in polynomial and σ = 1.0 in RBF. Our models were trained using the adaptive gradient method AdaGrad (Duchi, Hazan, and Singer 2011) with learning rates α = 0.5 for the word representations, β = 10^−2 for the weight matrix, and γ = 10^−3 for parameters in kernels, with the regularization parameter λ = 10^−6. These hyperparameters (α, β, γ, λ) for our models were tuned on the development set.
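The AdaGrad setup quoted above (per-group learning rates α = 0.5, β = 10⁻², γ = 10⁻³, with L2 regularization λ = 10⁻⁶; exponents reconstructed from the garbled extraction) can be sketched as a minimal per-coordinate update. This is an illustration of the optimizer on toy values, not the authors' implementation:

```python
import math

def adagrad_step(param, grad, accum, lr, lam=1e-6, eps=1e-8):
    """One AdaGrad update on a flat list of parameters.

    accum holds the running sum of squared gradients per coordinate;
    an L2 penalty lam * param is folded into the gradient.
    """
    out = []
    for i, (p, g) in enumerate(zip(param, grad)):
        g = g + lam * p                       # L2 regularization term
        accum[i] += g * g                     # accumulate squared gradient
        out.append(p - lr * g / (math.sqrt(accum[i]) + eps))
    return out

# Example: one update to a toy word-vector group with alpha = 0.5.
w = [0.1, -0.2]
acc = [0.0, 0.0]
w = adagrad_step(w, [0.3, -0.1], acc, lr=0.5)
```

On the very first step the denominator equals the gradient magnitude, so each coordinate moves by roughly lr in the negative gradient direction; later steps shrink as squared gradients accumulate.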