ACV-tree: A New Method for Sentence Similarity Modeling
Authors: Yuquan Le, Zhi-Jie Wang, Zhe Quan, Jiawei He, Bin Yao
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results, based on 19 widely-used datasets, demonstrate that our model is effective and competitive, compared against state-of-the-art models. |
| Researcher Affiliation | Academia | College of Computer Science and Electronic Engineering, Hunan University, Changsha, China; Guangdong Key Laboratory of Big Data Analysis and Processing, Sun Yat-Sen University, Guangzhou, China; Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China |
| Pseudocode | Yes | Algorithm 1 COMPSIM(T1,T2) |
| Open Source Code | Yes | Our codes are available at the open-source code repository (https://github.com/yuquanle/Sentence-similarity-modeling.git). |
| Open Datasets | Yes | Following prior works, we conduct experiments on 19 textual similarity datasets (http://ixa.si.ehu.eus/) that contain all the datasets from Semantic Textual Similarity (STS) tasks (2012-2015)... |
| Dataset Splits | No | Each dataset contains many pairs of sentences (e.g. MSRvid dataset contains 750 pairs of sentences). |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | In our experiments, we implement our ACV-tree by using the Stanford Parser [Manning et al., 2017] to generate the constituency tree of the sentence... |
| Experiment Setup | Yes | In our experiments, the hyperparameters µ=[0.1,0.2,...,0.9,1.0] and λ=[0.1,0.2,...,0.9,1.0], where the numbers in bold denote the default settings, unless otherwise stated. Following prior works [Arora et al., 2017; Wang et al., 2017], we use the term frequency-inverse document frequency (TF-IDF) scheme to generate the attention weights. The lexical vectors we use are the PARAGRAM-SL999 vectors, i.e., the 300-dimensional Paragram embeddings learned from PPDB and tuned on the SimLex999 dataset [Wieting et al., 2015]. |
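
The setup row above says attention weights come from a TF-IDF scheme applied over pretrained word embeddings. The paper excerpt does not spell out the exact computation, so the sketch below is one plausible reading, not the authors' code: each sentence is treated as a "document" for the IDF statistics, and a sentence vector is formed as the TF-IDF-weighted average of its word vectors (names like `tfidf_weights` and the toy embeddings are illustrative assumptions).

```python
import math
from collections import Counter


def tfidf_weights(sentences):
    """Per-sentence TF-IDF weights, treating each tokenized sentence
    as one document (an assumption; the paper does not specify the corpus unit).

    Returns one {token: weight} dict per sentence.
    """
    n = len(sentences)
    # Document frequency: in how many sentences does each token appear?
    df = Counter()
    for sent in sentences:
        df.update(set(sent))
    weights = []
    for sent in sentences:
        tf = Counter(sent)
        total = len(sent)
        weights.append({
            # Smoothed IDF to avoid division by zero / log of zero.
            tok: (count / total) * math.log((1 + n) / (1 + df[tok]))
            for tok, count in tf.items()
        })
    return weights


def weighted_sentence_vector(sent, weights, embeddings, dim=300):
    """TF-IDF-weighted average of word vectors (e.g. PARAGRAM-SL999,
    which are 300-dimensional; `dim` is configurable for toy examples)."""
    vec = [0.0] * dim
    norm = 0.0
    for tok in sent:
        w = weights.get(tok, 0.0)
        emb = embeddings.get(tok)
        if emb is None:  # out-of-vocabulary tokens are skipped
            continue
        vec = [v + w * e for v, e in zip(vec, emb)]
        norm += w
    if norm > 0:
        vec = [v / norm for v in vec]
    return vec
```

A token shared by every sentence gets IDF log(1) = 0 under this smoothing, so it contributes nothing to the weighted average; rarer, more discriminative words dominate the sentence vector, which is the usual motivation for TF-IDF attention.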