Conceptualized and Contextualized Gaussian Embedding
Authors: Chen Qian, Fuli Feng, Lijie Wen, Tat-Seng Chua
AAAI 2021, pp. 13683–13691 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on intrinsic and extrinsic tasks demonstrate the effectiveness of the proposed approach, achieving state-of-the-art performance with nearly 5.00% relative improvement. We conduct extensive experiments on two intrinsic tasks (word similarity and word entailment) and three types of extrinsic tasks (single sentence tagging, single sentence classification and sentence pair classification). The results show that our approach consistently outperforms state-of-the-art methods, which validates the effectiveness of the learned conceptualized and contextualized Gaussian representations. |
| Researcher Affiliation | Academia | Chen Qian¹, Fuli Feng², Lijie Wen¹, Tat-Seng Chua². ¹ School of Software, Tsinghua University, Beijing, China; ² School of Computing, National University of Singapore, Singapore. qc16@mails.tsinghua.edu.cn, fulifeng93@gmail.com, wenlj@tsinghua.edu.cn, chuats@comp.nus.edu.sg |
| Pseudocode | No | The paper describes its models and methods textually and mathematically but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository for the described methodology. |
| Open Datasets | Yes | We train on a concatenation of two English datasets: UKWAC and Wackypedia (Baroni et al. 2009), which consists of 3.3 billion tokens. For word similarity (SIM), we evaluate on multiple standard word similarity datasets: MC (Miller and Charles 1991), MEN (Bruni, Tran, and Baroni 2014), RG (Rubenstein and Goodenough 1965), RW (Luong, Socher, and Manning 2013), SL (Hill, Reichart, and Korhonen 2015), YP (Yang and Powers 2006) and SCWS (Huang et al. 2012). For word entailment (ENT), we evaluate on the standard word entailment dataset (SED) (Baroni et al. 2012). |
| Dataset Splits | No | The paper mentions using standard datasets like CoNLL-2003, WeBis, and RTE-5 for evaluation, which typically have predefined splits. However, it does not explicitly state the training, validation, and test split percentages or sample counts used in its experiments within the text. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions using 'BiLSTMs' and the 'Adagrad optimizer' but does not specify any software libraries or frameworks with version numbers (e.g., PyTorch 1.x, TensorFlow 2.x, Python 3.x). |
| Experiment Setup | Yes | The fixed hyperparameters include an embedding dimension D=300, a margin m=1, the number of BiLSTM layers in GIANT L=2 and a batch size of 128. We also experiment with a linearly decreasing weight α from 1.0 to 0.9 and the Adagrad optimizer with a dynamic learning rate from 0.05 to 0.00001. Additionally, following Athiwaratkun and Wilson (2017), we use diagonal covariances to reduce computation complexity from O(D³) to O(D). |
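
Since no source code is released, the following is a minimal sketch (using NumPy) of the reported hyperparameter configuration and of why diagonal covariances cut the Gaussian-similarity cost from O(D³) to O(D). All names (`CONFIG`, `diag_gaussian_kl`) are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the reported setup; not the authors' released code.
import numpy as np

# Fixed hyperparameters as reported in the paper.
CONFIG = {
    "embedding_dim": 300,   # D
    "margin": 1.0,          # m
    "bilstm_layers": 2,     # L, BiLSTM layers in GIANT
    "batch_size": 128,
    "alpha_start": 1.0,     # linearly decreasing weight alpha
    "alpha_end": 0.9,
    "lr_start": 0.05,       # Adagrad with a dynamic learning rate
    "lr_end": 1e-5,
}

def diag_gaussian_kl(mu_p, var_p, mu_q, var_q):
    """KL(p || q) for two Gaussians with diagonal covariances.

    With diagonal covariances, the determinant, inverse and trace reduce to
    element-wise operations over the D variance entries, so the cost is O(D)
    rather than the O(D^3) required for full covariance matrices.
    """
    return 0.5 * np.sum(
        np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0
    )

if __name__ == "__main__":
    D = CONFIG["embedding_dim"]
    rng = np.random.default_rng(0)
    mu_p, mu_q = rng.normal(size=D), rng.normal(size=D)
    var_p, var_q = rng.uniform(0.5, 1.5, size=D), rng.uniform(0.5, 1.5, size=D)
    kl = diag_gaussian_kl(mu_p, var_p, mu_q, var_q)
    print(f"KL between two diagonal Gaussians (D={D}): {kl:.4f}")
```

The KL divergence here is only one common asymmetric similarity used for Gaussian word embeddings; the point of the sketch is the O(D) element-wise form that diagonal covariances make possible, matching the complexity reduction the paper reports.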