CASE: Context-Aware Semantic Expansion

Authors: Jialong Han, Aixin Sun, Haisong Zhang, Chenliang Li, Shuming Shi

AAAI 2020, pp. 7871-7878

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On a dataset of 1.8 million sentences thus derived, we propose a network architecture that encodes the context and seed term separately before suggesting alternative terms. Our experiments demonstrate that competitive results are achieved with appropriate choices of context encoder and attention scoring function.
Researcher Affiliation | Collaboration | Jialong Han (Amazon, USA), Aixin Sun (Nanyang Technological University, Singapore), Haisong Zhang (Tencent AI Lab, China), Chenliang Li (Wuhan University, China), Shuming Shi (Tencent AI Lab, China)
Pseudocode | No | The paper includes a network architecture diagram (Figure 2) but does not provide pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | The paper states 'Specifically, we employ an existing web-scale dataset, WebIsA (Seitner et al. 2016), to derive large-scale annotated sentences.'
Dataset Splits | No | The paper states 'From them, we sample 20% of sentences to form the test set, and use the remainder for training.' It does not explicitly define a separate validation split or its size/percentage.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, processor types, or memory used for running the experiments.
Software Dependencies | No | The paper mentions using 'cbow' for pre-training and the 'Adam optimizer', and links to TensorFlow's sampled softmax loss API documentation, but it does not provide version numbers for any software libraries or dependencies.
Experiment Setup | Yes | We trim or pad all contexts to length 100, and treat words occurring less than 5 times as OOVs. Word vectors are pretrained with cbow (Mikolov et al. 2013). Their dimension d, as well as that of the encoded contexts and seeds, is set to 100. The intermediate dimension of the attention-related network is set to 10. Each batch is of size 128, with 1,000 negative samples composing the sampled candidates. We iterate for 10 epochs with the Adam optimizer. (A hedged sketch of this architecture and training setup follows the table.)
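
The quoted setup, together with the abstract excerpt in the Research Type row, pins down the main dimensions of the model: a 100-dimensional context/seed representation and a 10-dimensional intermediate layer in the attention scorer. The following is a minimal, hypothetical TensorFlow sketch of such an architecture. The choice of a BiGRU context encoder, the additive attention form, the way context and seed vectors are combined, and the vocabulary size are assumptions made for illustration; they are not the authors' released code.

VOCAB_SIZE = 50_000      # assumed vocabulary size; words seen < 5 times become OOV
MAX_CONTEXT_LEN = 100    # contexts trimmed or padded to length 100
EMB_DIM = 100            # word / context / seed dimension d
ATTN_DIM = 10            # intermediate dimension of the attention scorer
NUM_NEG = 1_000          # sampled negatives per batch (used in the loss sketch below)
BATCH_SIZE = 128         # used when batching the training data

import tensorflow as tf

# Inputs: a context word sequence and a single seed-term id.
context_ids = tf.keras.Input(shape=(MAX_CONTEXT_LEN,), dtype=tf.int32)
seed_ids = tf.keras.Input(shape=(), dtype=tf.int32)

# Shared embedding table; pre-trained cbow vectors would be loaded here.
embed = tf.keras.layers.Embedding(VOCAB_SIZE, EMB_DIM)
ctx_emb = embed(context_ids)                          # (B, 100, 100)
seed_vec = embed(seed_ids)                            # (B, 100)

# Context encoder: the paper compares several options; a BiGRU is assumed here.
ctx_states = tf.keras.layers.Bidirectional(
    tf.keras.layers.GRU(EMB_DIM // 2, return_sequences=True))(ctx_emb)

# Additive attention scored against the seed, with a 10-dim hidden layer
# (padding masks are omitted for brevity).
seed_tiled = tf.keras.layers.RepeatVector(MAX_CONTEXT_LEN)(seed_vec)
hidden = tf.keras.layers.Dense(ATTN_DIM, activation="tanh")(
    tf.keras.layers.Concatenate()([ctx_states, seed_tiled]))
scores = tf.keras.layers.Dense(1)(hidden)             # (B, 100, 1)
attn = tf.keras.layers.Softmax(axis=1)(scores)

# Attention-weighted sum of context states, then the joint representation
# used to rank candidate expansion terms.
ctx_vec = tf.keras.layers.Flatten()(
    tf.keras.layers.Dot(axes=1)([attn, ctx_states]))  # (B, 100)
query = tf.keras.layers.Concatenate()([ctx_vec, seed_vec])   # (B, 200)
model = tf.keras.Model([context_ids, seed_ids], query)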
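The quoted training settings (sampled softmax loss as referenced in the Software Dependencies row, 1,000 negative samples, batch size 128, Adam, 10 epochs) can be sketched in the same hedged way. Only the hyperparameter values come from the paper; the output projection variables and the training step below continue the sketch above and are assumptions, not the authors' implementation.

# Output projection over the expansion vocabulary (assumed form).
out_weights = tf.Variable(tf.random.normal([VOCAB_SIZE, 2 * EMB_DIM], stddev=0.05))
out_biases = tf.Variable(tf.zeros([VOCAB_SIZE]))
optimizer = tf.keras.optimizers.Adam()

@tf.function
def train_step(batch_context, batch_seed, batch_target):
    """One step on a batch of (context ids, seed id, gold expansion-term id)."""
    with tf.GradientTape() as tape:
        query = model([batch_context, batch_seed], training=True)
        loss = tf.reduce_mean(tf.nn.sampled_softmax_loss(
            weights=out_weights, biases=out_biases,
            labels=tf.reshape(batch_target, [-1, 1]),   # (B, 1) int64 target ids
            inputs=query,
            num_sampled=NUM_NEG, num_classes=VOCAB_SIZE))
    variables = model.trainable_variables + [out_weights, out_biases]
    optimizer.apply_gradients(zip(tape.gradient(loss, variables), variables))
    return loss

Training would then iterate this step over batches of 128 examples for 10 epochs; at inference time the full softmax over out_weights would score candidate expansion terms.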