CASE: Context-Aware Semantic Expansion

Authors: Jialong Han, Aixin Sun, Haisong Zhang, Chenliang Li, Shuming Shi

AAAI 2020, pp. 7871-7878

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On a dataset of 1.8 million sentences thus derived, we propose a network architecture that encodes the context and seed term separately before suggesting alternative terms. Our experiments demonstrate that competitive results are achieved with appropriate choices of context encoder and attention scoring function.
Researcher Affiliation | Collaboration | Jialong Han (Amazon, USA), Aixin Sun (Nanyang Technological University, Singapore), Haisong Zhang (Tencent AI Lab, China), Chenliang Li (Wuhan University, China), Shuming Shi (Tencent AI Lab, China)
Pseudocode | No | The paper includes a network architecture diagram (Figure 2) but does not provide pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | The paper states 'Specifically, we employ an existing web-scale dataset, WebIsA (Seitner et al. 2016), to derive large-scale annotated sentences.'
Dataset Splits | No | The paper states 'From them, we sample 20% of sentences to form the test set, and use the remainder for training.' It does not explicitly define a separate validation split or its size/percentage.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, processor types, or memory used for running the experiments.
Software Dependencies | No | The paper mentions using 'cbow' for pre-training and the 'Adam optimizer', and links to TensorFlow's sampled softmax loss API documentation, but it does not provide version numbers for any software libraries or dependencies.
Experiment Setup | Yes | We trim or pad all contexts to length 100, and treat words occurring less than 5 times as OOVs. Word vectors are pretrained with cbow (Mikolov et al. 2013). Their dimension d, as well as that of the encoded contexts and seeds, is set to 100. The intermediate dimension of the attention-related network is set to 10. Each batch is of size 128, with 1,000 negative samples composing the sampled candidates. We iterate for 10 epochs with the Adam optimizer. (A hedged sketch of this architecture and training setup follows the table.)
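
The quoted setup, together with the abstract excerpt in the Research Type row, pins down the main dimensions of the model: a 100-dimensional context/seed representation and a 10-dimensional intermediate layer in the attention scorer. The following is a minimal, hypothetical TensorFlow sketch of such an architecture. The choice of a BiGRU context encoder, the additive attention form, the way context and seed vectors are combined, and the vocabulary size are assumptions made for illustration; they are not the authors' released code.

VOCAB_SIZE = 50_000      # assumed vocabulary size; words seen < 5 times become OOV
MAX_CONTEXT_LEN = 100    # contexts trimmed or padded to length 100
EMB_DIM = 100            # word / context / seed dimension d
ATTN_DIM = 10            # intermediate dimension of the attention scorer
NUM_NEG = 1_000          # sampled negatives per batch (used in the loss sketch below)
BATCH_SIZE = 128         # used when batching the training data

import tensorflow as tf

# Inputs: a context word sequence and a single seed-term id.
context_ids = tf.keras.Input(shape=(MAX_CONTEXT_LEN,), dtype=tf.int32)
seed_ids = tf.keras.Input(shape=(), dtype=tf.int32)

# Shared embedding table; pre-trained cbow vectors would be loaded here.
embed = tf.keras.layers.Embedding(VOCAB_SIZE, EMB_DIM)
ctx_emb = embed(context_ids)                          # (B, 100, 100)
seed_vec = embed(seed_ids)                            # (B, 100)

# Context encoder: the paper compares several options; a BiGRU is assumed here.
ctx_states = tf.keras.layers.Bidirectional(
    tf.keras.layers.GRU(EMB_DIM // 2, return_sequences=True))(ctx_emb)

# Additive attention scored against the seed, with a 10-dim hidden layer
# (padding masks are omitted for brevity).
seed_tiled = tf.keras.layers.RepeatVector(MAX_CONTEXT_LEN)(seed_vec)
hidden = tf.keras.layers.Dense(ATTN_DIM, activation="tanh")(
    tf.keras.layers.Concatenate()([ctx_states, seed_tiled]))
scores = tf.keras.layers.Dense(1)(hidden)             # (B, 100, 1)
attn = tf.keras.layers.Softmax(axis=1)(scores)

# Attention-weighted sum of context states, then the joint representation
# used to rank candidate expansion terms.
ctx_vec = tf.keras.layers.Flatten()(
    tf.keras.layers.Dot(axes=1)([attn, ctx_states]))  # (B, 100)
query = tf.keras.layers.Concatenate()([ctx_vec, seed_vec])   # (B, 200)
model = tf.keras.Model([context_ids, seed_ids], query)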
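The quoted training settings (sampled softmax loss as referenced in the Software Dependencies row, 1,000 negative samples, batch size 128, Adam, 10 epochs) can be sketched in the same hedged way. Only the hyperparameter values come from the paper; the output projection variables and the training step below continue the sketch above and are assumptions, not the authors' implementation.

# Output projection over the expansion vocabulary (assumed form).
out_weights = tf.Variable(tf.random.normal([VOCAB_SIZE, 2 * EMB_DIM], stddev=0.05))
out_biases = tf.Variable(tf.zeros([VOCAB_SIZE]))
optimizer = tf.keras.optimizers.Adam()

@tf.function
def train_step(batch_context, batch_seed, batch_target):
    """One step on a batch of (context ids, seed id, gold expansion-term id)."""
    with tf.GradientTape() as tape:
        query = model([batch_context, batch_seed], training=True)
        loss = tf.reduce_mean(tf.nn.sampled_softmax_loss(
            weights=out_weights, biases=out_biases,
            labels=tf.reshape(batch_target, [-1, 1]),   # (B, 1) int64 target ids
            inputs=query,
            num_sampled=NUM_NEG, num_classes=VOCAB_SIZE))
    variables = model.trainable_variables + [out_weights, out_biases]
    optimizer.apply_gradients(zip(tape.gradient(loss, variables), variables))
    return loss

Training would then iterate this step over batches of 128 examples for 10 epochs; at inference time the full softmax over out_weights would score candidate expansion terms.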