Unsupervised Word and Dependency Path Embeddings for Aspect Term Extraction

Authors: Yichun Yin, Furu Wei, Li Dong, Kaimeng Xu, Ming Zhang, Ming Zhou

IJCAI 2016

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Experimental results on the SemEval datasets show that (1) with only embedding features, we can achieve state-of-the-art results; (2) our embedding method, which incorporates the syntactic information among words, yields better performance than other representative ones in aspect term extraction." |
| Researcher Affiliation | Collaboration | Yichun Yin¹, Furu Wei², Li Dong³, Kaimeng Xu¹, Ming Zhang¹, Ming Zhou². Affiliations: ¹School of EECS, Peking University; ²Microsoft Research; ³Institute for Language, Cognition and Computation, University of Edinburgh. |
| Pseudocode | No | The paper describes model training and feature construction in natural language and mathematical equations, but it does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper links to implementations of the *baseline* models used for comparison, but it neither states that the code for the proposed method is publicly released nor provides a link to it. |
| Open Datasets | Yes | "We conduct our experiments on the SemEval 2014 and 2015 datasets." The unlabeled corpora are the Yelp dataset (https://www.yelp.com/academic_dataset) and the Amazon dataset (https://snap.stanford.edu/data/web-Amazon.html), which are in-domain corpora for the restaurant and laptop domains, respectively. |
| Dataset Splits | Yes | "In order to choose l and d, we use 80% sentences in training data as training set, and the rest 20% as development set." |
| Hardware Specification | No | The paper mentions "asynchronous gradient descent for parallel training" but provides no specifics on the hardware used, such as GPU/CPU models, memory, or processing units. |
| Software Dependencies | No | The paper mentions Stanford CoreNLP and "an available CRF tool" (crfsharp.codeplex.com) but does not give version numbers for these software dependencies. |
| Experiment Setup | Yes | The dimensions of the word and dependency path embeddings are set to 100; larger dimensions yield similar development-set results but require more training time. The parameter l is set to 15, which performs best on the development set. Training uses asynchronous gradient descent; following the learning-rate strategy of [Mikolov et al., 2013a], the rate is decreased linearly over the training instances from an initial value of 0.001. |
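The development-set protocol quoted in the Dataset Splits row (80% of training sentences for training, the remaining 20% for tuning l and d) can be sketched as below. The paper does not say whether the split is random or contiguous, so the contiguous split and the function name are assumptions for illustration:

```python
def dev_split(sentences, train_frac=0.8):
    """Hold out the last (1 - train_frac) of sentences as a development set.

    Assumption: the paper reports an 80/20 split of the training sentences
    but not how it was drawn; this contiguous split is only a sketch.
    """
    cut = int(len(sentences) * train_frac)
    return sentences[:cut], sentences[cut:]

train, dev = dev_split(list(range(100)))
# 80 sentences for training, 20 for development
```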
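The learning-rate schedule described in the Experiment Setup row (word2vec-style linear decay over training instances, starting from 0.001) can be sketched as follows. This is a minimal illustration, not the authors' code; the clamp at a small fraction of the initial rate mirrors the word2vec implementation and is an assumption here:

```python
def linear_lr(initial_lr, instances_seen, total_instances, floor=1e-4):
    """Decrease the learning rate linearly with the number of training
    instances processed, clamped at floor * initial_lr (the clamp value
    is an assumption borrowed from the word2vec reference implementation).
    """
    remaining = 1.0 - instances_seen / total_instances
    return initial_lr * max(remaining, floor)

# Initial rate 0.001 as in the paper; halfway through training:
lr = linear_lr(0.001, instances_seen=5_000, total_instances=10_000)
# lr is now 0.0005
```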