Hierarchical Attention Transfer Network for Cross-Domain Sentiment Classification

Authors: Zheng Li, Ying Wei, Yu Zhang, Qiang Yang

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on the Amazon review dataset demonstrate the effectiveness of HATN. Table 2 reports the classification accuracies of different methods on the Amazon reviews dataset.
Researcher Affiliation | Academia | Zheng Li, Ying Wei, Yu Zhang, Qiang Yang; Hong Kong University of Science and Technology, Hong Kong; zlict@cse.ust.hk, yweiad@gmail.com, yu.zhang.ust@gmail.com, qyang@cse.ust.hk
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement about releasing source code or a link to a code repository for the described method.
Open Datasets | Yes | We conduct the experiments on the Amazon reviews dataset (Blitzer, Dredze, and Pereira 2007), which has been widely used for cross-domain sentiment classification.
Dataset Splits | Yes | For each pair A→B, we randomly choose 2800 positive and 2800 negative reviews from the source domain A as the training data, the rest from the source domain A as the validation data, and all labeled reviews (6000) from the target domain B for testing. We perform early stopping on the validation set during the training process. (A split-construction sketch follows the table.)
Hardware Specification | No | The paper does not specify the hardware used to run the experiments (e.g., GPU/CPU models, memory).
Software Dependencies | No | The paper mentions 'NLTK' and 'word2vec vectors' but does not provide version numbers for these or any other software dependencies.
Experiment Setup | Yes | The memory sizes n_c and n_w are set to 20 and 25, respectively. We use the public 300-dimensional word2vec vectors with the skip-gram model (Mikolov et al. 2013) to initialize the embedding matrix L. ... The hidden dimensions of the word attention layer and the sentence attention layer are 300. ... The regularization weight ρ is set to 0.005. ... We use a batch size b_s = 50 for the sentiment classifier and a batch size b_d = 100 for the domain classifier. ... Gradients with ℓ2 norm larger than 40 are normalized to 40. ... T is set to 100. The learning rate is decayed as η = max(0.005 / (1 + 10p)^0.75, 0.002) and the adaptation rate is increased as λ = min(2 / (1 + exp(-10p)) - 1, 0.1) during training. (A schedule sketch follows the table.)
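The split procedure quoted in the Dataset Splits row is mechanical enough to write out. Below is a minimal Python sketch of that construction; the function name make_split, the seed, and the (text, label) pair representation are illustrative assumptions, not from the paper.

```python
import random

def make_split(source_reviews, target_reviews, n_per_class=2800, seed=0):
    """Build one A->B split as quoted from the paper: 2800 positive and
    2800 negative source-domain reviews for training, the remaining
    source-domain reviews for validation (used for early stopping), and
    all labeled target-domain reviews for testing.

    Reviews are (text, label) pairs with label 1 = positive, 0 = negative;
    this representation and the function name are illustrative only."""
    rng = random.Random(seed)
    pos = [r for r in source_reviews if r[1] == 1]
    neg = [r for r in source_reviews if r[1] == 0]
    rng.shuffle(pos)
    rng.shuffle(neg)
    train = pos[:n_per_class] + neg[:n_per_class]
    rng.shuffle(train)
    valid = pos[n_per_class:] + neg[n_per_class:]  # held-out source reviews
    test = list(target_reviews)                    # all 6000 labeled target reviews
    return train, valid, test
```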
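The two training schedules quoted in the Experiment Setup row are fully specified by their formulas. A minimal Python sketch follows, assuming p is the training progress running from 0 to 1 (the quoted text does not define p explicitly):

```python
import math

def learning_rate(p):
    """eta = max(0.005 / (1 + 10p)^0.75, 0.002), with p in [0, 1]."""
    return max(0.005 / (1.0 + 10.0 * p) ** 0.75, 0.002)

def adaptation_rate(p):
    """lambda = min(2 / (1 + exp(-10p)) - 1, 0.1)."""
    return min(2.0 / (1.0 + math.exp(-10.0 * p)) - 1.0, 0.1)

# Spot-check the endpoints: eta decays from 0.005 down to its 0.002 floor,
# while lambda rises from 0 and reaches its 0.1 cap almost immediately.
for p in (0.0, 0.02, 0.5, 1.0):
    print(f"p={p:.2f}  eta={learning_rate(p):.4f}  lambda={adaptation_rate(p):.4f}")
```

One consequence of the cap worth noting: λ reaches 0.1 after roughly the first 2% of training and stays there, so the adaptation rate is effectively constant for almost the entire run.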