Learning Structured Representation for Text Classification via Reinforcement Learning
Authors: Tianyang Zhang, Minlie Huang, Li Zhao
AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The paper reports an Experiments section ("Experimental Setting and Training Details", "Datasets and Baselines") with classification experiments on the MR, SST, Subj, and AG datasets; the full training details and dataset descriptions are quoted in the Experiment Setup and Open Datasets rows below. |
| Researcher Affiliation | Collaboration | Tianyang Zhang, Minlie Huang, Li Zhao. Affiliations: Tsinghua National Laboratory for Information Science and Technology, Dept. of Computer Science and Technology, Tsinghua University, Beijing 100084, PR China; Microsoft Research Asia. Emails: keavilzhangzty@gmail.com; aihuang@tsinghua.edu.cn; lizo@microsoft.com. Corresponding author: aihuang@tsinghua.edu.cn (Minlie Huang) |
| Pseudocode | Yes | Algorithm 1: The Training Process. 1. Pre-train the representation model (ID-LSTM or HS-LSTM) and CNet with predefined structures by minimizing Eq. 12; 2. Fix the parameters of the structured representation model and CNet, and pre-train PNet by Eq. 2; 3. Train all three components jointly until convergence. (A hedged sketch of this schedule is given after the table.) |
| Open Source Code | No | The paper does not provide any statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | Datasets: We evaluated our models on various datasets for sentiment classification, subjectivity analysis, and topic classification. MR: This dataset contains positive/negative reviews (Pang and Lee 2005). SST: Stanford Sentiment Treebank, a public sentiment analysis dataset with five classes (Socher et al. 2013). Subj: Subjectivity dataset; the task is to classify a sentence as subjective or objective (Pang and Lee 2004). AG: AG's news corpus, a large topic classification dataset constructed by Zhang, Zhao, and LeCun (2015). The topics include World, Sports, Business, and Sci/Tech. |
| Dataset Splits | No | The paper does not provide specific percentages, sample counts, or explicit mentions of validation set splits, nor does it cite predefined splits that include such details. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models or memory specifications used for running experiments. |
| Software Dependencies | No | The paper mentions 'Adam algorithm (Kingma and Ba 2015)' and 'Glove vectors (Pennington, Socher, and Manning 2014)' as tools or resources used, but it does not specify version numbers for any key software components or libraries (e.g., Python, TensorFlow, PyTorch, or other specific packages and their versions). |
| Experiment Setup | Yes | The dimension of the hidden state in the representation models is 300. The word vectors are initialized using 300-dimensional GloVe vectors (Pennington, Socher, and Manning 2014) and are updated together with the other parameters. To smooth the policy-gradient update, a suppression factor is multiplied into Eq. 2 and is set to 0.1. γ is set to 0.05K in the reward of ID-LSTM (Eq. 6) and 0.1K in the reward of HS-LSTM (Eq. 11), where K is the number of categories. During training, the Adam algorithm (Kingma and Ba 2015) is used to optimize the parameters with a learning rate of 0.0005. Dropout with probability 0.5 is applied before the classification layer in CNet. Mini-batch size is 5. (These values are collected into the configuration sketch after the table.) |
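
For readers reimplementing the setup, the hyperparameters quoted in the Experiment Setup row can be collected into a single configuration object. This is a minimal sketch in plain Python: the class and field names (`Config`, `hidden_dim`, etc.) and the `gamma_for` helper are ours, not the authors', and the paper's "0.05K" and "0.1K" are read here as 0.05·K and 0.1·K with K the number of categories.

```python
# Hyperparameters reported in the paper, gathered into one place.
# Names are illustrative; only the numeric values come from the paper.
from dataclasses import dataclass


@dataclass
class Config:
    hidden_dim: int = 300             # hidden state size of the representation models
    embedding_dim: int = 300          # 300-dimensional GloVe vectors, fine-tuned during training
    pg_suppression: float = 0.1       # suppression factor multiplied into the policy-gradient update (Eq. 2)
    gamma_id_per_class: float = 0.05  # gamma for the ID-LSTM reward (Eq. 6), assumed to mean 0.05 * K
    gamma_hs_per_class: float = 0.1   # gamma for the HS-LSTM reward (Eq. 11), assumed to mean 0.1 * K
    learning_rate: float = 5e-4       # Adam (Kingma and Ba 2015)
    dropout: float = 0.5              # dropout before the classification layer in CNet
    batch_size: int = 5


def gamma_for(model: str, num_classes: int, cfg: Config = Config()) -> float:
    """Reward penalty gamma for a given model ("ID-LSTM" or "HS-LSTM") and K classes."""
    per_class = cfg.gamma_id_per_class if model == "ID-LSTM" else cfg.gamma_hs_per_class
    return per_class * num_classes


# Example: on SST (K = 5 classes), gamma_for("ID-LSTM", 5) returns 0.25.
print(gamma_for("ID-LSTM", 5))
```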
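
Algorithm 1 (quoted in the Pseudocode row above) fixes only the order of three training stages. The sketch below encodes that schedule under the assumption that the caller supplies the stage routines; `pretrain_representation`, `pretrain_policy`, `joint_step`, and `converged` are hypothetical callables standing in for the paper's Eq. 12 pre-training, Eq. 2 policy pre-training, and the joint update, none of which are implemented here.

```python
# Three-stage training schedule of Algorithm 1, expressed as a driver that
# delegates each stage to caller-supplied routines (all names are placeholders).
from typing import Callable


def run_training_schedule(
    pretrain_representation: Callable[[], None],  # Stage 1: pre-train ID-LSTM/HS-LSTM and CNet with predefined structures (Eq. 12)
    pretrain_policy: Callable[[], None],          # Stage 2: pre-train PNet by policy gradient (Eq. 2) with the other modules frozen
    joint_step: Callable[[], None],               # Stage 3: one joint update of PNet, the representation model, and CNet
    converged: Callable[[], bool],                # stopping criterion for the joint stage
) -> None:
    pretrain_representation()  # Stage 1
    pretrain_policy()          # Stage 2 (freezing the representation model and CNet is the caller's responsibility)
    while not converged():     # Stage 3: train all three components jointly until convergence
        joint_step()
```

Keeping the schedule separate from the stage implementations makes the pre-training/joint-training order explicit without committing to any particular model code.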