A New Method of Region Embedding for Text Classification
Authors: Chao Qiao, Bo Huang, Guocheng Niu, Daren Li, Daxiang Dong, Wei He, Dianhai Yu, Hua Wu
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that our proposed method outperforms existing methods in text classification on several benchmark datasets. The results also indicate that our method can indeed capture the salient phrasal expressions in the texts. |
| Researcher Affiliation | Industry | Baidu Inc., Beijing, China; National Engineering Laboratory of Deep Learning Technology and Application, China. {qiaochao, huangbo02, niuguocheng, lidaren, daxiangdong, hewei06, yudianhai, wu_hua}@baidu.com |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | The code is publicly available on the Internet: https://github.com/text-representation/local-context-unit |
| Open Datasets | Yes | We use publicly available datasets from Zhang et al. (2015) to evaluate our models. |
| Dataset Splits | Yes | For our models, optimal hyperparameters are tuned with 10% of the training set on Yelp Review Full dataset, and identical hyperparameters are applied to all datasets. |
| Hardware Specification | Yes | Algorithms are entirely implemented with TensorFlow and trained on NVIDIA Tesla P40 GPUs. |
| Software Dependencies | No | The paper states 'Algorithms are entirely implemented with TensorFlow' but does not specify a version for TensorFlow or any other software dependency. |
| Experiment Setup | Yes | For our models, optimal hyperparameters are tuned with 10% of the training set on the Yelp Review Full dataset, and identical hyperparameters are applied to all datasets: the dimension of word embedding is 128; the region size is 7, which means the shape of the local context unit matrix of each word is 128 × 7; the initial learning rate is set to 1 × 10⁻⁴; and the batch size is 16. For optimization, the embeddings of words and the units are randomly initialized with a Gaussian distribution. Adam (Kingma & Ba, 2014) is used as the optimizer. We do not use any extra regularization methods, like L2 normalization or dropout. (A configuration sketch follows this table.) |
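
The setup row above amounts to a compact hyperparameter configuration. Below is a minimal sketch of that configuration, assuming TensorFlow 2.x; `vocab_size` and the Gaussian standard deviation are hypothetical placeholders, since the quoted text does not report them. Only the embedding dimension, region size, learning rate, batch size, optimizer, and Gaussian initialization come from the paper's description, and the sketch is illustrative rather than the authors' released implementation.

```python
# Sketch of the reported training configuration (TensorFlow 2.x assumed).
import tensorflow as tf

vocab_size = 100_000   # hypothetical placeholder; not reported in the paper
embed_dim = 128        # dimension of word embeddings (from the paper)
region_size = 7        # region size; each word's local context unit is 128 x 7
batch_size = 16        # batch size (from the paper)
learning_rate = 1e-4   # initial learning rate (from the paper)

# Word embeddings and per-word local context units, both randomly
# initialized from a Gaussian distribution as the paper describes.
# The stddev value is an assumption; the paper does not state one.
word_embeddings = tf.Variable(
    tf.random.normal([vocab_size, embed_dim], stddev=0.01))
context_units = tf.Variable(
    tf.random.normal([vocab_size, embed_dim, region_size], stddev=0.01))

# Adam optimizer with the reported initial learning rate; no L2
# regularization or dropout is added, matching the paper.
optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)
```

In the paper's terms, a region size of 7 gives each word a 128 × 7 local context unit matrix, i.e., one 128-dimensional column per position in the surrounding word window.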