Label Confusion Learning to Enhance Text Classification Models

Authors: Biyang Guo, Songqiao Han, Xiao Han, Hailiang Huang, Ting Lu (pp. 12929-12936)

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on five text classification benchmark datasets reveal the effectiveness of LCM for several widely used deep learning classification models. Further experiments also verify that LCM is especially helpful for confused or noisy datasets and superior to the label smoothing method.
Researcher Affiliation | Academia | AI Lab, School of Information Management and Engineering, Shanghai University of Finance and Economics, Shanghai, China, 200433; guobiyang2020@gmail.com, {han.songqiao, xiaohan, hlhuang}@shufe.edu.cn, luting@189.cn
Pseudocode | No | The paper describes the model architecture and mathematical formulations but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement of, or link to, open-source code for the described methodology.
Open Datasets | Yes | The 20NG dataset (bydata version; https://www.cs.umb.edu/~smimarog/textmining/datasets/) is an English news dataset that contains 18846 documents evenly categorized into 20 different categories. The AG's News dataset (http://www.di.unipi.it/~gulli) was constructed by Xiang Zhang (Zhang, Zhao, and LeCun 2015) and contains 127600 samples in 4 classes. The DBPedia dataset (http://dbpedia.org) was also created by Xiang Zhang (Zhang, Zhao, and LeCun 2015). The FDCNews dataset (http://www.nlpir.org) is provided by Fudan University and contains 9833 Chinese news articles categorized into 20 different classes. The THUCNews dataset (http://thuctc.thunlp.org) is a Chinese news classification dataset collected by Tsinghua University.
Dataset Splits | Yes | Most of the datasets have already been split into train and test sets. However, a different split can directly affect the final performance of the model. Therefore, in our experiments, we combine the separated train and test sets into one dataset and randomly split it into different train and test sets 10 times with a splitting ratio of 7:3. (See the re-split sketch below this table.)
Hardware Specification | Yes | The model is implemented using Keras and is trained on a GeForce GTX 1070 Ti GPU.
Software Dependencies | No | The paper states 'The model is implemented using Keras' but does not provide specific version numbers for Keras or any other software dependencies.
Experiment Setup | Yes | Settings: For LSTM, we set the embedding size and hidden size to 64. For CNN, we use 3 filters with sizes 3, 10 and 25, and the number of filters for each convolution block is 100. For both LSTM and CNN models, the embedding size is 64 if no pre-trained word embeddings are used; otherwise, the embedding size is 250 for Chinese tasks and 100 for English tasks. ... In our main experiments we simply set α = 4 as a moderate choice. ... We train our model's parameters with the Adam optimizer (Kingma and Ba 2014) with an initial learning rate of 0.001 and a batch size of 128. (A baseline configuration sketch follows this table.)
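
The re-split protocol quoted in the Dataset Splits row can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: `texts`, `labels`, and the helper name are placeholders, and scikit-learn's `train_test_split` stands in for whatever splitting routine the authors actually used.

```python
# Hypothetical sketch of the combine-and-resplit protocol (7:3 ratio, repeated 10 times).
from sklearn.model_selection import train_test_split

def make_resplits(texts, labels, n_runs=10, test_size=0.3, base_seed=0):
    """Merge the original train/test sets beforehand, then draw 10 random 7:3 splits."""
    splits = []
    for run in range(n_runs):
        X_train, X_test, y_train, y_test = train_test_split(
            texts, labels, test_size=test_size, random_state=base_seed + run
        )
        splits.append((X_train, X_test, y_train, y_test))
    return splits
```

Averaging results over the 10 resulting train/test pairs reduces the variance that any single split would introduce, which is the rationale the quoted passage gives for re-splitting.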
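Similarly, the hyperparameters quoted in the Experiment Setup row translate into a compact Keras configuration. The sketch below only mirrors the reported settings for the LSTM baseline (embedding and hidden size 64, Adam with learning rate 0.001, batch size 128); `vocab_size` and `num_classes` are placeholder values, and this is not the authors' implementation.

```python
# Illustrative LSTM baseline using the reported hyperparameters (placeholders noted).
from tensorflow.keras import layers, models, optimizers

vocab_size, num_classes = 20000, 20  # placeholder values, dataset-dependent

model = models.Sequential([
    layers.Embedding(vocab_size, 64),  # embedding size 64 (no pre-trained vectors)
    layers.LSTM(64),                   # hidden size 64
    layers.Dense(num_classes, activation="softmax"),
])
model.compile(
    optimizer=optimizers.Adam(learning_rate=0.001),  # reported initial learning rate
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)
# model.fit(X_train, y_train, batch_size=128, ...)  # reported batch size of 128
```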