Label-Specific Feature Augmentation for Long-Tailed Multi-Label Text Classification

Authors: Pengyu Xu, Lin Xiao, Bing Liu, Sijin Lu, Liping Jing, Jian Yu

AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Comprehensive experiments on benchmark datasets have shown that the proposed LSFA outperforms the state-of-the-art counterparts. There are several MLTC datasets. We evaluate the proposed model on three benchmark datasets of them, which are AAPD (Yang et al. 2018), RCV1 (Lewis et al. 2004) and EUR-Lex (Loza Mencía and Fürnkranz 2008). Table 1 contains the statistics of these three benchmark datasets.
Researcher Affiliation | Academia | Pengyu Xu, Lin Xiao, Bing Liu, Sijin Lu, Liping Jing*, Jian Yu. Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing, China. {pengyu, 17112079, 22120391, 22120406, lpjing, jianyu}@bjtu.edu.cn
Pseudocode | No | The paper describes its method using equations and textual descriptions, but it does not include explicit pseudocode or algorithm blocks.
Open Source Code | Yes | Our code and hyper-parameter settings are publicly available at https://github.com/stxupengyu/LSFA.
Open Datasets | Yes | We evaluate the proposed model on three benchmark datasets of them, which are AAPD (Yang et al. 2018), RCV1 (Lewis et al. 2004) and EUR-Lex (Loza Mencía and Fürnkranz 2008).
Dataset Splits | No | The paper reports the number of training documents (N_trn) and test documents (N_tst) in Table 1, but it does not explicitly describe a validation split or its size.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments (e.g., GPU/CPU models, memory).
Software Dependencies | No | The paper mentions training with the Adam optimizer but does not list any software dependencies with version numbers (e.g., specific frameworks such as PyTorch or TensorFlow and their versions).
Experiment Setup | Yes | Our model was trained by Adam (Kingma and Ba 2014) with a learning rate of 1e-3. We also used stochastic weight averaging (You et al. 2019) with a constant learning rate to enhance the performance. We empirically set α = 0.1 and γ = 1 to balance the loss. As for the key hyper-parameters of our proposed method, the head-to-tail threshold Nt and the number of augmentations Na, we set Nt = 1000, Na = 500 for AAPD. For RCV1 and EUR-Lex, we set Nt = 500, Na = 200 and Nt = 50, Na = 10, respectively.
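
To make the reported setup concrete, below is a minimal sketch of what such a training loop could look like, assuming PyTorch. `LSFAModel`, `train_loader`, `compute_loss`, and `num_epochs` are hypothetical placeholders rather than the authors' implementation (their actual code is at the repository linked above); only the Adam learning rate, the use of stochastic weight averaging with a constant learning rate, the α and γ weights, and the per-dataset Nt/Na values are taken from the quoted setup.

```python
# Minimal sketch of the reported training configuration, assuming PyTorch.
# LSFAModel, train_loader, compute_loss, and num_epochs are hypothetical
# placeholders, not the authors' code (see https://github.com/stxupengyu/LSFA).
import torch
from torch.optim.swa_utils import AveragedModel, update_bn

# Per-dataset hyper-parameters quoted above: head-to-tail threshold Nt
# and number of augmentations Na.
HPARAMS = {
    "AAPD":    {"Nt": 1000, "Na": 500},
    "RCV1":    {"Nt": 500,  "Na": 200},
    "EUR-Lex": {"Nt": 50,   "Na": 10},
}
ALPHA, GAMMA = 0.1, 1.0        # loss-balancing weights alpha and gamma
num_epochs = 30                # placeholder; not reported in this excerpt

model = LSFAModel()            # placeholder for the actual LSFA network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # constant lr, no scheduler

# Stochastic weight averaging: keep a running average of the model weights.
swa_model = AveragedModel(model)

for epoch in range(num_epochs):
    for batch in train_loader:
        optimizer.zero_grad()
        # compute_loss stands in for the paper's alpha/gamma-balanced objective
        loss = compute_loss(model, batch, alpha=ALPHA, gamma=GAMMA)
        loss.backward()
        optimizer.step()
    swa_model.update_parameters(model)   # average the weights once per epoch

update_bn(train_loader, swa_model)       # recompute BN statistics for the averaged model
```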