Partial Multi-Label Learning with Probabilistic Graphical Disambiguation
Authors: Jun-Yi Hang, Min-Ling Zhang
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments on multiple synthetic and real-world data sets show that our approach outperforms the state-of-the-art counterparts. |
| Researcher Affiliation | Academia | Jun-Yi Hang, Min-Ling Zhang School of Computer Science and Engineering, Southeast University, Nanjing 210096, China Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, China {hangjy, zhangml}@seu.edu.cn |
| Pseudocode | Yes | Algorithm 1 Pseudocode of the Optimization Procedure for PARD |
| Open Source Code | Yes | Code package of PARD is publicly available at http://palm.seu.edu.cn/zhangml/files/PARD.rar. |
| Open Datasets | Yes | For comprehensive performance evaluation, five real-world and a number of synthetic PML data sets are employed in this paper. Table 1 summarizes detailed characteristics of each data set. Specifically, the first five data sets are real-world PML data sets... while the last five data sets, i.e. corel5k, rcv1-s1, Corel16k-s1, iaprtc12 and espgame, are multi-label data sets used to generate the synthetic PML data sets. Data sources: http://palm.seu.edu.cn/zhangml/, http://mulan.sourceforge.net/datasets.html, http://lear.inrialpes.fr/people/guillaumin/data.php |
| Dataset Splits | Yes | Following [37], we take out 10% examples in each data set as a hold-out validation set... The remaining 90% examples are randomly split into training set and test set with a ratio of 9:1 for training and evaluation respectively. (A code sketch of this split protocol appears after the table.) |
| Hardware Specification | Yes | In this paper, all experiments are conducted on one V100 GPU. |
| Software Dependencies | No | All the distributions involved in Eq. (3) are instantiated as multivariate Bernoulli distributions and are parameterized by neural networks. ... For network optimization, Adam with a batch size of 128, weight decay of 10⁻⁴, and momentum parameters of 0.999 and 0.9 is employed. (The network and optimizer configuration is sketched after the table.) |
| Experiment Setup | Yes | The hidden dimensionalities are set to [256, 512, 256] and [256, 512] respectively. For fair comparison with existing PML approaches, the prediction model is implemented as a linear model. To compute the objective function in Eq. (3), a trade-off parameter α is introduced for the KL-divergence term, and Monte Carlo sampling with sampling number L = 1 is conducted to estimate the first expectation term, where the temperature parameter τ = 2/3 as suggested by [27]. In the following experiments, we set α ≥ 1 so that the objective function is still a valid lower bound of the data log-likelihood. For network optimization, Adam with a batch size of 128, weight decay of 10⁻⁴, and momentum parameters of 0.999 and 0.9 is employed. In this paper, all experiments are conducted on one V100 GPU. (The objective is sketched in code after the table.) |
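The split protocol quoted in the Dataset Splits row is mechanical enough to sketch in code. Below is a minimal Python example; `X`, `Y`, and `split_pml_dataset` are hypothetical names, and the released code package linked above may organize this differently:

```python
from sklearn.model_selection import train_test_split

def split_pml_dataset(X, Y, seed=0):
    """Hold out 10% for validation, then split the rest 9:1 into train/test."""
    # Take out 10% of examples as the hold-out validation set (following [37]).
    X_rest, X_val, Y_rest, Y_val = train_test_split(
        X, Y, test_size=0.1, random_state=seed)
    # Randomly split the remaining 90% into training and test sets (ratio 9:1).
    X_train, X_test, Y_train, Y_test = train_test_split(
        X_rest, Y_rest, test_size=0.1, random_state=seed)
    return (X_train, Y_train), (X_val, Y_val), (X_test, Y_test)
```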
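The quoted experiment setup also pins down the network shapes and optimizer well enough for a hedged sketch. The example below assumes PyTorch, ReLU activations, and placeholder input/output sizes; only the hidden dimensionalities, the linear prediction model, and the Adam hyperparameters come from the quoted text:

```python
import torch
import torch.nn as nn

num_features, num_labels = 512, 100  # placeholder sizes, not from the paper

def mlp(dims):
    """Stack Linear + ReLU layers over the listed dimensionalities (a sketch)."""
    layers = []
    for d_in, d_out in zip(dims[:-1], dims[1:]):
        layers += [nn.Linear(d_in, d_out), nn.ReLU()]
    return nn.Sequential(*layers)

# Hidden dimensionalities [256, 512, 256] and [256, 512] from the quoted setup;
# prepending the input size is an assumption about where those lists start.
net_a = mlp([num_features, 256, 512, 256])
net_b = mlp([num_features, 256, 512])

# "The prediction model is implemented as a linear model."
prediction_model = nn.Linear(num_features, num_labels)

# Adam with weight decay 10^-4 and momentum parameters 0.999 and 0.9;
# training uses mini-batches of 128 examples.
params = (list(net_a.parameters()) + list(net_b.parameters())
          + list(prediction_model.parameters()))
optimizer = torch.optim.Adam(params, betas=(0.9, 0.999), weight_decay=1e-4)
```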
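Finally, the objective described in the Experiment Setup row (an α-weighted lower bound estimated with L = 1 Monte Carlo sample at temperature τ = 2/3) can be sketched as follows. This is a generic relaxed-Bernoulli ELBO under stated assumptions, not the paper's Eq. (3); `logits_q`, `logits_p`, and `log_likelihood_fn` are hypothetical stand-ins for the distributions that equation actually defines:

```python
import torch
from torch.distributions import Bernoulli, RelaxedBernoulli, kl_divergence

TAU = 2.0 / 3.0  # relaxation temperature, as suggested by [27]
ALPHA = 1.0      # KL trade-off; alpha >= 1 keeps the objective a valid lower bound
L = 1            # Monte Carlo sample count for the expectation term

def negative_elbo(logits_q, logits_p, log_likelihood_fn):
    """Alpha-weighted lower bound estimated with L relaxed-Bernoulli samples."""
    q_relaxed = RelaxedBernoulli(torch.tensor(TAU), logits=logits_q)
    # Estimate the first expectation term by Monte Carlo with L samples.
    expected_ll = sum(log_likelihood_fn(q_relaxed.rsample())
                      for _ in range(L)) / L
    # KL-divergence between the multivariate Bernoulli factors, weighted by alpha.
    kl = kl_divergence(Bernoulli(logits=logits_q),
                       Bernoulli(logits=logits_p)).sum(dim=-1).mean()
    return -(expected_ll - ALPHA * kl)  # minimize the negative of the bound
```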