Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Conditional Bernoulli Mixtures for Multi-label Classification
Authors: Cheng Li, Bingyu Wang, Virgil Pavlu, Javed Aslam
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show the effectiveness of the proposed method against competitive alternatives on benchmark datasets. We perform experiments on five commonly used and relatively large multi-label datasets: SCENE, TMC2007, MEDIAMILL, NUS-WIDE from Mulan and RCV1 (topics subset 1) from LIBSVM. |
| Researcher Affiliation | Academia | Cheng Li, Bingyu Wang, Virgil Pavlu, Javed Aslam; College of Computer and Information Science, Northeastern University, Boston, MA 02115, USA |
| Pseudocode | Yes | Algorithm 1: Generic Training for CBM; Algorithm 2: Prediction by Dynamic Prog. and Pruning |
| Open Source Code | Yes | Our implementations of CBM and several baselines (PowSet, PCC, CRF, etc.) are available at https://github.com/cheng-li/pyramid. |
| Open Datasets | Yes | We perform experiments on five commonly used and relatively large multi-label datasets: SCENE, TMC2007, MEDIAMILL, NUS-WIDE from Mulan [1] and RCV1 (topics subset 1) from LIBSVM [2]. [1] http://mulan.sourceforge.net [2] https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multilabel.html |
| Dataset Splits | Yes | For the sake of reproducibility, we adopt the train/test splits provided by Mulan and LIBSVM. Hyper parameter tuning is done by cross-validation on the training set (see the supplementary material for details). |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, memory, or cloud instance types used for the experiments. |
| Software Dependencies | No | The paper mentions methods and tools such as logistic regression, gradient boosted trees, L-BFGS, MEKA, LIBSVM, and Python, but does not give version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | To avoid over-fitting, we also add L2 regularizations (Gaussian priors) to all parameters. Hyper parameter tuning is done by cross-validation on the training set (see the supplementary material for details). |