Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Weakly-Supervised Hierarchical Text Classification
Authors: Yu Meng, Jiaming Shen, Chao Zhang, Jiawei Han | Pages: 6826-6833
AAAI 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on three datasets from different domains demonstrate the efficacy of our method compared with a comprehensive set of baselines. |
| Researcher Affiliation | Academia | Yu Meng, Jiaming Shen, Chao Zhang, Jiawei Han University of Illinois at Urbana-Champaign, Urbana, IL, USA EMAIL |
| Pseudocode | Yes | Algorithm 1: Overall Network Training. |
| Open Source Code | No | The paper does not provide a direct link to its own open-source code or explicitly state that its code is available. |
| Open Datasets | Yes | Yelp Review: We use the Yelp Review Full dataset (Zhang, Zhao, and LeCun 2015) and take its testing portion as our dataset. |
| Dataset Splits | No | The paper mentions pre-training and self-training processes but does not specify a distinct validation set or its split for hyperparameter tuning or model selection. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using a 'Skip-Gram model' and 'CNN model' but does not provide specific version numbers for any software dependencies, libraries, or frameworks used (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | For all datasets, we use Skip-Gram model (Mikolov et al. 2013) to train 100-dimensional word embeddings... we use CNN model with one convolutional layer as local classifiers. Specifically, the filter window sizes are 2, 3, 4, 5 with 20 feature maps each. Both the pre-training and the self-training steps are performed using SGD with batch size 256. |
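
The Experiment Setup row gives enough detail to work out the size of the local classifier's convolutional layer. The sketch below is illustrative only and is not the authors' code: it records the reported hyperparameters in a hypothetical `LocalCNNConfig` and derives two quantities from them. It assumes max-over-time pooling (one value per feature map) and one bias term per filter, neither of which is stated in the quoted text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalCNNConfig:
    """Hyperparameters quoted in the Experiment Setup row (class name is hypothetical)."""
    embedding_dim: int = 100            # Skip-Gram word embedding size
    window_sizes: tuple = (2, 3, 4, 5)  # filter window sizes
    feature_maps: int = 20              # feature maps per window size
    batch_size: int = 256               # SGD batch size for pre-training and self-training


def pooled_feature_dim(cfg: LocalCNNConfig) -> int:
    # Assumption: max-over-time pooling keeps one value per feature map,
    # so the concatenated vector has one entry per (window size, feature map) pair.
    return len(cfg.window_sizes) * cfg.feature_maps


def conv_parameter_count(cfg: LocalCNNConfig) -> int:
    # Each filter covers `w` consecutive tokens of `embedding_dim` values,
    # plus one bias term per filter (assumed, not stated in the source).
    return sum(
        cfg.feature_maps * (w * cfg.embedding_dim + 1)
        for w in cfg.window_sizes
    )


cfg = LocalCNNConfig()
print(pooled_feature_dim(cfg))    # 4 window sizes * 20 maps = 80
print(conv_parameter_count(cfg))  # 20 * (201 + 301 + 401 + 501) = 28080
```

Under these assumptions the convolutional layer feeds an 80-dimensional feature vector to whatever output layer follows. The quoted text does not describe that output layer, the learning rate, or the number of epochs, so they are left out here.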