Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Hierarchical Active Learning with Group Proportion Feedback
Authors: Zhipeng Luo, Milos Hauskrecht
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experiments on numerous data sets show that our method is competitive and outperforms existing approaches for reducing the human annotation cost. We conduct an empirical study to evaluate our proposed approach on 9 general binary classification data sets collected from UCI machine learning repository [Asuncion and Newman, 2007]. |
| Researcher Affiliation | Academia | Zhipeng Luo and Milos Hauskrecht, Department of Computer Science, University of Pittsburgh, PA, USA |
| Pseudocode | Yes | Algorithm 1: Our HALG Framework |
| Open Source Code | No | The paper does not provide any statement or link indicating the availability of open-source code for the described methodology. |
| Open Datasets | Yes | We conduct an empirical study to evaluate our proposed approach on 9 general binary classification data sets collected from UCI machine learning repository [Asuncion and Newman, 2007]. |
| Dataset Splits | Yes | To run the experiments, we split each data set into three disjoint subsets: the initial labeled data set (about 1%-2% of all available data), a test data set (about 25% of data) and a training data set U (the rest of the data) that is used for training and active learning. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, processor types, or memory used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library or solver names with specific versions). |
| Experiment Setup | No | The paper mentions 'Logistic regression' as the model and that 'each label is sampled from 5 to 10 times depending on data sets', but it does not provide specific experimental setup details such as concrete hyperparameter values, optimizer settings, or detailed training configurations for the models. |
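
The three-way split quoted in the Dataset Splits row (a small initial labeled set, about 25% test data, and the remainder as the pool U) can be illustrated with a minimal sketch. This is not the authors' code, which the paper does not release. The function name `three_way_split`, the default fractions, the random shuffling, and the fixed seed are all assumptions chosen for illustration; the paper gives only the approximate proportions.

```python
import random


def three_way_split(n_samples, init_frac=0.02, test_frac=0.25, seed=0):
    """Partition indices 0..n_samples-1 into three disjoint lists.

    Hypothetical helper: init_frac and test_frac mirror the approximate
    proportions quoted in the table (1%-2% initial labeled, about 25% test).
    Everything left over forms the pool U used for training and active learning.
    """
    rng = random.Random(seed)  # assumed: a fixed seed so the split is repeatable
    indices = list(range(n_samples))
    rng.shuffle(indices)

    n_init = max(1, round(init_frac * n_samples))  # keep at least one labeled example
    n_test = round(test_frac * n_samples)

    initial = indices[:n_init]
    test = indices[n_init:n_init + n_test]
    pool = indices[n_init + n_test:]
    return initial, test, pool


initial, test, pool = three_way_split(1000)
print(len(initial), len(test), len(pool))
```

Because the three lists are slices of one shuffled index list, they are disjoint by construction and together cover every sample exactly once.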