Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Diversifying Convex Transductive Experimental Design for Active Learning
Authors: Lei Shi, Yi-Dong Shen
IJCAI 2016 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results on several benchmark data sets demonstrate that Diversified CTED significantly improves CTED and consistently outperforms the state-of-the-art methods, verifying the effectiveness and advantages of incorporating the proposed diversity regularizer into CTED. 6 Experiments: Following a same experimental protocol in [Yu et al., 2008], we perform classification experiments on five benchmark data sets to demonstrate the effectiveness of the proposed method (i.e., Diversified CTED) and give analysis on the experimental results. |
| Researcher Affiliation | Academia | Lei Shi¹,² and Yi-Dong Shen¹. ¹ State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences; ² University of Chinese Academy of Sciences, Beijing 100190, China |
| Pseudocode | Yes | Algorithm 1 The Optimization Algorithm for Diversified CTED |
| Open Source Code | No | The paper does not provide an explicit statement about releasing its source code or a link to a code repository. |
| Open Datasets | Yes | We conduct the experiments on 5 publicly available data sets, including 2 digit recognition data sets (i.e., USPS [Wu and Schölkopf, 2006] and MNIST [Liu et al., 2010]), 2 text data sets (i.e., WebKB [Wang et al., 2011] and Newsgroup [Yu et al., 2005]) and one face data set (i.e., ORL [Cai et al., 2006]). |
| Dataset Splits | No | The paper describes a training and testing split (50% for candidate set, 50% for prediction) but does not explicitly mention a separate validation dataset split. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., CPU, GPU models, or memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using SVM and KNN classifiers but does not provide specific version numbers for any software libraries or dependencies. |
| Experiment Setup | Yes | To fairly compare the above algorithms, we tune the parameters for all these methods from a large range of {10⁻³, 10⁻², ..., 10³}. To illustrate the effects of , we fix the number of selected samples as 50. For CTED, we report the best performance it achieves. For our method, i.e., DCTED, we vary the value of in [10⁻³, 10⁻², ..., 10³] and report the corresponding results. |
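
The Experiment Setup and Dataset Splits rows describe a log-spaced parameter range from 10^-3 to 10^3 and a 50%/50% division between a candidate set and a prediction set. The following is a minimal Python sketch of those two pieces only, not the authors' code. The function names `parameter_grid` and `split_half`, the fixed random seed, and the use of a random shuffle to form the halves are assumptions made for illustration; the paper excerpt does not state how the split was drawn.

```python
import random


def parameter_grid(low_exp=-3, high_exp=3):
    """Return the log-spaced grid 10^low_exp, ..., 10^high_exp (one value per integer exponent)."""
    return [10.0 ** e for e in range(low_exp, high_exp + 1)]


def split_half(n_samples, seed=0):
    """Split indices 0..n_samples-1 into two halves.

    The first half plays the role of the candidate set for sample selection,
    the second half the set used for prediction. Random shuffling and the
    seed are assumptions, not details taken from the paper.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    half = n_samples // 2
    return indices[:half], indices[half:]


grid = parameter_grid()
candidates, held_out = split_half(100)
print("grid size:", len(grid))
print("candidate / prediction sizes:", len(candidates), len(held_out))
```

A tuning loop would iterate over `grid` for each method and evaluate on the prediction half; that loop is omitted here because the selection and classification steps are not specified in enough detail in this summary.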