Towards Automated Semi-Supervised Learning

Authors: Yu-Feng Li, Hai Wang, Tong Wei, Wei-Wei Tu

Pages: 4237-4244

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive empirical results over 200 cases demonstrate that our proposal on one side achieves highly competitive or better performance compared to the state-of-the-art AutoML system AUTO-SKLEARN and classical SSL techniques
Researcher Affiliation | Collaboration | 1 National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China; 2 4Paradigm Inc., Beijing, China
Pseudocode | No | The paper describes methods using prose and mathematical formulas but does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statement or link indicating that the source code for the proposed AUTO-SSL system is publicly available.
Open Datasets | No | The paper mentions '40 datasets' and lists examples like 'blood, echocardiogram, cylinder-bands, house-votes, credit-approval, spambase', stating 'Detail information of datasets please refer to the supplementary file.' However, it does not provide a direct link, DOI, repository, or formal citation for these datasets within the paper itself to indicate public availability.
Dataset Splits | Yes | For each data set, a series of limited labeled instances (20, 40, 60, 80, 100) are considered, where labeled data are randomly chosen, and the remaining data are used as unlabeled data. Each dataset is split for 20 times and average performance in terms of accuracy and area under the ROC curve (AUC) is reported. ... we employ K-fold cross-validation (Kohavi 1995) to be an estimate of the model performance. Formally, let $D = \{\{x_i, y_i\}_{i=1}^{l}, \{x_j\}_{j=l+1}^{l+u}\}$ be a training set which is split into K cross-validation folds $\{D_{train}^{1}, \ldots, D_{train}^{K}\}$ and $\{D_{valid}^{1}, \ldots, D_{valid}^{K}\}$ such that $D_{train}^{i} = D_{train} \setminus D_{valid}^{i}$ for $i = 1, \ldots, K$.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or cloud instance types) used for running its experiments.
Software Dependencies | No | The paper mentions software like WEKA and AUTO-SKLEARN but does not provide specific version numbers for these or any other software dependencies used in their experimental setup.
Experiment Setup | Yes | SVM: Its hyperparameter, the penalty factor $C_{svm}$ is selected from 7 configurations $\{2^{-3}, 2^{-2}, 2^{-1}, 2^{0}, 2^{1}, 2^{2}, 2^{3}\}$. ... CMN: ... the value of k is the hyperparameter selected from 3 configurations {5, 7, 9}. ... TSVM: The penalty factor $C_{TSVM}$ is its hyperparameter selected from the same configurations of SVM. ... The running time is set to one minute which is sufficient to ensure AUTO-SKLEARN system to finish successfully.
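
The split protocol and hyperparameter grids quoted in the Dataset Splits and Experiment Setup rows can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code (none is released): the function names, the fold count of 5, the sample count of 500, the fixed seed, and the placeholder scoring function are all hypothetical, and the excerpt does not say which indices the folds are drawn from, so the sketch folds whatever index list the caller passes in.

```python
import math
import random
from statistics import mean

# Labeled-set sizes and number of repeated random splits quoted in the table.
LABELED_SIZES = [20, 40, 60, 80, 100]
N_REPEATS = 20

# Candidate values quoted in the Experiment Setup row.
C_GRID = [2.0 ** e for e in range(-3, 4)]  # SVM / TSVM penalty factor: 2^-3 ... 2^3
K_GRID = [5, 7, 9]                         # CMN hyperparameter k


def labeled_unlabeled_split(n_samples, n_labeled, rng):
    """Randomly pick n_labeled indices as labeled; the rest are unlabeled."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return sorted(indices[:n_labeled]), sorted(indices[n_labeled:])


def k_fold_indices(indices, n_folds, rng):
    """Yield (train, valid) pairs; train is every index not in the valid fold."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    for i, valid in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, valid


def select_config(configs, indices, score_fn, n_folds, rng):
    """Return the config with the highest mean validation score over the folds."""
    splits = list(k_fold_indices(indices, n_folds, rng))

    def cv_score(config):
        return mean(score_fn(config, train, valid) for train, valid in splits)

    return max(configs, key=cv_score)


# Usage: one split with 20 labeled points out of a hypothetical 500 samples.
rng = random.Random(0)  # fixed seed is an assumption, for repeatability only
labeled, unlabeled = labeled_unlabeled_split(n_samples=500, n_labeled=20, rng=rng)


def toy_score(c, train, valid):
    # Hypothetical placeholder that peaks at C = 1; a real run would train a
    # model on `train` and return its accuracy or AUC on `valid`.
    return -abs(math.log2(c))


best_c = select_config(C_GRID, labeled, toy_score, n_folds=5, rng=rng)
```

To mirror the reported protocol, the split and selection steps would be repeated N_REPEATS times for each entry of LABELED_SIZES and the resulting scores averaged.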