Efficient Label Contamination Attacks Against Black-Box Learning Models

Authors: Mengchen Zhao, Bo An, Wei Gao, Teng Zhang

IJCAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical studies show that PGA significantly outperforms existing baselines and that linear learning models are better substitute models than nonlinear ones.
Researcher Affiliation | Academia | 1 School of Computer Science and Engineering, Nanyang Technological University, Singapore; 2 National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
Pseudocode | Yes | Algorithm 1: Projected Gradient Ascent (PGA) and Algorithm 2: Flip strategy
Open Source Code | No | The paper describes implementations built on LIBSVM [Chang and Lin, 2011] and LIBLINEAR [Fan et al., 2008], which are third-party tools; it does not provide access to the source code for its own methodology.
Open Datasets | Yes | We will use five public data sets: Australian (690 points, 14 features), W8a (10000 points, 300 features), Spambase (4601 points, 57 features) [Lichman, 2013], Wine (130 points, 14 features) and Skin (5000 points, 3 features). (Except Spambase, all data sets can be downloaded from https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/.)
Dataset Splits | No | The paper mentions "training data" and a "test set" but gives no details about validation splits or a model-selection methodology that explicitly uses a validation set.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) were provided for running the experiments.
Software Dependencies | Yes | All training processes are implemented with LIBSVM [Chang and Lin, 2011] and LIBLINEAR [Fan et al., 2008]. The DT, KNN and NB models are trained using the MATLAB R2016b Statistics and Machine Learning Toolbox, with all parameters set to their defaults.
Experiment Setup | Yes | We set the regularization parameter C=1 for all five models. We set the parameters d=2 for the polynomial kernel and γ=0.1 for the RBF kernel. All attacks computed by PGA are the best among 50 runs. We set the attacker's budget as 30% of the training points.
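The experiment-setup and pseudocode rows can be illustrated with a toy sketch of a budget-constrained label-flip attack against a linear substitute model. This is a hypothetical simplification, not the paper's PGA: scikit-learn's `LinearSVC` stands in for LIBLINEAR, random restarts stand in for the gradient-based flip strategy, and only the stated C=1 and 30% budget are taken from the summary above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

def flip_attack(X_train, y_train, X_test, y_test, budget=0.3, runs=5, seed=0):
    """Greedy random-restart label-flip attack (illustrative only).

    Flips a budgeted fraction of binary training labels and keeps the
    flip set that maximizes the substitute model's test error.
    """
    rng = np.random.default_rng(seed)
    n_flips = int(budget * len(y_train))  # 30% budget as in the setup row
    best_y, best_err = y_train, -1.0
    for _ in range(runs):  # keep the best over several restarts
        idx = rng.choice(len(y_train), size=n_flips, replace=False)
        y_adv = y_train.copy()
        y_adv[idx] = 1 - y_adv[idx]  # flip 0/1 labels
        clf = LinearSVC(C=1.0).fit(X_train, y_adv)  # C=1 as in the setup
        err = 1.0 - clf.score(X_test, y_test)
        if err > best_err:
            best_err, best_y = err, y_adv
    return best_y, best_err

# Synthetic stand-in for a data set like Australian (14 features).
X, y = make_classification(n_samples=400, n_features=14, random_state=0)
y_poisoned, err = flip_attack(X[:300], y[:300], X[300:], y[300:])
```

The restart loop mirrors the summary's "best among 50 runs" selection, scaled down to 5 restarts; the paper's actual PGA instead ascends the gradient of the victim's loss and projects back onto the flip budget.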