Querying Partially Labelled Data to Improve a K-nn Classifier

Authors: Vu-Linh Nguyen, Sébastien Destercke, Marie-Hélène Masson

AAAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, some experiments in Section show the effectiveness of our proposals." and "This section presents the experimental setup and the results obtained with benchmark data sets which are used to illustrate the behaviour of the proposed schemes."
Researcher Affiliation | Academia | Vu-Linh Nguyen, Sébastien Destercke, Marie-Hélène Masson; UMR CNRS 7253 Heudiasyc, Sorbonne Université, Université de Technologie de Compiègne, CS 60319, 60203 Compiègne cedex, France; {linh.nguyen, sebastien.destercke, mylene.masson}@hds.utc.fr
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide access to source code for the described methodology.
Open Datasets | Yes | "Results have been obtained for 15 UCI data sets described in Table 6. Three different values for K (3, 6 and 9) have been used for all experiments."
Dataset Splits | Yes | "We use a three-fold cross-validation procedure: each data set is randomly split into 3 folds. Each fold is in turn considered as the test set, the other folds are used for the training set." (A sketch of this protocol follows the table.)
Hardware Specification | No | The paper does not describe the hardware used to run its experiments.
Software Dependencies | No | The paper does not list ancillary software dependencies or version numbers.
Experiment Setup | Yes | "Three different values for K (3, 6 and 9) have been used for all experiments. The weight $w^t_k$ for an instance $t$ is $w^t_k = 1 - d^t_k / \sum_{j=1}^{K} d^t_j$, with $d^t_j$ the Euclidean distance between $x^t_j$ and $t$. As usual when working with Euclidean distance based K-nn, data is normalized. ... The training set is contaminated according to one of the models with two combinations of (p, q) parameters: (p = 0.7, q = 0.5) and (p = 0.9, q = 0.9) ... For each data set, the number of queries I has been fixed to 10% of the number of training data." (A sketch of the weighting scheme also follows the table.)
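
A minimal sketch of the quoted evaluation protocol, using scikit-learn's KFold for the three-fold split. The random data set and labels below are stand-ins for illustration, not the UCI data used in the paper, and the classifier itself is left as a placeholder.

```python
import numpy as np
from sklearn.model_selection import KFold

# Stand-in data: any of the 15 UCI data sets would take this place.
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 4))
y = rng.integers(0, 3, size=150)

# Each data set is randomly split into 3 folds; each fold is in turn
# the test set, while the remaining folds form the training set.
kf = KFold(n_splits=3, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(kf.split(X)):
    X_train, X_test = X[train_idx], X[test_idx]
    y_train, y_test = y[train_idx], y[test_idx]
    # ... train the K-nn classifier on (X_train, y_train) and evaluate here
    print(f"fold {fold}: {len(train_idx)} train / {len(test_idx)} test")
```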
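
And a minimal sketch of the distance-based weighting scheme as quoted, $w^t_k = 1 - d^t_k / \sum_{j=1}^{K} d^t_j$. The helper name `knn_weights` and the toy points are illustrative assumptions, not taken from the paper; the normalization step the paper mentions is omitted here.

```python
import numpy as np

def knn_weights(X_train, t, K=3):
    """Indices of the K nearest neighbours of t and their weights,
    w_k = 1 - d_k / sum_{j=1}^K d_j, using Euclidean distances."""
    dists = np.linalg.norm(X_train - t, axis=1)  # distance from t to each point
    nn = np.argsort(dists)[:K]                   # K nearest neighbours
    d = dists[nn]
    return nn, 1.0 - d / d.sum()                 # closer neighbours weigh more

# Toy usage on 2-D points (hypothetical data, for illustration only).
X_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
t = np.array([0.1, 0.1])
print(knn_weights(X_train, t, K=3))
```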