The Kendall and Mallows Kernels for Permutations

Authors: Yunlong Jiao, Jean-Philippe Vert

ICML 2015

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Finally, we demonstrate promising results of the underlying kernels on a large benchmark of high-dimensional biomedical data classification problems. We investigate the performance of classifying high-dimensional biomedical data. Table 2 and Figure 2 (Left) summarize the performance of each model across the datasets. |
| Researcher Affiliation | Academia | MINES ParisTech, CBIO, PSL Research University, Institut Curie, INSERM U900, Paris, France |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found. |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We investigate the performance of classifying high-dimensional biomedical data... For that purpose, we collected 10 datasets related to human cancer research publicly available online (Li et al., 2003; Schroeder et al., 2011; Shi et al., 2011), as summarized in Table 1. |
| Dataset Splits | Yes | Except for three datasets that are split into training and test sets, in which case we report the performance on the test set, we perform a 5-fold cross-validation repeated 10 times and report the mean performance over the 5 × 10 = 50 splits to evaluate the performance of the different methods. In addition, on each training set, an internal 5-fold cross-validation is performed to tune parameters. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running experiments are provided. |
| Software Dependencies | No | The paper mentions using SVM and KFD as classifiers, but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | For KFD-based models, we add 10^-3 on the diagonal of the centered and scaled kernel matrix, as suggested by (Mika et al., 1999). The C parameter of SVM-based models is optimized over a grid ranging from 10^-2 to 10^3 in log scale, as is the number k of TSP in the case of feature selection (ranging from 1 to 5000 in log scale). |
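
The evaluation protocol described in the Dataset Splits and Experiment Setup entries (5-fold cross-validation repeated 10 times, with an internal 5-fold cross-validation to tune C) can be sketched roughly as follows. This is an illustrative sketch, not the authors' code, which is not available. The helper names `log_grid`, `repeated_kfold`, and `select_C` are hypothetical, as is the `score_fn(train_idx, val_idx, C)` callback standing in for fitting and scoring a classifier. The grid spacing (one point per decade) and the unstratified shuffling are assumptions, since the paper excerpt only gives the grid range and fold counts.

```python
import random


def log_grid(lo_exp, hi_exp):
    """Powers of ten from 10**lo_exp to 10**hi_exp.

    One point per decade is an assumption; the source only gives the range.
    """
    return [10.0 ** e for e in range(lo_exp, hi_exp + 1)]


def repeated_kfold(n_samples, n_folds=5, n_repeats=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold CV repeated with reshuffling."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        # Deal the shuffled indices into n_folds roughly equal folds.
        folds = [idx[i::n_folds] for i in range(n_folds)]
        for i in range(n_folds):
            test = sorted(folds[i])
            train = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
            yield train, test


def select_C(train_idx, score_fn, grid, n_folds=5, seed=0):
    """Return the C with the best mean inner-CV score on the training indices.

    score_fn(train_idx, val_idx, C) is a hypothetical callback that fits a
    model on train_idx and returns its score on val_idx (higher is better).
    """
    best_C, best_score = None, float("-inf")
    for C in grid:
        scores = []
        for inner_train, inner_val in repeated_kfold(len(train_idx), n_folds, 1, seed):
            # Map positions within the training set back to sample indices.
            tr = [train_idx[i] for i in inner_train]
            va = [train_idx[i] for i in inner_val]
            scores.append(score_fn(tr, va, C))
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_C, best_score = C, mean_score
    return best_C


C_GRID = log_grid(-2, 3)                       # 10^-2 ... 10^3
splits = list(repeated_kfold(n_samples=100))   # 5 folds x 10 repeats
```

For each of the outer splits, `select_C` would be called on the training indices only, and the chosen C used to fit a final model that is scored on the held-out fold; the reported figure is then the mean over all outer splits.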