Label Ranking through Nonparametric Regression
Authors: Dimitris Fotakis, Alkis Kalavasis, Eleni Psaroudaki
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we complement our theoretical contributions with experiments, aiming to understand how the input regression noise affects the observed output. |
| Researcher Affiliation | Academia | 1School of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece. Correspondence to: Eleni Psaroudaki <epsaroudaki@mail.ntua.gr>. |
| Pseudocode | Yes | Algorithm 1 Algorithm of Theorem 2.1; Algorithm 2 Algorithm of Theorem 2.5; Algorithm 3 Level-Splits Algorithm for Score Learning; Algorithm 4 Breiman's Algorithm for Score Learning; Algorithm 5 Algorithms for Label Ranking with Complete Rankings; Algorithm 6 Algorithm for Estimation and Aggregation for a VC class; Algorithm 7 Algorithm for Label Ranking with Incomplete Rankings; Algorithm 8 Algorithm for Label Ranking with Partial Rankings; Algorithm 9 Level-Splits Algorithm; Algorithm 10 Breiman's Algorithm. |
| Open Source Code | Yes | The code and data sets to reproduce our results are available: https://github.com/pseleni/LR-nonparametric-regression. |
| Open Datasets | Yes | For the experimental evaluation, two synthetic data set families were used, namely LFN (Large Features Number) and SFN (Small Features Number)... The code for creating the synthetic benchmarks, the synthetic benchmarks themselves, and the standard LR benchmarks used in the experimental evaluation can be found at: https://github.com/pseleni/LR-nonparametric-regression. We also evaluate our algorithms on standard Label Ranking benchmarks, specifically on sixteen semi-synthetic data sets and five real-world LR data sets. The semi-synthetic ones have been considered standard benchmarks for the evaluation of LR algorithms ever since they were proposed in Cheng & Hüllermeier (2008). They were created by transforming multi-class (Type A) and regression (Type B) data sets from the UCI repository and the Statlog collection into Label Ranking data. |
| Dataset Splits | Yes | For each data set, we run five repetitions of a ten-fold cross-validation process. Each data set is divided randomly into ten folds five times. For every division, we repeat the following process: every fold is used exactly once as the validation set, while the other nine are used as the training set (i.e., ten iterations for every repetition of the ten-fold cross-validation process) (see James et al., 2013, p. 181). A sketch of this protocol appears after the table. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models or types of machines used for the experiments. It only mentions using Python and scikit-learn implementations. |
| Software Dependencies | No | The paper mentions using 'Python' and 'scikit-learn implementations' but does not specify version numbers for these software dependencies, which are required for reproducibility. |
| Experiment Setup | No | The paper describes general aspects of the model building ('built greedily based on the CART empirical MSE criterion, using Breiman's method') and mentions that 'The parameters of the regressor were tuned in a five folds inner c.v. for each training set. The parameter grids are reported in the anonymized repository.' However, it does not explicitly provide specific hyperparameter values or detailed training configurations in the main text. An illustrative tuning sketch follows the table. |
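The five-times-repeated ten-fold protocol quoted under Dataset Splits maps directly onto scikit-learn's `RepeatedKFold`. The sketch below is a minimal illustration, assuming a generic feature matrix `X` and a single regression target `y` as placeholders; the paper's actual targets encode label rankings, and the regressor here is only a stand-in, not the authors' exact model.

```python
import numpy as np
from sklearn.model_selection import RepeatedKFold
from sklearn.tree import DecisionTreeRegressor

# Placeholder data: 200 instances with 10 features and one real-valued
# target. In the paper the targets encode label rankings; a scalar target
# is used here purely to illustrate the splitting protocol.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = rng.normal(size=200)

# Five repetitions of ten-fold cross-validation: each repetition
# re-shuffles the data into ten folds, and every fold serves exactly
# once as the validation set (50 train/validation splits in total).
cv = RepeatedKFold(n_splits=10, n_repeats=5, random_state=0)

scores = []
for train_idx, val_idx in cv.split(X):
    model = DecisionTreeRegressor(random_state=0)
    model.fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[val_idx], y[val_idx]))  # R^2 on held-out fold

print(f"Mean validation score over {len(scores)} splits: {np.mean(scores):.3f}")
```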
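The hyperparameter tuning quoted under Experiment Setup ('a five folds inner c.v. for each training set') corresponds to nested cross-validation with an inner five-fold search. Below is a hedged sketch using `GridSearchCV` over a decision-tree regressor grown on the squared-error (CART MSE) criterion; the parameter grid is illustrative only, since the paper's actual grids live in the linked repository and are not reproduced here.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.tree import DecisionTreeRegressor

# Same placeholder data as in the previous sketch.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = rng.normal(size=200)

# Illustrative grid only: the paper's actual parameter grids are reported
# in the authors' repository and are not reproduced here.
param_grid = {"max_depth": [4, 8, 16, None], "min_samples_leaf": [1, 5, 10]}

# scikit-learn regression trees are grown greedily on the squared-error
# (CART MSE) criterion, matching the construction the paper describes.
inner_cv = KFold(n_splits=5, shuffle=True, random_state=0)
tuner = GridSearchCV(
    DecisionTreeRegressor(criterion="squared_error", random_state=0),
    param_grid,
    cv=inner_cv,
)
tuner.fit(X, y)  # in the full protocol, this runs inside each outer training fold
print("Selected parameters:", tuner.best_params_)
```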