Ranking Data with Continuous Labels through Oriented Recursive Partitions

Authors: Stéphan Clémençon, Mastane Achab

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The paper reports preliminary numerical experiments that "highlight the difference in nature between regression and continuous ranking and provide strong empirical evidence of the performance of empirical optimizers of the criteria proposed"; the results are presented in Table 1 of the paper.
Researcher Affiliation | Academia | Stéphan Clémençon and Mastane Achab; LTCI, Télécom ParisTech, Université Paris-Saclay, 75013 Paris, France (first.last@telecom-paristech.fr).
Pseudocode | Yes | The paper gives pseudocode for the CRANK algorithm, reconstructed here (a runnable sketch follows the table):

THE CRANK ALGORITHM
1. Input. Training data D_n, depth J ≥ 1, binary classification algorithm A.
2. Initialization. Set C_{0,0} = X.
3. Iterations. For j = 0, ..., J − 1 and k = 0, ..., 2^j − 1:
   (a) Compute a median y_{j,k} of the labels {Y_i : 1 ≤ i ≤ n, X_i ∈ C_{j,k}} and assign the binary label Z_i = 2·I{Y_i > y_{j,k}} − 1 to any data point i lying in C_{j,k}, i.e. such that X_i ∈ C_{j,k}.
   (b) Solve the binary classification problem related to the input space C_{j,k} and the training set {(X_i, Z_i) : 1 ≤ i ≤ n, X_i ∈ C_{j,k}}, producing a classifier g_{j,k} : C_{j,k} → {−1, +1}.
   (c) Set C_{j+1,2k} = {x ∈ C_{j,k} : g_{j,k}(x) = +1} = C_{j,k} \ C_{j+1,2k+1}.
4. Output. Ranking tree T_J = {C_{j,k} : 0 ≤ j ≤ J, 0 ≤ k < 2^j}.
Open Source Code | No | The paper states: "Issues related to the implementation of the CRANK algorithm and variants (e.g. exploiting randomization/aggregation) will be investigated in a forthcoming paper." This indicates that no code is currently available.
Open Datasets | No | The paper uses only a synthetic toy example: "The experimental setting is composed of a unidimensional feature space X = [0, 1] (for visualization reasons) and a simple regression model without any noise: Y = m(X)." No public dataset is cited or linked (an illustrative reconstruction of this setup appears after the table).
Dataset Splits | No | The paper mentions cross-validation only in the context of tree pruning: "a subtree among the sequence T_0, ..., T_J with nearly maximal IAUC should be chosen using cross-validation." It does not specify concrete train/validation/test splits or percentages for the numerical experiments reported in Table 1.
Hardware Specification | No | The paper provides no details on the hardware used to run the experiments (e.g., GPU models, CPU types, memory specifications).
Software Dependencies | No | The paper mentions comparing CRANK with CART, but it does not specify versions of any programming languages, libraries, or frameworks used for the implementation (e.g., Python, scikit-learn).
Experiment Setup | No | The paper describes the experimental setting as "a unidimensional feature space X = [0, 1]... and a simple regression model without any noise: Y = m(X)" and notes that the tree depth J is chosen such that 2^J ≤ n (see the one-line helper after the table). It does not provide further hyperparameters or detailed training configuration for CRANK, KENDALL, or CART.
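
The CRANK pseudocode above translates almost directly into code. Below is a minimal sketch of the recursion, assuming scikit-learn's DecisionTreeClassifier (a depth-one stump) as the binary classification algorithm A; the function name crank and the index-based representation of cells are illustrative choices, not from the paper.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def crank(X, Y, J):
        """Sketch of the CRANK recursion: grow a depth-J ranking tree on (X, Y).

        Each cell C_{j,k} is represented by the indices of the training points
        it contains. Returns the 2**J leaf index arrays ordered left to right,
        i.e. from the cell with the highest labels down to the lowest.
        """
        cells = [np.arange(len(Y))]                      # C_{0,0} = X (root)
        for j in range(J):                               # j = 0, ..., J - 1
            children = []
            for idx in cells:                            # k = 0, ..., 2**j - 1
                if len(idx) < 2:
                    children += [idx, idx[:0]]           # degenerate cell: no split
                    continue
                y_med = np.median(Y[idx])                # median y_{j,k} of cell labels
                Z = np.where(Y[idx] > y_med, 1, -1)      # binary labels Z_i
                if len(np.unique(Z)) < 2:                # all points on one side
                    children += [idx, idx[:0]]
                    continue
                g = DecisionTreeClassifier(max_depth=1).fit(X[idx], Z)  # g_{j,k}
                pred = g.predict(X[idx])
                children.append(idx[pred == 1])          # C_{j+1,2k}   (g = +1)
                children.append(idx[pred == -1])         # C_{j+1,2k+1}
            cells = children
        return cells

The paper leaves the choice of A open; any binary classifier exposing fit/predict could be substituted for the depth-one stump.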
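The toy setting quoted under Open Datasets (X uniform on [0, 1], noise-free Y = m(X)) is easy to regenerate. The snippet below is an illustrative reconstruction with a placeholder m, since this section does not restate the paper's exact regression function; it also illustrates the difference in nature between regression and continuous ranking noted above, namely that an increasing transform of a scoring rule ruins its regression error but leaves its ranking performance untouched.

    import numpy as np
    from scipy.stats import kendalltau

    rng = np.random.default_rng(0)
    X = rng.uniform(0.0, 1.0, size=500)   # unidimensional feature space [0, 1]
    m = lambda x: (x - 0.3) ** 2          # placeholder for the paper's m
    Y = m(X)                              # noise-free labels: Y = m(X)

    s1 = m(X)                             # scoring rule 1: m itself
    s2 = np.exp(5.0 * m(X))               # scoring rule 2: increasing transform of m

    mse1 = np.mean((s1 - Y) ** 2)         # 0.0: perfect regression
    mse2 = np.mean((s2 - Y) ** 2)         # large: poor regression
    tau1, _ = kendalltau(s1, Y)           # 1.0: perfect ranking
    tau2, _ = kendalltau(s2, Y)           # 1.0: ranking unchanged by the transform
    print(mse1, mse2, tau1, tau2)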
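Finally, the only growth parameter reported is the depth rule 2^J ≤ n. Assuming the largest admissible depth is intended (the paper does not say), this reduces to a one-line helper:

    import math

    def max_tree_depth(n: int) -> int:
        # Largest J with 2**J <= n, i.e. J = floor(log2(n)).
        return int(math.log2(n))

    assert max_tree_depth(200) == 7       # 2**7 = 128 <= 200 < 256 = 2**8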