Extreme F-measure Maximization using Sparse Probability Estimates

Authors: Kalina Jasinska, Krzysztof Dembczyński, Róbert Busa-Fekete, Karlson Pfannschmidt, Timo Klerx, Eyke Hüllermeier

ICML 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section 6 (Experiments): "We carried out two sets of experiments. In the first, we verify the effectiveness of PLTs in handling a large number of labels by comparing their performance to that of FASTXML in terms of precision@k. In the second experiment, we combine PLTs and FASTXML with the threshold tuning methods, namely with FTA, STO and OFO, for maximizing the macro F-measure. In both experiments we used six large-scale datasets taken from the Extreme Classification Repository with predefined train/test splits (see main statistics of these datasets in Table 1)."
Researcher Affiliation | Academia | Kalina Jasinska (KJASINSKA@CS.PUT.POZNAN.PL) and Krzysztof Dembczyński (KDEMBCZYNSKI@CS.PUT.POZNAN.PL), Institute of Computing Science, Poznań University of Technology, 60-965 Poznań, Poland; Róbert Busa-Fekete (BUSAROBI@GMAIL.COM), Karlson Pfannschmidt (KIUDEE@MAIL.UPB.DE), Timo Klerx (TIMOK@UPB.DE), and Eyke Hüllermeier (EYKE@UPB.DE), Department of Computer Science, Paderborn University, 33098 Paderborn, Germany
Pseudocode | Yes | Algorithm 1: STO(D_n, η̂, ε); Algorithm 2: OFO(D_n, η̂, a, b); Algorithm 3: PLT.TRAIN(T, A, D_n); Algorithm 4: PLT.PREDICT(x, T, Q, τ)
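For orientation, OFO is the online F-measure optimization procedure of Busa-Fekete et al. (2015): it keeps a per-label threshold equal to half of the current online F-measure estimate, with a and b initializing its two counters. Below is a minimal single-label sketch; the function name and the Python rendering are illustrative assumptions, not the authors' code.

```python
def ofo(eta_hat, y, a=1.0, b=10.0):
    """Illustrative sketch of OFO threshold tuning for one label.

    eta_hat: sequence of estimated probabilities P(y = 1 | x)
    y:       corresponding binary ground-truth labels
    a, b:    initial counter values (the paper fixes a = 1 and
             tunes b over the grid C)

    Returns the final threshold a/b, i.e. half of the online
    F-measure estimate 2a/b, since the F-measure-optimal
    threshold equals F*/2."""
    for p, label in zip(eta_hat, y):
        pred = 1 if p >= a / b else 0   # predict with current threshold
        a += label * pred               # a accumulates true positives
        b += label + pred               # b accumulates positives + predictions
    return a / b
```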
Open Source Code | Yes | Code is available at https://github.com/busarobi/XMLC.
Open Datasets | Yes | "In both experiments we used six large-scale datasets taken from the Extreme Classification Repository with predefined train/test splits (see main statistics of these datasets in Table 1)." Repository: http://research.microsoft.com/en-us/um/people/manik/downloads/XC/XMLRepository.html
Dataset Splits | Yes | The datasets come with predefined train/test splits; in addition: "We use 80% of each dataset for training PLTs and FASTXML, and then run FTA, STO and OFO on the remaining 20%. The latter part of the training set is also used to validate the input parameters of the threshold tuning algorithms."
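As a concrete reading of this protocol, a minimal sketch follows; the scikit-learn dependency and function name are assumptions for illustration (the authors' pipeline is in Java/C++), and the paper does not state how the 80/20 partition was randomized.

```python
from sklearn.model_selection import train_test_split

def split_training_set(X, Y, seed=0):
    """Hold out 20% of the predefined training set: 80% trains the
    PLT/FASTXML models, 20% is used to run FTA, STO and OFO and to
    validate the tuning algorithms' input parameters."""
    X_fit, X_tune, Y_fit, Y_tune = train_test_split(
        X, Y, test_size=0.2, random_state=seed)
    return (X_fit, Y_fit), (X_tune, Y_tune)
```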
Hardware Specification | No | The paper reports timings ("The average per test example wall-clock time and number of inner products are shown in Table 2.") but gives no hardware details such as CPU or GPU models or memory.
Software Dependencies | No | The paper mentions software such as Java, L2-logistic regression, SMAC, the C++ code for FASTXML, the GLMNET algorithm, and the LibLinear package, but it does not provide version numbers for any of these components.
Experiment Setup | Yes | "For the vector ε, the input parameter of STO and FTA, we first compute the lower bound τ_j of the optimal threshold according to (5), i.e., τ_j = π̂_j / (π̂_j + 1), with π̂_j the prior probability estimate for label j. Then, each element of ε is set to max(1/c, τ_j), where c ∈ C = {10000, 1000, 200, 100, 50, 20, 10, 7, 5, 4, 3, 2}. Similarly, the input parameter b of OFO is tuned over the same set C, while its other input parameter a is constantly set to 1. We additionally carried out experiments for assessing the impact of parameter a (see results in Appendix D), which slightly improves the results. We also control the thresholds in OFO to be greater than the lower bound τ_j."
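Read as code, the grid for ε amounts to the following sketch, which simply transcribes the formulas above; the names and the NumPy rendering are illustrative assumptions.

```python
import numpy as np

# The paper's grid of candidate values for c.
C = [10000, 1000, 200, 100, 50, 20, 10, 7, 5, 4, 3, 2]

def epsilon_candidates(pi_hat):
    """Enumerate the candidate epsilon vectors for STO/FTA.

    pi_hat: array of prior probability estimates, one per label.
    tau_j = pi_hat_j / (pi_hat_j + 1) is the lower bound (5) on the
    optimal threshold; each element of epsilon is max(1/c, tau_j)."""
    tau = pi_hat / (pi_hat + 1.0)
    for c in C:
        yield np.maximum(1.0 / c, tau)

# For OFO, b is tuned over the same set C with a fixed to 1,
# and the resulting thresholds are kept above the lower bound tau_j.
ofo_configs = [(1.0, float(b)) for b in C]
```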