Extreme F-measure Maximization using Sparse Probability Estimates

Authors: Kalina Jasinska, Krzysztof Dembczyński, Róbert Busa-Fekete, Karlson Pfannschmidt, Timo Klerx, Eyke Hüllermeier

ICML 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section 6 (Experiments): "We carried out two sets of experiments. In the first, we verify the effectiveness of PLTs in handling a large number of labels by comparing their performance to that of FASTXML in terms of precision@k. In the second experiment, we combine PLTs and FASTXML with the threshold tuning methods, namely with FTA, STO and OFO, for maximizing the macro F-measure. In both experiments we used six large-scale datasets taken from the Extreme Classification Repository with predefined train/test splits (see main statistics of these datasets in Table 1)."
Researcher Affiliation | Academia | Kalina Jasinska (KJASINSKA@CS.PUT.POZNAN.PL) and Krzysztof Dembczyński (KDEMBCZYNSKI@CS.PUT.POZNAN.PL), Institute of Computing Science, Poznań University of Technology, 60-965 Poznań, Poland; Róbert Busa-Fekete (BUSAROBI@GMAIL.COM), Karlson Pfannschmidt (KIUDEE@MAIL.UPB.DE), Timo Klerx (TIMOK@UPB.DE), and Eyke Hüllermeier (EYKE@UPB.DE), Department of Computer Science, Paderborn University, 33098 Paderborn, Germany
Pseudocode | Yes | Algorithm 1: STO(D_n, η̂, ε); Algorithm 2: OFO(D_n, η̂, a, b); Algorithm 3: PLT.TRAIN(T, A, D_n); Algorithm 4: PLT.PREDICT(x, T, Q, τ)
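For orientation, OFO is the online F-measure optimization procedure of Busa-Fekete et al. (2015): it keeps a per-label threshold equal to half of the current online F-measure estimate, with a and b initializing its two counters. Below is a minimal single-label sketch; the function name and the Python rendering are illustrative assumptions, not the authors' code.

```python
def ofo(eta_hat, y, a=1.0, b=10.0):
    """Illustrative sketch of OFO threshold tuning for one label.

    eta_hat: sequence of estimated probabilities P(y = 1 | x)
    y:       corresponding binary ground-truth labels
    a, b:    initial counter values (the paper fixes a = 1 and
             tunes b over the grid C)

    Returns the final threshold a/b, i.e. half of the online
    F-measure estimate 2a/b, since the F-measure-optimal
    threshold equals F*/2."""
    for p, label in zip(eta_hat, y):
        pred = 1 if p >= a / b else 0   # predict with current threshold
        a += label * pred               # a accumulates true positives
        b += label + pred               # b accumulates positives + predictions
    return a / b
```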
Open Source Code | Yes | Code is available at https://github.com/busarobi/XMLC.
Open Datasets | Yes | "In both experiments we used six large-scale datasets taken from the Extreme Classification Repository with predefined train/test splits (see main statistics of these datasets in Table 1)." Repository: http://research.microsoft.com/en-us/um/people/manik/downloads/XC/XMLRepository.html
Dataset Splits | Yes | The datasets come with predefined train/test splits; in addition: "We use 80% of each dataset for training PLTs and FASTXML, and then run FTA, STO and OFO on the remaining 20%. The latter part of the training set is also used to validate the input parameters of the threshold tuning algorithms."
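As a concrete reading of this protocol, a minimal sketch follows; the scikit-learn dependency and function name are assumptions for illustration (the authors' pipeline is in Java/C++), and the paper does not state how the 80/20 partition was randomized.

```python
from sklearn.model_selection import train_test_split

def split_training_set(X, Y, seed=0):
    """Hold out 20% of the predefined training set: 80% trains the
    PLT/FASTXML models, 20% is used to run FTA, STO and OFO and to
    validate the tuning algorithms' input parameters."""
    X_fit, X_tune, Y_fit, Y_tune = train_test_split(
        X, Y, test_size=0.2, random_state=seed)
    return (X_fit, Y_fit), (X_tune, Y_tune)
```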
Hardware Specification | No | The paper reports timings ("The average per test example wall-clock time and number of inner products are shown in Table 2.") but gives no hardware details such as CPU or GPU models or memory.
Software Dependencies | No | The paper mentions software such as Java, L2-logistic regression, SMAC, the C++ code for FASTXML, the GLMNET algorithm, and the LibLinear package, but it does not provide version numbers for any of these components.
Experiment Setup | Yes | "For the vector ε, the input parameter of STO and FTA, we first compute the lower bound τ_j of the optimal threshold according to (5), i.e., τ_j = π̂_j / (π̂_j + 1), with π̂_j the prior probability estimate for label j. Then, each element of ε is set to max(1/c, τ_j), where c ∈ C = {10000, 1000, 200, 100, 50, 20, 10, 7, 5, 4, 3, 2}. Similarly, the input parameter b of OFO is tuned over the same set C, while its other input parameter a is constantly set to 1. We additionally carried out experiments for assessing the impact of parameter a (see results in Appendix D), which slightly improves the results. We also control the thresholds in OFO to be greater than the lower bound τ_j."
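Read as code, the grid for ε amounts to the following sketch, which simply transcribes the formulas above; the names and the NumPy rendering are illustrative assumptions.

```python
import numpy as np

# The paper's grid of candidate values for c.
C = [10000, 1000, 200, 100, 50, 20, 10, 7, 5, 4, 3, 2]

def epsilon_candidates(pi_hat):
    """Enumerate the candidate epsilon vectors for STO/FTA.

    pi_hat: array of prior probability estimates, one per label.
    tau_j = pi_hat_j / (pi_hat_j + 1) is the lower bound (5) on the
    optimal threshold; each element of epsilon is max(1/c, tau_j)."""
    tau = pi_hat / (pi_hat + 1.0)
    for c in C:
        yield np.maximum(1.0 / c, tau)

# For OFO, b is tuned over the same set C with a fixed to 1,
# and the resulting thresholds are kept above the lower bound tau_j.
ofo_configs = [(1.0, float(b)) for b in C]
```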