The Self-Normalized Estimator for Counterfactual Learning

Authors: Adith Swaminathan, Thorsten Joachims

NeurIPS 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the empirical effectiveness of Norm-POEM on several multi-label classification problems, finding that it consistently outperforms the conventional estimator.
Researcher Affiliation | Academia | Adith Swaminathan, Department of Computer Science, Cornell University, adith@cs.cornell.edu; Thorsten Joachims, Department of Computer Science, Cornell University, tj@cs.cornell.edu
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Software implementing Norm-POEM is available at http://www.cs.cornell.edu/~adith/POEM.
Open Datasets | Yes | The experiments use supervised multi-label classification datasets from the LibSVM repository. Inputs are x ∈ R^p, and predictions y ∈ {0, 1}^q are bit-vectors indicating the labels assigned to x. The datasets span a range of feature counts p, label counts q, and instance counts n:

Name    p (# features)    q (# labels)    n_train    n_test
Scene   294               6               1211       1196
Yeast   103               14              1500       917
TMC     30438             22              21519      7077
LYRL    47236             4               23149      781265
Dataset Splits | Yes | Hyper-parameters λ, M were calibrated as recommended and validated on a 25% hold-out of D; in summary, our experimental setup is identical to POEM [1].
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU/CPU models or other computational resources.
Software Dependencies | No | The paper mentions that 'CRF is implemented by scikit-learn [27]', but it does not specify the version of scikit-learn or of any other software dependency.
Experiment Setup | Yes | 'Hyper-parameters λ, M were calibrated as recommended and validated on a 25% hold-out of D; in summary, our experimental setup is identical to POEM [1].' 'To simulate a bandit feedback dataset D, we use a CRF with default hyper-parameters trained on 5% of the supervised dataset as h0, and replay the training data 4 times and collect sampled labels from h0.' 'Since the choice of optimization method could be a confounder, we use L-BFGS for all methods and experiments.'
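To make the paper's core idea concrete, the following is a minimal sketch of the self-normalized importance-sampling estimator compared with the conventional (vanilla IPS) estimator. All variable names and the synthetic data are illustrative assumptions, not taken from the authors' code; the self-normalized form divides by the sum of importance weights rather than the sample size, which gives the translation-equivariance property that counters propensity overfitting.

```python
import numpy as np

def ips(losses, new_probs, log_probs):
    """Conventional IPS estimate of the new policy's expected loss."""
    w = new_probs / log_probs          # importance weights
    return np.mean(w * losses)

def snips(losses, new_probs, log_probs):
    """Self-normalized IPS: a weighted average of the observed losses."""
    w = new_probs / log_probs
    return np.sum(w * losses) / np.sum(w)

# Synthetic logged bandit data (illustrative, not from the paper).
rng = np.random.default_rng(0)
n = 1000
log_probs = rng.uniform(0.1, 1.0, n)   # logging policy propensities
new_probs = rng.uniform(0.1, 1.0, n)   # new policy's action probabilities
losses = rng.uniform(0.0, 1.0, n)      # observed losses in [0, 1]

print("IPS:  ", ips(losses, new_probs, log_probs))
print("SNIPS:", snips(losses, new_probs, log_probs))
```

Note that since SNIPS is a weighted average, its estimate always stays inside the range of observed losses, and adding a constant c to every loss shifts the estimate by exactly c; vanilla IPS has neither property.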
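The supervised-to-bandit conversion quoted above can also be sketched in a few lines. This is a hedged illustration under simplifying assumptions: a toy stochastic labeler with independent per-label probabilities stands in for the CRF logging policy h0, and `simulate_bandit_feedback` is a hypothetical helper, not a function from the authors' software.

```python
import numpy as np

def simulate_bandit_feedback(Y, h0_probs, replays=4, seed=0):
    """Replay the training set `replays` times; for each instance, sample a
    label bit-vector from h0 and log (index, sampled labels, loss, propensity).

    Y:        (n, q) true label bit-vectors.
    h0_probs: (n, q) per-label probabilities under the logging policy h0
              (here assumed independent per label, unlike a real CRF).
    """
    rng = np.random.default_rng(seed)
    logged = []
    n, q = Y.shape
    for _ in range(replays):
        for i in range(n):
            p = h0_probs[i]
            y = (rng.random(q) < p).astype(int)          # sampled label vector
            prop = np.prod(np.where(y == 1, p, 1 - p))   # propensity of y under h0
            loss = np.mean(y != Y[i])                    # Hamming loss vs. truth
            logged.append((i, y, loss, prop))
    return logged

# Toy data: 5 instances, 3 labels, a maximally uncertain logging policy.
Y = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 1]])
h0_probs = np.full((5, 3), 0.5)
D = simulate_bandit_feedback(Y, h0_probs)
print(len(D))   # 4 replays x 5 instances = 20 logged samples
```

Each logged tuple carries the propensity needed for the importance weights, mirroring how the bandit dataset D is constructed from a supervised dataset in the experimental setup.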