CAB: Continuous Adaptive Blending for Policy Evaluation and Learning
Authors: Yi Su, Lequn Wang, Michele Santacatterina, Thorsten Joachims
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically examine the evaluation accuracy and learning performance of CAB in two different partial-information settings. |
| Researcher Affiliation | Academia | ¹Cornell University, Ithaca, USA; ²Cornell TRIPODS Center for Data Science, Ithaca, USA. |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement or link for the open-source code of the described methodology. |
| Open Datasets | Yes | For the BLBF setting... for several multiclass classification datasets from the UCI repository (Asuncion & Newman, 2007). In the LTR setting... on the YAHOO! LTR Challenge corpus (set 1)... |
| Dataset Splits | Yes | For the LTR setting, we follow the experiment setup of Joachims et al. (2017) and conduct experiments on the YAHOO! LTR Challenge corpus (set 1), which comes with a train/validation/test split. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependency details, such as library names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | For the BLBF learning experiments, we use POEM (Swaminathan & Joachims, 2015a) to learn stochastic linear policies. For LTR, Appendix C.2 derives a generalized version of propensity SVM-Rank (Joachims et al., 2017)... To avoid biases from the regression model, we adopt 90 percentile cIPS to conduct hyperparameter selection for M (or τ for SB) and regularization parameter on the partial feedback data simulated from the validation set. The experiments are run for 10 and 5 times on BLBF and LTR respectively and the average is reported. Details are shown in Appendix C. |
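
The Experiment Setup row mentions hyperparameter selection with a 90th-percentile clipped IPS (cIPS) estimate. The following is a minimal sketch of such an estimator, not the authors' code. It assumes that "90 percentile" means the clipping constant is set to the 90th percentile of the importance weights, and the function and argument names (`clipped_ips`, `target_probs`, `logging_probs`, `rewards`) are hypothetical.

```python
import math


def percentile_nearest_rank(values, q):
    """Nearest-rank percentile: smallest value with at least q% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100.0))
    return ordered[rank - 1]


def clipped_ips(target_probs, logging_probs, rewards, q=90.0):
    """Clipped IPS estimate of a target policy's value from logged feedback.

    target_probs[i]  : probability the target policy assigns to the logged action
    logging_probs[i] : probability the logging policy assigned to that action
    rewards[i]       : observed feedback for the logged action
    q                : percentile of the importance weights used as the clipping
                       constant (assumed reading of "90 percentile cIPS")
    """
    # Importance weights between the target and logging policies.
    weights = [p / p0 for p, p0 in zip(target_probs, logging_probs)]
    # Clipping constant taken from the empirical weight distribution.
    clip_m = percentile_nearest_rank(weights, q)
    # Cap each weight at the clipping constant, then average the weighted rewards.
    clipped = [min(w, clip_m) for w in weights]
    return sum(w * r for w, r in zip(clipped, rewards)) / len(rewards)
```

Under this reading, each candidate value of M (or τ for SB) and each regularization setting would be scored by calling `clipped_ips` on the partial-feedback data simulated from the validation set, and the best-scoring configuration would be kept.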