The Knowledge Gradient for Sequential Decision Making with Stochastic Binary Feedbacks

Authors: Yingfei Wang, Chu Wang, Warren Powell

ICML 2016

Reproducibility assessment (variable: result, followed by the supporting LLM response):
Research Type: Experimental
LLM Response: "We evaluate the proposed method on both synthetic datasets and the UCI machine learning repository (Lichman, 2013) which includes classification problems drawn from settings including sonar, glass identification, blood transfusion, survival, breast cancer (wpbc), planning relax and climate model failure. We first analyze the behavior of the KG policy and then compare it to state-of-the-art learning algorithms."
Researcher Affiliation: Academia
LLM Response: Yingfei Wang (yingfei@cs.princeton.edu), Department of Computer Science, Princeton University, Princeton, NJ 08540; Chu Wang (chuw@math.princeton.edu), The Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ 08544; Warren Powell (powell@princeton.edu), Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544
Pseudocode: Yes
LLM Response: Algorithm 1 Online Bayesian Linear Classification; Algorithm 2 Knowledge Gradient Policy under online Bayesian Logistic Regression. A hedged sketch of such an online update follows below.
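To make the update concrete, here is a minimal sketch of an online Bayesian logistic regression step in the spirit of Algorithm 1: each weight keeps an independent Gaussian N(m_j, q_j^{-1}), and after every observation the mean takes a MAP step while the precision absorbs a diagonal Laplace (curvature) term. The bisection solve, the function names, and the toy label model are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def online_laplace_update(m, q, x, y):
    """One diagonal-Gaussian posterior update for a label y in {-1, +1}.

    m, q: per-weight means and precisions, i.e. w_j ~ N(m_j, 1/q_j).
    """
    a = m @ x                     # prior mean of the activation w.x
    b = np.sum(x * x / q)         # prior variance of the activation
    # The MAP activation t = w.x solves t = a + y*b*sigmoid(-y*t); the
    # difference of the two sides is strictly increasing in t, so bisection
    # on [a - b, a + b] brackets the unique root.
    lo, hi = a - b, a + b
    for _ in range(60):
        t = 0.5 * (lo + hi)
        if t - a - y * b * sigmoid(-y * t) < 0:
            lo = t
        else:
            hi = t
    m_new = m + y * sigmoid(-y * t) * x / q    # MAP (Newton) step on the mean
    p = sigmoid(t)
    q_new = q + p * (1.0 - p) * x * x          # diagonal Laplace precision term
    return m_new, q_new

# Usage: start from the regularized prior m_j = 0, q_j = lam and stream data.
lam, d = 1.0, 5
m, q = np.zeros(d), lam * np.ones(d)
rng = np.random.default_rng(0)
for _ in range(30):
    x = rng.uniform(-3.0, 3.0, size=d)
    y = 1 if rng.random() < sigmoid(x.sum()) else -1   # toy label model
    m, q = online_laplace_update(m, q, x, y)
print("posterior mean:", np.round(m, 3))
```

Bisection is used here because the one-dimensional MAP condition is monotone, which avoids any step-size tuning that a gradient-based inner loop would need.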
Open Source Code: No
LLM Response: The paper does not provide an explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets: Yes
LLM Response: "We evaluate the proposed method on both synthetic datasets and the UCI machine learning repository (Lichman, 2013) which includes classification problems drawn from settings including sonar, glass identification, blood transfusion, survival, breast cancer (wpbc), planning relax and climate model failure."
Dataset Splits: No
LLM Response: The paper does not provide specific details regarding training, validation, and test dataset splits. For the UCI datasets it states that "we use all the data points as the set of alternatives," but no explicit split information is given.
Hardware Specification: No
LLM Response: The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments.
Software Dependencies: No
LLM Response: The paper mentions implementing methods such as online Bayesian logistic regression but does not specify any software dependencies with version numbers (e.g., Python 3.x, TensorFlow x.x, PyTorch x.x).
Experiment Setup: Yes
LLM Response: Input: regularization parameter λ > 0; initialize m_j = 0, q_j = λ (each weight w_j has an independent prior N(m_j, q_j^{-1})); for t = 1 to T, get a new point (x, y). "On synthetic datasets, we randomly generate a set of M d-dimensional alternatives x from [-3, 3]. At each run, the stochastic binary labels are simulated using a (d+1)-dimensional weight vector w which is sampled from the prior distribution w_i ~ N(0, λ^{-1})." "We allow each algorithm to sequentially measure N = 30 alternatives." A runnable sketch of this setup follows below.
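Putting the setup together, here is a self-contained sketch of the synthetic benchmark as described: M alternatives from [-3, 3]^d plus a bias term, a true weight vector drawn from the N(0, λ^{-1}) prior, and N = 30 sequential measurements scored by a two-outcome knowledge-gradient criterion (expected posterior best minus current best). The point-estimate predictive sigmoid(m·x), the helper laplace_update, and all constants other than N = 30 are assumptions for illustration rather than the paper's exact Algorithm 2.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def laplace_update(m, q, x, y):
    """Diagonal-Gaussian posterior update after observing y in {-1, +1}."""
    a, b = m @ x, np.sum(x * x / q)
    lo, hi = a - b, a + b
    for _ in range(60):                      # bisection for the MAP activation
        t = 0.5 * (lo + hi)
        if t - a - y * b * sigmoid(-y * t) < 0:
            lo = t
        else:
            hi = t
    p = sigmoid(t)
    return m + y * sigmoid(-y * t) * x / q, q + p * (1.0 - p) * x * x

rng = np.random.default_rng(0)
M, d, N, lam = 50, 3, 30, 1.0
# M alternatives from [-3, 3]^d, with a bias term appended so that the
# weight vector is (d + 1)-dimensional as in the paper's setup.
X = np.hstack([rng.uniform(-3.0, 3.0, size=(M, d)), np.ones((M, 1))])
w_true = rng.normal(0.0, 1.0 / np.sqrt(lam), size=d + 1)  # draw from the prior

m, q = np.zeros(d + 1), lam * np.ones(d + 1)
for n in range(N):
    best = np.max(sigmoid(X @ m))            # current best success probability
    scores = []
    for x in X:
        p = sigmoid(m @ x)                   # point-estimate predictive prob.
        m_pos, _ = laplace_update(m, q, x, +1)
        m_neg, _ = laplace_update(m, q, x, -1)
        # Two-outcome knowledge gradient: expected best after one more
        # observation of x, minus the best under the current posterior.
        scores.append(p * np.max(sigmoid(X @ m_pos))
                      + (1.0 - p) * np.max(sigmoid(X @ m_neg)) - best)
    x_sel = X[int(np.argmax(scores))]        # measure the highest-KG alternative
    y = 1 if rng.random() < sigmoid(w_true @ x_sel) else -1
    m, q = laplace_update(m, q, x_sel, y)

print("estimated best alternative:", np.round(X[np.argmax(sigmoid(X @ m))], 3))
```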