High-Dimensional Contextual Policy Search with Unknown Context Rewards using Bayesian Optimization

Authors: Qing Feng, Ben Letham, Hongzi Mao, Eytan Bakshy

NeurIPS 2020

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We use a collection of simulation studies to characterize the performance and robustness of the models, and show that our approach of inferring a low-dimensional context embedding performs best. Finally, we show successful contextual policy optimization in a real-world video bitrate policy problem. |
| Researcher Affiliation | Collaboration | Qing Feng, Facebook, qingfeng@fb.com; Benjamin Letham, Facebook, bletham@fb.com; Hongzi Mao, MIT, hongzi@mit.edu; Eytan Bakshy, Facebook, ebakshy@fb.com |
| Pseudocode | No | The paper describes its models and methods but does not include any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code for the models and replication materials are available at https://github.com/facebookresearch/ContextualBO. |
| Open Datasets | No | The paper mentions using 'de-identified trace data from the Facebook Android mobile app' and 'synthetic problems based on the Hartmann6 test function', but it does not provide concrete access information (e.g., links, DOIs, formal citations) for a publicly available dataset. |
| Dataset Splits | No | The paper does not provide specific training, validation, or test dataset splits (e.g., percentages, sample counts, or citations to predefined splits) for reproducibility. |
| Hardware Specification | No | The paper does not specify any particular hardware used for running experiments, such as CPU or GPU models, memory, or cloud computing instance types. |
| Software Dependencies | No | The paper mentions the 'Park platform' and the 'BoTorch' framework, but it does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | No | The paper describes the general setup of the contextual policy (e.g., a 48-dimensional vector, 12 contexts) and the acquisition function used (EI), but it does not provide specific hyperparameter values or detailed system-level training settings for its models. |
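
To make two terms from the Experiment Setup entry concrete, the following is a minimal sketch, not the paper's implementation. It shows the standard closed-form expected improvement (EI) for a Gaussian posterior, and one way a flat 48-dimensional policy vector could be viewed as 12 per-context blocks. The function names, the maximization convention, and the assumption that the 48 dimensions split evenly across the 12 contexts are illustrative choices that are not stated in the summary above.

```python
import math


def expected_improvement(mean, std, best):
    """Closed-form EI for maximization under a Gaussian posterior N(mean, std^2).

    `best` is the best objective value observed so far (an assumed convention).
    """
    if std <= 0.0:
        # No posterior uncertainty: improvement is deterministic.
        return max(mean - best, 0.0)
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return (mean - best) * cdf + std * pdf


def split_policy(x, n_contexts=12):
    """Split a flat policy vector into equal-sized per-context blocks.

    The even split is an assumption made for illustration only.
    """
    if len(x) % n_contexts != 0:
        raise ValueError("policy length must be divisible by n_contexts")
    d = len(x) // n_contexts
    return [x[i * d:(i + 1) * d] for i in range(n_contexts)]


if __name__ == "__main__":
    blocks = split_policy([0.0] * 48)
    print(len(blocks), len(blocks[0]))
    print(expected_improvement(mean=0.2, std=0.5, best=0.0))
```

In practice a library such as BoTorch would evaluate EI on a fitted Gaussian process model; the scalar version here only illustrates the formula.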