High-Dimensional Contextual Policy Search with Unknown Context Rewards using Bayesian Optimization
Authors: Qing Feng, Ben Letham, Hongzi Mao, Eytan Bakshy
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We use a collection of simulation studies to characterize the performance and robustness of the models, and show that our approach of inferring a low-dimensional context embedding performs best. Finally, we show successful contextual policy optimization in a real-world video bitrate policy problem. |
| Researcher Affiliation | Collaboration | Qing Feng (Facebook, qingfeng@fb.com); Benjamin Letham (Facebook, bletham@fb.com); Hongzi Mao (MIT, hongzi@mit.edu); Eytan Bakshy (Facebook, ebakshy@fb.com) |
| Pseudocode | No | The paper describes its models and methods but does not include any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code for the models and replication materials are available at https://github.com/facebookresearch/ContextualBO. |
| Open Datasets | No | The paper mentions using 'de-identified trace data from the Facebook Android mobile app' and 'synthetic problems based on the Hartmann6 test function', but it does not provide concrete access information (e.g., links, DOIs, formal citations) for a publicly available dataset. |
| Dataset Splits | No | The paper does not provide specific training, validation, or test dataset splits (e.g., percentages, sample counts, or citations to predefined splits) for reproducibility. |
| Hardware Specification | No | The paper does not specify any particular hardware used for running experiments, such as CPU or GPU models, memory, or cloud computing instance types. |
| Software Dependencies | No | The paper mentions the 'Park platform' and the 'BoTorch' framework, but it does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | No | The paper describes the general setup of the contextual policy (e.g., 48-dimensional vector, 12 contexts) and the acquisition function used (EI), but it does not provide specific hyperparameter values or detailed system-level training settings for its models. |